---
license: mit
---

This repository contains the models presented in the paper [Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining](https://arxiv.org/abs/2503.08805).

Included are FLYT and M-FLYT scoring models, as well as models trained on datasets filtered by these methods.

For usage examples and more information visit our [GitHub repository](https://github.com/formll/FLYT).