Simon Ging
simon.ging.ai
Doctoral Researcher in Computer Vision and Natural Language Processing.
Excited to release our models and preprint: "Using Knowledge Graphs to harvest datasets for efficient CLIP model training"

We propose a dataset collection method using knowledge graphs and web image search, and create EntityNet-33M: a dataset of 33M images paired with 46M texts.
Using Knowledge Graphs to harvest datasets for efficient CLIP model training
Training high-quality CLIP models typically requires enormous datasets, which limits the development of domain-specific models -- especially in areas that even the largest CLIP models do not cover wel...
arxiv.org
May 8, 2025 at 12:58 PM