@alandenadel.bsky.social
Thank you again to all my collaborators for their contributions and thoughtful feedback.

Madeline Hughes
Akshaya Thoutam
Anay Gupta
Andrew Navia
@nfusi.bsky.social
Srivatsan Raghavan
Peter Winter
@avapamini.bsky.social
@lcrawford.bsky.social

I appreciate any feedback!
November 7, 2025 at 8:07 PM
Interestingly, we saw improved zero-shot performance when increasing model size (but still no data scaling) for both scVI and Geneformer.
November 7, 2025 at 8:07 PM
The Nicheformer authors observed a similar phenomenon: when Nicheformer was pre-trained on 1% of their 110M-cell dataset, performance did not decrease dramatically:
November 7, 2025 at 8:07 PM
There is an implicit assumption that scaling the pre-training dataset size is inherently better, but the only demonstrated scaling law we know of is in terms of data quality:
arxiv.org/abs/2503.02726
Measurement noise scaling laws for cellular representation learning
November 7, 2025 at 8:07 PM
This work addresses a critical consideration in training large-scale models: the size and diversity of the pre-training corpus.
November 7, 2025 at 8:07 PM
And for out-of-distribution perturbation response prediction.
November 7, 2025 at 8:07 PM
We also observed similar results for zero-shot batch integration.
November 7, 2025 at 8:07 PM
The learning saturation points were always 25% or less of the full pre-training corpus when evaluating the models on zero-shot classification, and always 10% or less when evaluating fine-tuned classification.
November 7, 2025 at 8:07 PM
To assess the extent to which this plateauing generalized across datasets and tasks, we identified the "learning saturation point" for each model. This is the minimum pre-training dataset size for which a model surpassed 95% of the maximum performance observed.
November 7, 2025 at 8:07 PM
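As an illustration of how a learning saturation point could be computed from a performance-vs-subset-size curve, here is a minimal Python sketch. Only the 95%-of-maximum rule comes from the post above; the function name and the example accuracy values are hypothetical, not taken from the paper.

```python
import numpy as np

def learning_saturation_point(subset_sizes, scores, threshold=0.95):
    """Return the smallest pre-training subset size whose downstream score
    reaches `threshold` (e.g. 95%) of the maximum observed score.

    subset_sizes: fractions of the full corpus, assumed sorted ascending.
    scores: downstream-task performance measured at each subset size.
    """
    subset_sizes = np.asarray(subset_sizes)
    scores = np.asarray(scores)
    cutoff = threshold * scores.max()
    # Index of the first subset size whose score crosses the cutoff.
    idx = np.argmax(scores >= cutoff)
    return subset_sizes[idx]

# Hypothetical performance curve: accuracy plateaus early.
sizes = [0.01, 0.10, 0.25, 0.50, 1.00]
accuracy = [0.78, 0.86, 0.87, 0.87, 0.88]
print(learning_saturation_point(sizes, accuracy))  # -> 0.1
```

Because the scores are assumed to be ordered by ascending subset size, the first index that crosses the cutoff is the saturation point.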
Across all model architectures, performance at cell type classification (both zero-shot and fine-tuned) plateaued at a small fraction of the total pre-training dataset size, regardless of dataset diversity. With fine-tuning, pre-training had almost no impact on performance.
November 7, 2025 at 8:07 PM
We assessed five model architectures pre-trained to perform as single-cell foundation models (scFMs) in the context of single-cell RNA-seq: PCA, scVI, SSL, Geneformer, and SCimilarity. We pre-trained these models on subsets of the scTab corpus using three downsampling schemes.
November 7, 2025 at 8:07 PM
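The post does not spell out the three downsampling schemes, so the sketch below shows only the simplest possibility: a uniform random subsample of cells from an AnnData corpus. The function name, fraction, and seed are illustrative assumptions, not the authors' code.

```python
import numpy as np
import anndata as ad

def random_downsample(adata: ad.AnnData, fraction: float, seed: int = 0) -> ad.AnnData:
    """Keep a uniformly random `fraction` of cells from the corpus.

    This is only one plausible downsampling scheme; the three schemes used
    in the paper are not described in the post above.
    """
    rng = np.random.default_rng(seed)
    n_keep = int(round(fraction * adata.n_obs))
    keep = rng.choice(adata.n_obs, size=n_keep, replace=False)
    return adata[np.sort(keep)].copy()

# e.g. a hypothetical 10% pre-training subset:
# subset = random_downsample(sctab_corpus, fraction=0.10)
```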
In our expanded analysis, we show that single-cell foundation models tend to plateau in downstream task performance with pre-training subsets that are a small fraction of the size of current pre-training datasets.
November 7, 2025 at 8:07 PM
Thank you to all my collaborators for their contributions and thoughtful feedback.

Madeline Hughes
Akshaya Thoutam
@anaygupta.bsky.social
Andrew Navia
@nfusi.bsky.social
Srivatsan Raghavan
Peter Winter
@avapamini.bsky.social
@lcrawford.bsky.social

I welcome any comments!
December 18, 2024 at 6:48 PM