Mengye Ren
mengyer.bsky.social
Assistant Professor of CS & DS at NYU. Machine Learning, Human-like AI, Continual Learning | Head of @agentic-ai-lab.bsky.social
mengyeren.com
Very soon we will see a rekindled interest in an AI that learns from an individual experience, defining a subjective sense of what is truly new and creative. We will start wondering: what if an AI has only learned handwritten digits in its lifetime? 2/2
December 16, 2025 at 10:09 PM
10/ This @agentic-ai-lab.bsky.social project was led by Alex Wang @alexnwang.bsky.social and Chris Hoang @choang.bsky.social, together with Yuwen Xiong, @yann-lecun.bsky.social and @mengyer.bsky.social.
April 20, 2025 at 8:31 PM
9/ For more details, please check out our paper and website, or stop by our poster (Fri 10 AM, Hall 3 + Hall 2B #336) at ICLR!
Paper: arxiv.org/abs/2408.11208
Website: agenticlearning.ai/poodle/
PooDLe: Pooled and dense self-supervised learning from naturalistic videos
Self-supervised learning has driven significant progress in learning from single-subject, iconic images. However, there are still unanswered questions about the use of minimally-curated, naturalistic ...
April 20, 2025 at 8:31 PM
8/ We also study how data augmentation choices like crop scale, input resolution, and time between sampled frames can have a large impact on video pretraining.
April 20, 2025 at 8:31 PM
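To make two of the knobs from post 8/ concrete, here is a minimal sketch of how a frame pair and a crop might be sampled during video pretraining. It is not the PooDLe pipeline: the function names, the `max_gap` and `scale_range` parameters, and the aspect-preserving crop are all assumptions made for illustration. Input resolution would be the size the crop is resized to afterwards, which is left out here.

```python
import random


def sample_frame_pair(num_frames, max_gap, rng):
    """Pick two frame indices that are between 1 and max_gap frames apart.

    max_gap is a hypothetical knob for the time between sampled frames.
    """
    if num_frames < 2 or max_gap < 1:
        raise ValueError("need at least two frames and a positive gap")
    gap = rng.randint(1, min(max_gap, num_frames - 1))
    first = rng.randint(0, num_frames - 1 - gap)
    return first, first + gap


def sample_crop(height, width, scale_range, rng):
    """Return (top, left, crop_h, crop_w) for a random aspect-preserving crop.

    scale_range is a hypothetical (min, max) fraction of the frame area,
    with 0 < min <= max <= 1.
    """
    scale = rng.uniform(*scale_range)
    side = scale ** 0.5  # area fraction -> side-length fraction
    crop_h = max(1, int(round(height * side)))
    crop_w = max(1, int(round(width * side)))
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return top, left, crop_h, crop_w


rng = random.Random(0)
frame_a, frame_b = sample_frame_pair(num_frames=100, max_gap=10, rng=rng)
crop = sample_crop(height=720, width=1280, scale_range=(0.2, 1.0), rng=rng)
```

A smaller `scale_range` zooms in on small objects at the cost of scene context, and a larger `max_gap` makes the two views differ more, which is one way such choices could affect what the model learns.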
7/ These performance differences manifest visually too! IN1K has noisy segmentations and FlowE misses small objects, while PooDLe avoids both problems.
April 20, 2025 at 8:31 PM
6/ Interestingly, we find that dense SSL performance is driven by large classes, whereas ImageNet pretraining does well on small, foreground classes.
PooDLe is able to perform well on both small and large classes!
April 20, 2025 at 8:31 PM
5/ PooDLe, pretrained on BDD100K and Walking Tours, outperforms prior iconic and dense SSL methods on semantic segmentation and object detection!
We also release WT-Sem, an in-distribution semantic segmentation task for Walking Tours.
April 20, 2025 at 8:31 PM
4/ We also propose a spatial decoder module to upsample the top-level features to higher resolution for the dense loss. The top-level features act as an information bottleneck that both satisfies the high-level invariance loss and is compatible with upsampling for the dense loss.
April 20, 2025 at 8:31 PM
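As a rough illustration of the upsampling idea in post 4/, the sketch below takes a low-resolution top-level feature map, upsamples it by 2x, and fuses it with a higher-resolution map. The thread does not say how the decoder upsamples or fuses features, so the nearest-neighbour upsampling, the additive fusion with a `lateral` map, and the single-channel list-of-lists representation are all assumptions chosen to keep the example dependency-free.

```python
def upsample_nearest(feat, factor=2):
    """Nearest-neighbour upsampling of a 2D grid stored as a list of rows."""
    out = []
    for row in feat:
        # Repeat each value `factor` times along the width...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times along the height.
        out.extend([list(wide) for _ in range(factor)])
    return out


def decoder_step(top, lateral):
    """Upsample `top` by 2x and add the higher-resolution `lateral` map.

    Both the 2x factor and the additive fusion are illustrative assumptions.
    """
    up = upsample_nearest(top, 2)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the size of top")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


top = [[1, 2], [3, 4]]                    # low-resolution top-level features
lateral = [[0] * 4 for _ in range(4)]     # higher-resolution features
dense = decoder_step(top, lateral)
```

In this toy setup the dense loss would be applied to `dense`, while the invariance loss would still see only `top`, which mirrors the bottleneck role described in the post.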