UniverseTBD
universetbd.org
UniverseTBD
@universetbd.org
We are on a mission to democratise science for everyone. Join our Discord at https://discord.gg/RH2jgT3vtQ, or contact us at [email protected].
HypoGen concludes our 2^2 fest and we truly hope you enjoyed it 🎇✨. Thank you for your great support with our mission to democratise science for everyone🌍.
April 18, 2025 at 5:57 PM
8/n A huge thank you to our partners at @msftresearch.bsky.social for enabling our research through the AFMR grant. And to our many friends around the world in academia and industry - this wouldn't be possible without your support 🙏.
April 18, 2025 at 5:57 PM
Huge thanks to Pranav Agarwal for the last-minute eval request; we couldn't have done this without you! 💫
April 18, 2025 at 5:57 PM
7/n HypoGen was led by the absolute star @charlesoneill.bsky.social working with our wonderful mentors Tirthankar Ghosal, Roberta Raileanu, Mike Walmsley, Thang Bui, @kevinschawinski.bsky.social, @errai34.bsky.social and our team🚀.
April 18, 2025 at 5:57 PM
6/n Future directions: expand HypoGen to domains like astrophysics, biology, materials science, and build AI that doesn’t just answer questions but sparks them. 🔭🚀

Let us know here if you want to dive in & let’s push scientific discovery forward!
#HypoGen #AI4Science #DemocratisingScience
April 18, 2025 at 5:57 PM
5/n Humans come out on top (~85% win rate) - a comforting result that hints at a vision for the future when AI and human researchers work together to advance scientific discovery 🤝.
April 18, 2025 at 5:57 PM
4/n We fine‑tuned LLaMA 3.1 8B and its R1‑distilled variant on HypoGen (4‑bit quant + LoRA), then evaluated with perplexity, IAScore, and a couple of LLM judges coupled with human verification. We obtain significant gains in hypothesis novelty & feasibility with transparent reasoning steps! 🚀
April 18, 2025 at 5:57 PM
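Of the metrics named in 4/n, perplexity is the easiest to write down: it is the exponential of the mean negative log-likelihood per token. Here is a minimal sketch, assuming you already have natural-log probabilities for each target token from some model. The `perplexity` function and the example numbers are illustrative only, not the evaluation code behind the post.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) over the given tokens.

    `token_logprobs` holds natural-log probabilities, one per target token.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Illustrative input: a model that gives every token probability 0.25.
# Mathematically its perplexity is 4 (one choice in four at each step).
uniform_logprobs = [math.log(0.25)] * 4
uniform_ppl = perplexity(uniform_logprobs)

# A more confident model (probability 0.5 per token) scores lower.
confident_ppl = perplexity([math.log(0.5)] * 4)
```

Lower is better: a fine-tuned model that assigns higher probability to the held-out hypotheses will show a smaller value than the base model on the same text.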
3/n HypoGen deets:

• 5,478 samples from NeurIPS 2023 & ICLR 2024
• JSON fields: bit, spark, flip, chain_of_reasoning
• Extraction courtesy of @OpenAI's tireless o1 model (no coffee required… maybe). 🤖☕
April 18, 2025 at 5:57 PM
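For anyone wiring up the four fields from 3/n, one record could be modelled in Python roughly as below. Only the field names come from the post. The `HypoGenRecord` class, the `parse_record` helper, and the placeholder text are hypothetical; this is not the released loader and not real dataset content.

```python
import json
from dataclasses import dataclass

# Field names as listed in the 3/n post; everything else is illustrative.
REQUIRED_FIELDS = ("bit", "spark", "flip", "chain_of_reasoning")


@dataclass(frozen=True)
class HypoGenRecord:
    bit: str
    spark: str
    flip: str
    chain_of_reasoning: str


def parse_record(raw: str) -> HypoGenRecord:
    """Parse one JSON object and check that all four fields are strings."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    not_text = [name for name in REQUIRED_FIELDS if not isinstance(data[name], str)]
    if not_text:
        raise ValueError(f"non-string fields: {not_text}")
    return HypoGenRecord(**{name: data[name] for name in REQUIRED_FIELDS})


# Placeholder record, invented for this sketch.
SAMPLE = json.dumps({
    "bit": "Placeholder statement of the problem.",
    "spark": "Placeholder four word insight",
    "flip": "Placeholder statement of the solution.",
    "chain_of_reasoning": "Placeholder steps linking the bit to the flip.",
})
record = parse_record(SAMPLE)
```

Failing loudly on a missing or non-string field keeps malformed extractions out of a fine-tuning set instead of letting them through silently.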
arxiv.org
April 18, 2025 at 5:57 PM
2/n Where’s the creativity? HypoGen reframes scientific hypothesis generation as a conditional LM task: feed it the Bit (problem) → get the Spark (4–6 word insight), the Flip (solution), plus an explicit Chain‑of‑Reasoning (how the Bit turned into the Flip). 🧠🔗
April 18, 2025 at 5:57 PM
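The conditional setup in 2/n (Bit in, Spark plus Flip plus Chain‑of‑Reasoning out) can be sketched as a prompt/target pair. The template strings, `build_example`, and `spark_length_ok` are assumptions made up for illustration. The post does not say how the actual training examples were formatted.

```python
from typing import Tuple


def build_example(bit: str, spark: str, flip: str, chain_of_reasoning: str) -> Tuple[str, str]:
    """Return (prompt, target): the model conditions on the Bit and
    is trained to generate the remaining three parts."""
    prompt = f"Bit: {bit.strip()}\n"
    target = (
        f"Spark: {spark.strip()}\n"
        f"Flip: {flip.strip()}\n"
        f"Chain of reasoning: {chain_of_reasoning.strip()}"
    )
    return prompt, target


def spark_length_ok(spark: str, low: int = 4, high: int = 6) -> bool:
    """Check the 4 to 6 word length that the post gives for a Spark."""
    return low <= len(spark.split()) <= high


# Placeholder text, invented for this sketch.
prompt, target = build_example(
    bit="Placeholder problem statement.",
    spark="Placeholder four word insight",
    flip="Placeholder solution statement.",
    chain_of_reasoning="Placeholder reasoning steps.",
)
```

Keeping the Bit alone in the prompt means that, at inference time, the same template can be filled with a new problem and the model left to complete the rest.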
This work was supported by Microsoft's Accelerating Foundation Models Research program and the ITER Teide HPC cluster. Thanks to all collaborators across our many institutions!
April 16, 2025 at 12:17 PM
For researchers wanting to collaborate, we're available at discord.gg/PUR2FbFRZ4 and our DMs are open. Check out our code at w3id.org/UniverseTBD/..., and come find us at SCI-FM@ICLR if you would like to chat in person!
Join the UniverseTBD Discord Server!
Check out the UniverseTBD community on Discord - hang out with 161 other members and enjoy free voice and text chat.
discord.gg
April 16, 2025 at 12:17 PM
We see a future where multimodal models can reason across astronomical data types beyond just imagery: from spectra to light curves to data cubes.
April 16, 2025 at 12:17 PM
We've evaluated AstroLLaVA on the Galaxy 10 DECaLS dataset and are releasing the model weights, code, and training dataset under the MIT license to support open science and further development by the community.
April 16, 2025 at 12:17 PM
Our two-stage fine-tuning process adapts the model for both image captioning and visual question answering in the astronomy domain, making complex astronomical concepts more accessible through natural conversation.
April 16, 2025 at 12:17 PM
We fine-tuned LLaVA on ~30k astronomical images with captions & QA pairs from NASA APOD, ESO, and Hubble archives to create a model that understands astronomical concepts in visual form 👉 hf.co/datasets/UniverseTBD/AstroLLaVA_convos
April 16, 2025 at 12:17 PM
UniverseTBD is extremely grateful to our partners at @msftresearch.bsky.social for their continued support, which enables our research.
April 15, 2025 at 12:48 PM
For more updates and behind-the-scenes breakthroughs, follow us at bsky.app/profile/univ... as we continue to break through the barriers of the sky! 🌌
bsky.app
April 15, 2025 at 12:48 PM
Dive into the full paper and explore the future of hypothesis generation here 👉 arxiv.org/pdf/2504.054...
#HypoGen #AI
arxiv.org
April 15, 2025 at 12:48 PM
We break down innovative approaches like direct prompting, adversarial methods, fine-tuning, knowledge integration, and even multi-agent systems, transforming how we turn vast scientific literature into actionable, testable ideas.
April 15, 2025 at 12:48 PM
Authored by Atilla Kaan Alkan and the UniverseTBD team, this paper provides a single, comprehensive resource that covers everything from human-centric methods to cutting-edge LLM-driven techniques.
April 15, 2025 at 12:48 PM