Sam Blau
@samblau.bsky.social
2.2K followers 340 following 32 posts
Research scientist & computational chemist at Berkeley Lab using HT DFT workflows, machine learning, and reaction networks to model complex reactivity.
samblau.bsky.social
Come work with @emorychannano.bsky.social and me!
#robotics #nanochemistry #machinelearning #UCNPs
emorychannano.bsky.social
We have a postdoc opening w/ @samblau.bsky.social
on the autonomous synthesis of colloidal upconverting nanoparticles.

If you're looking for an exciting postdoc combining #robotics, #nanochemistry, #machinelearning, #UCNPs & simulations, see the link below!

combinano.lbl.gov/openings
Chan Group @ Molecular Foundry - openings
The Chan group welcomes inquiries from motivated, creative, and independent researchers at all levels (postdoctoral fellows, graduate students, undergraduates, and visitors), even in the absence of post...
combinano.lbl.gov
Reposted by Sam Blau
ewcspottesmith.bsky.social
Interested in learning more about our recently published OMol25 dataset and the advances that it's bringing to atomistic machine learning? Check out this talk that my boy @samblau.bsky.social gave as part of the "Modeling Talk Series".

#CompChem ⚗️ 🧪 #SciML
Modeling Talk Series - The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models
Samuel Blau, Berkeley Lab Video Recording Slides (pptx, pdf)
sites.google.com
samblau.bsky.social
I'm presenting OMol25 tomorrow 7/29 at 9 AM PST as part of a talk series at Google. Learn how we built the dataset + how MLIPs trained on OMol are revolutionizing comp chem!
Meet: lnkd.in/g4AAWkcK
YouTube Stream: lnkd.in/ggmtMtTR
Join group: lnkd.in/g5ciuNuX
samblau.bsky.social
OMol25 was calculated with ORCA. I want to acknowledge the work of the ORCA team to improve the quality of the gradient + the robustness of SCF convergence for complicated systems as part of the OMol effort - it was much appreciated and critical to ensuring that we're releasing high quality data!
faccts.de
FACCTs @faccts.de · May 15
“Built with the high-performance quantum chemistry program package ORCA (Version 6.0.1), OMol25 contains simulations of large atomic systems that, until now, have been out of reach.” - Meta

#ORCAqc #ORCA6 #CompChem #QuantumChem #ML #Meta

ai.meta.com/blog/meta-fa...

arxiv.org/abs/2505.08762
Sharing new breakthroughs and artifacts supporting molecular property prediction, language processing, and neuroscience
Meta FAIR is sharing new research artifacts that highlight our commitment to advanced machine intelligence (AMI) through focused scientific and academic progress.
ai.meta.com
Reposted by Sam Blau
berkeleylab.lbl.gov
🚨 Just dropped: Open Molecules 2025 — a record-breaking dataset co-led by Berkeley Lab + Meta FAIR.

100M+ DFT snapshots. Built to train #AI for real-world chemistry 🧪.

Could reshape discovery in batteries, drug discovery & much more! @cs.lbl.gov ⬇️
Computational Chemistry Unlocked: A Record-Breaking Dataset to Train AI Models has Launched - Berkeley Lab
Scientists will finally be able to simulate the chemistry that drives our bodies, our environment, and our technologies.
newscenter.lbl.gov
samblau.bsky.social
We can't wait to see what the community does with OMol! Don't hesitate to reach out with feedback on the data, models, or paper - we aren't going to submit to a journal until the leaderboard goes up, which means we have time to incorporate community feedback (within reason) 10/10
samblau.bsky.social
A special shout out to co-first authors Daniel Levine and Muhammed Shuaibi who moved mountains making OMol a reality. I also want to recognize the substantial and critical contributions of @ewcspottesmith.bsky.social, Michael Taylor, Muhammad Hasyim, and Kyle Michel 9/N
samblau.bsky.social
Co-leading OMol with Brandon and Larry was a joy and an honor - as was assembling a world-leading team of scientists from 2 companies, 2 national labs, and 6 universities who were excited to help build an open-source, revolutionary molecular DFT dataset to push science forward 8/N
samblau.bsky.social
Right now, OMol data has energy, forces, partial charges, partial spins, and HOMO/LUMO. But we have far more info that we still need to parse, and we hope to run a battery of GBW post-processing analyses. Plus we have 10 petabytes of electron densities. Lots more to come! 7/N
samblau.bsky.social
And check out the UMA demo (facebook-fairchem-uma-demo.hf.space; UMA is trained on OMol + other FAIR Chemistry datasets) - metal complexes at +1 vs +2 correctly optimize to tetrahedral/planar, and reduced ethylene carbonate correctly ring-opens while neutral EC remains stable 6/N
Gradio
facebook-fairchem-uma-demo.hf.space
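For anyone who wants to try the same experiments outside the hosted demo, a minimal local sketch might look like the following. The specific entry points (the "uma-s-1" checkpoint name, FAIRChemCalculator, the "omol" task name, and passing charge/spin via atoms.info) are assumptions based on the FAIRChem release, not something stated in the post, so check the FAIRChem docs for the exact identifiers.

```python
# Hedged sketch: relaxing ethylene carbonate (EC) at two charge states with a
# UMA-style MLIP via ASE. Checkpoint and calculator names are assumptions.
from ase.io import read
from ase.optimize import BFGS
from fairchem.core import FAIRChemCalculator, pretrained_mlip

predictor = pretrained_mlip.get_predict_unit("uma-s-1", device="cpu")

for charge, spin in [(0, 1), (-1, 2)]:  # neutral EC vs. reduced (radical anion) EC
    atoms = read("ethylene_carbonate.xyz")  # user-supplied starting geometry
    atoms.info["charge"] = charge           # OMol-trained models condition on total charge...
    atoms.info["spin"] = spin               # ...and spin multiplicity
    atoms.calc = FAIRChemCalculator(predictor, task_name="omol")
    BFGS(atoms).run(fmax=0.02)              # the anion should ring-open on relaxation
    print(charge, atoms.get_potential_energy())
```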
samblau.bsky.social
We're also releasing baseline models trained on OMol. To guide future MLIP development, we built novel evaluations on intermolecular interactions, conformers, and charge/spin. We hope to include frequency, ΔG, and TSopt tasks when we put up a public leaderboard in the summer 4/N
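One of the evaluation families mentioned above, intermolecular interactions, reduces to comparing the model's interaction energies against the DFT reference. Below is a generic sketch of that bookkeeping with any ASE-compatible calculator; it is not the released OMol25 evaluation code, and `calc` is a placeholder for whatever MLIP calculator you use.

```python
# Interaction-energy check: E_int = E(AB) - E(A) - E(B), the quantity an
# intermolecular-interaction evaluation compares against the DFT reference.
# `calc` is a placeholder ASE calculator, not the official evaluation harness.
from ase.io import read

def interaction_energy(dimer_xyz, monomer_a_xyz, monomer_b_xyz, calc):
    energies = []
    for path in (dimer_xyz, monomer_a_xyz, monomer_b_xyz):
        atoms = read(path)
        atoms.calc = calc
        energies.append(atoms.get_potential_energy())
    e_ab, e_a, e_b = energies
    return e_ab - e_a - e_b  # in eV under ASE conventions
```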
samblau.bsky.social
OMol was constructed via an unprecedented diversity of methods: MD, ML-MD, RPMD, rattling, Architector, rxn path interpolation, AFIR, optimization, and scaled separation. We also recalculated some previous datasets and did additional sampling/structure generation atop others 3/N
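Most of those sampling strategies have off-the-shelf support; as one concrete illustration, "rattling" is just a small random perturbation of an optimized geometry, which ASE exposes directly. This is a generic example, not the actual OMol25 generation script.

```python
# Generic "rattling" illustration: perturb a relaxed structure with Gaussian
# displacements to sample off-equilibrium geometries (not the OMol25 pipeline).
from ase.io import read, write

atoms = read("optimized_molecule.xyz")    # user-supplied relaxed structure
for i in range(10):
    perturbed = atoms.copy()
    perturbed.rattle(stdev=0.05, seed=i)  # ~0.05 Å random displacements per atom
    write(f"rattled_{i:02d}.xyz", perturbed)
```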
samblau.bsky.social
OMol covers 83 elements, a wide range of intra- and intermolecular interactions, explicit solvation, reactive structures, conformers, charges from -10 to +10, 0-10 unpaired electrons, and 2-350 atoms per snapshot. It required >6B CPU hrs, 10x more than any previous MLIP training dataset 2/N
samblau.bsky.social
The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N
samblau.bsky.social
It was a pleasure to give an IIDAI seminar on nanoparticle ML for gradient-based heterostructure optimization (w/ @emorychannano.bsky.social ) and neural network path opt for finding reaction transition states on MLIPs (w/ @thglab.bsky.social) - find the talk here: www.youtube.com/watch?v=-4jB...
IIDAI Seminar, 5/1/2025, Samuel M. Blau (Berkeley Lab)
YouTube video by Coordinated Science Laboratory
www.youtube.com
Reposted by Sam Blau
andrewrosen.bsky.social
🧠 New postdoctoral researcher position at Princeton for those interested in data science and machine learning! Specify my group if you are interested in working together. Deadline is May 31. Details: puwebp.princeton.edu/AcadHire/app...
puwebp.princeton.edu
samblau.bsky.social
Final day to submit abstracts for ACS Fall 2025! Reminder that @ewcspottesmith.bsky.social, Brett Savoie (Notre Dame), and I are organizing a symposium on "Chemical Reaction Networks, Retrosynthesis, and Reaction Prediction". Will be a mix of invited and contributed talks - please submit! #CompChem
Reposted by Sam Blau
gabegomes.bsky.social
the @gpggrp.bsky.social is at the ACS Spring 2025! come check out the works of Daniil Boiko and Rob MacKnight at the "ML + AI in Organic Chemistry" Symposium (Hall B-1, Room 4) today! extreme scaling of experimental chemical reactions via MS and an OS for autonomous comp chem!
samblau.bsky.social
Looking forward to speaking at ACS on Sunday at 5:30! Come learn about "Popcornn" - a new method for double-ended transition state optimization atop machine-learned interatomic potentials that is substantially better than NEB or GSM.
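Popcornn itself isn't described in the post, but the baselines it is benchmarked against are standard. For orientation, a double-ended search with ASE's climbing-image NEB on top of an MLIP calculator looks roughly like the sketch below; `make_mlip_calculator()` is a hypothetical placeholder, and this shows the comparison method, not Popcornn.

```python
# Baseline double-ended search (climbing-image NEB in ASE) between known
# reactant and product geometries. Illustrates what "double-ended" means;
# this is NOT Popcornn. make_mlip_calculator() is a hypothetical helper that
# returns your ASE-compatible MLIP calculator.
from ase.io import read
from ase.mep import NEB
from ase.optimize import FIRE

reactant = read("reactant.xyz")
product = read("product.xyz")

images = [reactant] + [reactant.copy() for _ in range(8)] + [product]
neb = NEB(images, climb=True)         # climbing image homes in on the saddle point
neb.interpolate(method="idpp")        # fill in the interior images
for image in images[1:-1]:
    image.calc = make_mlip_calculator()
FIRE(neb).run(fmax=0.05)              # converge the band; highest image ≈ TS
```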
samblau.bsky.social
Fantastic new work from Aditi & co that shows how to leverage the expressivity + accuracy of massive pre-trained MLIPs to distill smaller, much faster models that are still extremely accurate to drive downstream simulations - no need to compromise on speed vs accuracy!
ask1729.bsky.social
1/ Machine learning force fields are hot right now 🔥: models are getting bigger + being trained on more data. But how do we balance size, speed, and specificity? We introduce a method for distilling large-scale MLFFs into fast, specialized MLFFs! More details below:

#ICLR2025
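At its core, the distillation idea in the quoted thread is supervised training of a small student MLIP on quantities predicted by a large pre-trained teacher. The sketch below is a generic energy/force-matching loss to make that concrete; the paper's actual recipe and training targets may differ, and `student`/`teacher` are placeholder callables returning (energy, forces).

```python
# Generic distillation step: fit a small student MLIP to a frozen teacher's
# energy and force predictions. Schematic only -- not the authors' code, and
# the paper's specific distillation targets may differ.
import torch
import torch.nn.functional as F

def distillation_loss(student, teacher, batch, w_energy=1.0, w_forces=10.0):
    with torch.no_grad():
        e_teacher, f_teacher = teacher(batch)   # labels from the frozen teacher
    e_student, f_student = student(batch)       # student predictions
    return (w_energy * F.mse_loss(e_student, e_teacher)
            + w_forces * F.mse_loss(f_student, f_teacher))
```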
samblau.bsky.social
Applications closing in one week! If you’re interested in a prestigious postdoc at the intersection of AI/ML and nuclear nonproliferation, don’t hesitate to apply - come work with me on fascinating f-block chemistry and computational/ML methods! (Must be a US citizen)
Reposted by Sam Blau
ewcspottesmith.bsky.social
@samblau.bsky.social, Brett Savoie (Notre Dame), and I are organizing a symposium for @amerchemsociety.bsky.social Fall 2025 called "Chemical Reaction Networks, Retrosynthesis, and Reaction Prediction" under @acscomp.bsky.social.

#reactionnetwork #CRN #retrosynthesis 🧪 ⚗️ #CompChem
Reposted by Sam Blau
chemrxivbot.bsky.social
Inverse Design of Complex Nanoparticle Heterostructures via Deep Learning on Heterogeneous Graphs

Authors: Eric Sivonxay, Lucas Attia, Evan Walter Clark Spotte-Smith, Benjamin Lengeling, Xiaojing Xia, Daniel Barter, Emory Chan, Samuel Blau
DOI: 10.26434/chemrxiv-2024-1dw4q