Aditi Krishnapriyan
@ask1729.bsky.social
Assistant Professor at UC Berkeley
ask1729.bsky.social
6/ The distilled MLFFs are much faster to run than the original large-scale MLFF: not everyone has the GPU resources to use big models, and many scientists only care about studying specific systems (w/ the correct physics!). This is a way to get the best of all worlds!
ask1729.bsky.social
5/ We can also balance efficient training at scale (often w/ minimal constraints) with distilling the correct physics into the small MLFF at test time: e.g., taking energy gradients to get conservative forces, and ensuring energy conservation for molecular dynamics.
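A minimal sketch of the test-time physics step mentioned above, assuming a differentiable PyTorch-style energy model; `energy_fn` and the positions tensor are hypothetical stand-ins for a trained MLFF and an atomic configuration:

```python
import torch

def conservative_forces(energy_fn, positions):
    """Compute forces as the exact negative gradient of the predicted energy,
    F = -dE/dx, so the force field is conservative by construction
    (a requirement for energy-conserving molecular dynamics)."""
    pos = positions.clone().detach().requires_grad_(True)
    energy = energy_fn(pos)  # scalar (or per-structure) energy prediction
    forces = -torch.autograd.grad(energy.sum(), pos)[0]
    return energy.detach(), forces
```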
ask1729.bsky.social
4/ Smaller, specialized MLFFs distilled from the large-scale model are more accurate than models trained from scratch on the same subset of data: the representations from the large-scale model help boost performance, while the smaller models are much faster to run.
ask1729.bsky.social
3/ We formulate our distillation procedure as follows: the smaller MLFF is trained to match Hessians of the large-scale model's energy predictions (using subsampling methods to improve efficiency). This works better than distillation methods that try to match features.
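A minimal sketch of one way such a Hessian-matching loss can be set up, assuming PyTorch-style energy models; `student_energy_fn`, `teacher_energy_fn`, and the one-hot row-subsampling strategy are illustrative assumptions, not the paper's exact implementation:

```python
import torch

def subsampled_hessian_rows(energy_fn, pos, row_idx):
    """Selected rows of the energy Hessian d2E/dx2, each obtained as a
    Hessian-vector product so the full Hessian is never materialized."""
    grad = torch.autograd.grad(energy_fn(pos).sum(), pos, create_graph=True)[0].reshape(-1)
    rows = []
    for i in row_idx:
        one_hot = torch.zeros_like(grad)
        one_hot[i] = 1.0
        row = torch.autograd.grad(grad, pos, grad_outputs=one_hot,
                                  retain_graph=True, create_graph=True)[0]
        rows.append(row.reshape(-1))
    return torch.stack(rows)

def hessian_distillation_loss(student_energy_fn, teacher_energy_fn, positions, num_rows=4):
    """Train the small MLFF to match randomly subsampled Hessian rows of the teacher."""
    pos = positions.clone().detach().requires_grad_(True)
    row_idx = torch.randint(0, pos.numel(), (num_rows,))
    teacher_rows = subsampled_hessian_rows(teacher_energy_fn, pos, row_idx).detach()
    student_rows = subsampled_hessian_rows(student_energy_fn, pos, row_idx)
    return torch.nn.functional.mse_loss(student_rows, teacher_rows)
```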
ask1729.bsky.social
2/ Model distillation involves transferring the general-purpose representations learned by a large-scale model into smaller, faster models: in our case, specialized to specific regions of chemical space. We can use these faster MLFFs for a variety of downstream tasks.
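For illustration, a bare-bones version of this kind of distillation loop, assuming PyTorch models and a hypothetical `subset_loader` that yields configurations from the target region of chemical space (the actual procedure matches Hessians, as in the sketch above):

```python
import torch

def distill_to_subset(student, teacher, subset_loader, optimizer, epochs=1):
    """Fit a small, specialized MLFF to the large model's energy predictions on a
    subset of chemical space; the frozen teacher's outputs serve as training targets."""
    teacher.eval()
    for _ in range(epochs):
        for batch in subset_loader:
            pos = batch["positions"]  # hypothetical batch format
            with torch.no_grad():
                target_energy = teacher(pos)  # frozen large-scale model
            loss = torch.nn.functional.mse_loss(student(pos), target_energy)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```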
ask1729.bsky.social
1/ Machine learning force fields are hot right now 🔥: models are getting bigger + being trained on more data. But how do we balance size, speed, and specificity? We introduce a method for distilling large-scale MLFFs into fast, specialized MLFFs! More details below:

#ICLR2025
ask1729.bsky.social
Would also appreciate being added, thanks!