Lightnews — Scholar-powered news

Reposted

Jorge Bravo Abad @bravo-abad.bsky.social · 9d

Predictive chemistry often struggles with scarce data. Surrogate models can help, but should we use their predicted QM descriptors or hidden embeddings? Chen & Stuyver show that hidden spaces usually win—faster, more robust, and data-efficient. pubs.rsc.org/en/content/a...

Harnessing surrogate models for data-efficient predictive chemistry: descriptors vs. learned hidden representations

Predictive chemistry often faces data scarcity, limiting the performance of machine learning (ML) models. This is particularly the case for specialized tasks such as reaction rate or selectivity predi...

pubs.rsc.org

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

Take-home: CYCLO70 exposes where DFAs struggle, helping identify robust & transferable methods. A valuable tool for confident predictions in pericyclic reactivity (6/6)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

Real-world test: self-healing polymer Diels–Alder reactions.
Functionals that looked fine on BH9 underperform badly—while CYCLO70-validated functionals (ωB97M-V, κPr2SCAN) remain robust. (5/6)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

Best performers? 🏆
- ωB97M-V (range-separated hybrid)
- PBE-QIDH (double hybrid, even improves on CYCLO70)
- M06-2X & r2SCAN0 (most balanced hybrids) (4/6)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

Tested across 93 functionals, errors on CYCLO70 are far larger than on BH9PERI. This dataset captures the worst-case scenarios you might encounter in screening or reactivity modeling. (3/6)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

Why CYCLO70?
Popular datasets like BH9 are biased toward “easy” cases. They give an overly optimistic picture of DFT accuracy. CYCLO70 is built to probe the hardest regions of the pericyclic reaction space. (2/6)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · 26d

New paper from our group out in JCTC! We introduce CYCLO70 — a benchmarking set of 70 challenging cycloaddition reactions (Diels–Alder, dipolar, sigmatropic).
👉 doi.org/10.1021/acs.... (1/6)

CYCLO70: A New Challenging Pericyclic Benchmarking Set for Kinetics and Thermochemistry Evaluation

Here, a new challenging benchmarking data set for cycloaddition reactions, CYCLO70, is presented and analyzed. CYCLO70 has been generated with the specific aim of being representative of the most chal...

doi.org

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 13

Main conclusion of this project: we find that hidden representations extracted from surrogate models generally outperform predicted QM descriptors, particularly when descriptor selection is not tightly aligned with the downstream task

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 13

Harnessing Surrogate Models for Data-efficient Predictive Chemistry: Descriptors vs. Learned Hidden Representations | ChemRxiv - doi.org/10.26434/che...

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 11

Performing a PCA on the errors across the dataset, we show not only that the errors of different functionals correlate to a significant extent, but also that functionals on the same rung of Jacob’s ladder cluster together in the resulting plot (5/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 11

We observe that only one functional, the range-separated hybrid ωB97M-V, reaches “chemical accuracy” for barriers and reaction energies; among the double hybrids, PBE-QIDH performs best, and among the hybrids, M06-2X and r2SCAN50 exhibit the lowest errors (4/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 11

CYCLO70 is a challenging benchmarking dataset for pericyclic reactions. Testing 93 distinct functionals, we observe that the errors on CYCLO70 are significantly larger than those on the cycloaddition subset of BH9, the most popular benchmarking set for this reaction class (3/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 11

Continuing our recent efforts in constructing more challenging/representative benchmarking datasets with the help of active learning, we present here CYCLO70 (2/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jun 11

New preprint -- CYCLO70: A New Challenging Pericyclic Benchmarking Set for Kinetics and Thermochemistry Evaluation t.co/O6309jKaJq (1/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · May 19

Overall, this hybrid ML–computational chemistry approach enables data-efficient discovery of thermally responsive DA reactions, advancing the rational design of self-healing polymers with tunable properties (5/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · May 19

We first leverage our models to screen a comprehensive reaction space of synthetic diene-dienophile pairs, and subsequently use them to mine a database of commercially available natural products (4/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · May 19

Refining only a small fraction of these profiles with DFT, we can train a robust ML model that predicts reaction characteristics with excellent accuracy. Adding a graph-based model to the workflow for pre-screening enables expansion to reaction spaces of 100k+ reactions (3/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · May 19

In this work, we present a hierarchical workflow that integrates ML with automated reaction profile calculations to efficiently screen DA reaction spaces. Using our in-house TS-tools software, we first rapidly generate reaction profiles at the semi-empirical xTB level (2/5)

thijsstuyver.bsky.social @thijsstuyver.bsky.social · May 19

New preprint from our group: Screening Diels-Alder reaction space to identify candidate reactions for self-healing polymer applications (1/5)

chemrxiv.org/engage/chemr...

Reposted

ChemRxiv Bot @chemrxivbot.bsky.social · Feb 12

What can be learned from the electrostatic environments within nitrogenase enzymes?

Authors: Thijs Stuyver, Olena Protsenko, Davide Avagliano, Thomas Ward
DOI: 10.26434/chemrxiv-2025-dndx6

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Feb 10

Happy to see this nice collaboration with @moranlabchem.bsky.social out in @ChemistryEur -- Abiotic Ribonucleoside Formation in Aqueous Microdroplets: Mechanistic Exploration, Acidity, and Electric Field Effects. @javialra97.bsky.social chemistry-europe.onlinelibrary.wiley.com/doi/full/10....

Abiotic Ribonucleoside Formation in Aqueous Microdroplets: Mechanistic Exploration, Acidity, and Electric Field Effects

A computational investigation of the reported abiotic phosphorylation of ribose and the subsequent formation of ribonucleosides reveals that the most plausible reaction mechanism involves the protona...

chemistry-europe.onlinelibrary.wiley.com

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Feb 10

Now out in JCTC: Improving the Reliability of, and Confidence in, DFT Functional Benchmarking through Active Learning @javialra97.bsky.social pubs.acs.org/doi/full/10....

Improving the Reliability of, and Confidence in, DFT Functional Benchmarking through Active Learning

Validating the performance of exchange-correlation functionals is vital to ensure the reliability of density functional theory (DFT) calculations. Typically, these validations involve benchmarking dat...

pubs.acs.org

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jan 31

One more week to apply!

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jan 8

We’re reopening applications! A 5-year position as Data Steward/Research Engineer for the chemistry departments of @psl-univ.bsky.social is available.

💼 New deadline: February 7th.
📄 More info & application details below.
🔁 Reposts appreciated! 🙌

thijsstuyver.bsky.social @thijsstuyver.bsky.social · Jan 14

This is (to some extent) negotiable, and it will also depend on the experience of the selected candidate. In any case, it will be significantly higher than a typical postdoc salary in France, probably between €3000 and €3500 a month net (after all taxes)
