Charlie Harris
@harrisbio.bsky.social
PhD @ Cambridge in AI for Bio | Interested in generative modelling for drug discovery and science policy 🇬🇧 Website: cch1999.github.io Blog: harrisbio.substack.com Database: harrisbio.notion.site
harrisbio.bsky.social
Small personal update: very pleased to be in Singapore next week to present 2 spotlight papers at ICLR 2025 on AI for molecular design!! 🇸🇬

DM me if you want to meet up and chat about AI for bio, drug discovery, science policy or just chat about aviation!!!
harrisbio.bsky.social
16/ What’s clear is that getting data and compute right is essential—not just for breakthroughs in science but for keeping the UK competitive globally.

Here’s hoping this plan gets the funding, leadership, and focus it needs to succeed!

Happy to chat about any of this.

end
harrisbio.bsky.social
15/ Side note: UK universities could supercharge their AI teaching by embracing industry expertise (where the real knowledge is)

I teach a course at Cambridge led by a DeepMind researcher, and it’s the most popular in the department.
harrisbio.bsky.social
14/ US universities do this well with CS minors, which foster computational literacy across disciplines.

The UK could adopt similar models to produce scientists who are not only domain experts but also skilled at applying AI tools to their fields.
harrisbio.bsky.social
13/ The plan has solid ideas on AI skills, but it's not *just* about creating more "AI graduates." We need to train domain experts in the natural sciences to understand and use AI effectively.

Almost all scientists should know neural networks as well as they know Excel and stats
harrisbio.bsky.social
12/ Another standout: the plan proposes an internal headhunting team within the UK Government to attract top global talent to AISI, the UK Sovereign AI Team, and UK-based companies.

Will they also have the power to fast-track visas? From experience, I hope so...
harrisbio.bsky.social
11/ The UK Sovereign AI Team could be a great connector of
-Public institutions creating scientific datasets
-Industrial labs capable of training models on those datasets

This sort of collaboration could really unlock breakthroughs in science
harrisbio.bsky.social
10/ The plan’s proposal to create a UK Sovereign AI Team is great. This unit will partner with private and academic sectors to back national champions and remove roadblocks in AI, with a strong focus on AI for science and robotics.
harrisbio.bsky.social
(Usual reminder that AlphaFold3 was trained for 120k+ GPU hours... this is multiple times more than the whole compute budget of my lab this year)
harrisbio.bsky.social
9/ Another question: who will the AIRR programme directors work for? UKRI? ARIA?

Will they be empowered to deploy large amounts of compute into highly productive groups at the cutting edge?

There is no point in this if it means everyone only gets a few GPU hours each.
harrisbio.bsky.social
9/ One question: will these AIRR programme directors also decide how funding is allocated for data generation?

For scientific initiatives, compute and data strategies are deeply interconnected. Ideally, the same person would oversee both to ensure alignment.
harrisbio.bsky.social
8/ Another standout is the creation of AIRR programme directors—mission-focused individuals with autonomy to strategically allocate compute to high-potential projects.

A kind of "Compute Czar" role, this could significantly accelerate progress on big bets in AI for science.
harrisbio.bsky.social
7/ However, there’s a risk of duplicating efforts where existing world-class institutions, like the EBI managing the PDBe, are already doing excellent work.

Not every problem needs to fit into a National Data Library-sized™ hole. Let’s build on what we already have!
harrisbio.bsky.social
6/ People often say, "Big Pharma has lots of data!"—but much of it is unstructured and sparse, making it unsuitable for deep learning.

The plan acknowledges this challenge and recommends creating better infrastructure and incentives to make datasets AI-ready.
harrisbio.bsky.social
5/ That’s why I’m thrilled to see the plan emphasise strategic data initiatives:
-Identifying high-impact datasets
-Improving data quality
-Incentivising researchers and companies to unlock and curate datasets

These efforts will make sparse, unstructured datasets better for AI
harrisbio.bsky.social
4/ If we want breakthroughs beyond protein folding, we need to address data gaps across science.

AlphaFold was made possible by sustained investment in protein structure data.

Similar long term commitments are essential for other fields like materials and climate science.
harrisbio.bsky.social
3/ AI breakthroughs like AlphaFold wouldn’t be possible without decades of work on datasets.

e.g., AlphaFold was trained on protein structures from the Protein Data Bank (PDB), which took 50+ years and ~$20 *billion* to create.

This is the kind of foundational effort AI needs.
harrisbio.bsky.social
2/ There’s a lot to like in this plan:
- Expanding UK AI compute capacity by 20x
- Establishing AI Growth Zones
- Building up AI talent pipelines

But as a scientist, what excites me most is the report’s focus on **data**—an area we really need to get right.
harrisbio.bsky.social
1/ Just read through the Matt Clifford AI Action Plan now.

Tl;dr: it's great but here are a few things that stood out to me as someone interested in AI for Science and sovereign compute and data capability.

A thread: 🧵
Reposted by Charlie Harris
austinjtripp.bsky.social
A common issue I see in ML, both from ML "experts" and "users", is overly optimistic assumptions.

"experts" (people designing algs) usually assume the data is very simple

"users" (people using algs) usually assume that algorithms are more robust than they really are

Conclusion: always be careful!
harrisbio.bsky.social
Added NewCo Kerna Labs, a new AI-first mRNA payload design company founded by former Moderna CSO with $6M in seed.

Also added new Cradle Bio series B worth $73M
harrisbio.bsky.social
Now I can share external links without the posts being down-regulated - I thought I would share this again! :)

I have compiled a list of now 100+ companies in the 'TechBio' space into a fully open database for the community.

Find it here and please share if you like
open.substack.com/pub/harrisbi...
harrisbio.bsky.social
Just added Graph Therapeutics, a new startup in Vienna focusing on precision medicine for inflammation and immunology

Founded by former Allcyte team
Reposted by Charlie Harris
lindorfflarsen.bsky.social
📣 Save the dates 📅

We are organizing a Benzon Symposium on "Protein structure prediction and design" with what I think is an amazing set of speakers

Meeting will take place in Copenhagen 🇩🇰 on Sept. 1–4, 2025, and abstract submission will open in March (benzon-foundation.dk/benzon-sympo...)
Flyer for a Benzon Symposium on Protein structure prediction and design in biology and pharmacology (Sept. 1–4, 2025). Speakers include: Gabriel Rocklin, Birte Höcker, Amy Keating, Tanja Kortemme, Bruno Correia, Sarel Fleishman, Ashutosh Chilkoti, Minkyung Baek, Noelia Ferruz, James Fraser, Alan Moses, Susan Marqusee, Ben Lehner, Mohammed AlQuraishi, Dek Woolfson, Gustav Oberdorfer, Hannah Wayment-Steele, Ora Schueler-Furman, Jennifer Listgarten, Alexander Rives, & Max Bonomi.