Christian Cao
@christiancao.bsky.social
3rd-year MD student @UofT. Fitness, music, and nature enthusiast! Automating evidence synthesis @ ottosr.com
christiancao.bsky.social
Today we’re announcing otto-SR, an AI workflow to perform systematic reviews 3000x faster

By using gpt-4.1 and o3-mini, otto-SR matches or beats humans at screening and data extraction

In two days, we conducted 12 work-years of Cochrane research, finding nearly 2x as many relevant papers and altering key conclusions 🧵
christiancao.bsky.social
Our work isn’t done. We’re currently working on incorporating search and risk of bias assessments. If you’re interested in connecting, please don’t hesitate to reach out!
christiancao.bsky.social
... Bijan Teja; Alexander Leung; Lina Ghosn; Rahul Arora; Michael Noetel; David B. Emerson
christiancao.bsky.social
Plus all other researchers involved: Katherine Manta; Elina Farahani; Matthew Cecere; Anabel Selemon; Linsey Gong; Robert Kloosterman; Scott Jiang; Richard Saleh; Denis Margalik; James Lin; Jane Jomy; Jerry Xie; David Chen; Jaswanth Gorla; Sylvia Lee; Kelvin Zhang; Harriet Ware; Mairead Whelan...
christiancao.bsky.social
If you’re a researcher who works on systematic reviews: we’re offering otto-SR through a free pilot at ottosr.com/sign-up/
otto-SR | The AI Systematic Review Platform
Systematic reviews in hours, not months.
ottosr.com
christiancao.bsky.social
In one review, otto-SR found 5 more relevant papers than the original authors (9 vs 4 studies).

These additional studies revealed that taking immune-enhancing nutrition before gastric surgery likely reduces hospital stay by over 1 day (p<0.05).
christiancao.bsky.social
When we reproduced an entire Cochrane issue of reviews (n=12 reviews), otto-SR found nearly twice as many relevant studies (n=114) compared to the original reviews (n=64).

These additional studies materially changed conclusions: two reviews became statistically significant, one lost significance.
christiancao.bsky.social
Through rigorous benchmarking, otto-SR consistently outperformed or matched humans in:
- screening sensitivity (otto-SR: 96.7% vs. human: 81.7%)
- screening specificity (otto-SR: 97.9% vs. human: 98.1%)
- data extraction accuracy (otto-SR: 93.1% vs. human: 79.7%)
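For anyone less familiar with the metrics in this post, here is a minimal sketch of how sensitivity, specificity, and accuracy are usually defined. The function names and the counts in the example are illustrative assumptions only; they are not otto-SR code and not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly relevant papers that the screener included."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no relevant papers in the sample")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of truly irrelevant papers that the screener excluded."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no irrelevant papers in the sample")
    return true_neg / total


def accuracy(correct: int, total: int) -> float:
    """Share of extracted data points that match the reference answer."""
    if total == 0:
        raise ValueError("no data points in the sample")
    return correct / total


# Hypothetical counts, chosen only to show the arithmetic.
print(f"sensitivity: {sensitivity(90, 10):.1%}")   # 90 of 100 relevant papers kept
print(f"specificity: {specificity(950, 50):.1%}")  # 950 of 1000 irrelevant papers dropped
print(f"accuracy:    {accuracy(80, 100):.1%}")     # 80 of 100 data points correct
```

High sensitivity matters most in screening, since a missed relevant paper never reaches the later stages of the review.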