Ian Lee
banner
ianhhlee.bsky.social
Ian Lee
@ianhhlee.bsky.social
Emerging Technologies | Founder and CTO

LinkedIn: https://www.linkedin.com/in/ian-lee-9b2b6920/
Reposted by Ian Lee
Horseshit. Every business in the world has discovered in the last several months that GenAI is not in fact smart enough replace most of their employees.

Whatever you are reading from these (often gamed, sometimes contaminated) benchmarks does not reflect real-world real-world reality.
May 8, 2025 at 1:48 AM
Reposted by Ian Lee
New data throws some cold water on AI accuracy, via @nitasha.bsky.social:
A test by Vals AI of models from OpenAI, Anthropic, Meta, Google, etc. found that all scored LESS THAN 50% accuracy on average for simple tasks required of entry-level financial analysts.
www.washingtonpost.com/politics/202...
Analysis | AI tools mostly fumble basic financial tasks, study finds
The Washington Post’s essential guide to tech policy news.
www.washingtonpost.com
April 23, 2025 at 8:01 PM
Reposted by Ian Lee
Grok 3 required training two new massive data centers operating full time for months, and 15x the compute of Grok 2 — yet all these kinds of errors feel awfully familiar." ~ @garymarcus.bsky.social

garymarcus.substack....

#AI
3/3
Grok 3 Beta in Shambles
Maximal Truth still seems far away
garymarcus.substack.com
February 21, 2025 at 2:30 AM
Because some of us are entering into 2025 way too wound up
January 16, 2025 at 5:21 AM
The gauntlet has been thrown down.
#generativeai
December 31, 2024 at 6:05 AM
Awesome stuff from Gwen! Happy reading!
A bit late, but finally published my collection of good papers, blogs, videos (and one book). So we can all enjoy reading over the holidays and start 2025 inspired.

open.substack.com/pub/hackings...
What every SaaS developer should read on vacation - 2024 Edition
The annual round-up of long-ish papers and videos. Start the new year with new knowledge and inspiration.
open.substack.com
December 28, 2024 at 7:01 AM
Been looking forward to this! I particularly like her apposition of o3's performance with strategies utilized by previous ARC-AGI winners, which left her "a bit disappointed: each of these methods went against the assumptions that I described above that, for me at least, made ARC so attractive"
December 25, 2024 at 4:48 AM
Looking forward to Melanie Mitchell's take.
Nathan Lambert's is sufficiently well-balanced and worth the read, in particular his statement "and add humility, here’s an example from the ARC prize that o3 did not solve. It’s very easy. We clearly have a ways to go, but you should be excited ... "
Excellent post about the recent OpenAI o3 results on ARC (& other benchmarks). I don't know how @natolambert.bsky.social manages to write these so quickly! I highly recommend his newsletter.

www.interconnects.ai/p/openais-o3...

I am (more slowly) writing my own take on all this, coming soon.
o3: The grand finale of AI in 2024
A step change as influential as the release of GPT-4. Reasoning language models are the current big thing.
www.interconnects.ai
December 21, 2024 at 11:10 PM
Reposted by Ian Lee
How many times do we have to see this same movie, where an AI beats some benchmark and influencers gleefully shout “It’s So Over” without even trying out the AI and then on careful inspection the AI turns out to not be robust or reliable?

Thousands?

(It’s already been hundreds.)
December 21, 2024 at 12:59 AM
Reposted by Ian Lee
The first open source program to compete in an International Mathematical Olympiad and get a gold medal (with problems and solutions in normal human format and solutions generated within the same time limit as that for human contestants) will get $5m.
December 12, 2024 at 11:46 PM
Definitely a balanced view in contrast to all that flip-flopping
New AI Snake Oil essay: Last month the AI industry's narrative suddenly flipped — model scaling is dead, but "inference scaling" is taking over. This has left people outside AI confused. What changed? Is AI capability progress slowing? We look at the evidence. 🧵 www.aisnakeoil.com/p/is-ai-prog...
December 19, 2024 at 2:06 PM
Reposted by Ian Lee
If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.
December 19, 2024 at 12:55 AM
I interviewed an early career candidate whose resume showed fine tuning Open AI GPT models. I asked her to tell me about it and she said, "honestly it was maddening. It constantly felt like we were on the edge of getting something useful then it would be a mess in a new way." I knew she was legit.
December 14, 2024 at 8:05 AM