GasStationManager
gasstationmanager.bsky.social
GasStationManager
@gasstationmanager.bsky.social
Independent AI Researcher.
Blogging at https://gasstationmanager.github.io/
Building stuff at https://github.com/GasStationManager
Pinned
My essay on making coding AIs safe and hallucination-free gasstationmanager.github.io/ai/2024/11/0...

#LLM #AISafety #blueskAI
A Proposal for Safe and Hallucination-free Coding AI
Update 11/19: First progress report.
gasstationmanager.github.io
LeanTool github.com/GasStationMa...
Now supports Sonnet 3.7; also available as a model-context-protocol (MCP) server that can be plugged into Claude Code.
GitHub - GasStationManager/LeanTool: A "code intepreter" for Lean
A "code intepreter" for Lean. Contribute to GasStationManager/LeanTool development by creating an account on GitHub.
github.com
February 26, 2025 at 6:03 PM
A couple updates to LeanTool github.com/GasStationMa... : Now supporting reasoning models like o3-mini and r1; also with a demo proxy server up at the base URL www.codeproofarena.com:8800/v1
www.codeproofarena.com
February 15, 2025 at 7:58 PM
More results from my hallucination detection and recovery experiments:

gasstationmanager.github.io/ai/2025/02/0...

Try your favorite model/prompt/reasoning strategy on these!
Followup on Hallucination Detection and Recovery (progress report #3.5)
In this followup update to my initial experiment, I report on some additional results as I apply the same approach to more problem instances.
gasstationmanager.github.io
February 5, 2025 at 7:43 PM
January 30, 2025 at 8:42 PM
How to detect hallucinations in the output of coding AI models? My approach is to utilize interactive theorem provers like Lean. See my blog post for details: gasstationmanager.github.io/ai/2025/01/2...

#lean #leanprover #ai
Hallucination Detection and Recovery: Initial Experiment
(This also serves as progress report #3 of our overall program)
gasstationmanager.github.io
January 22, 2025 at 11:23 PM
github.com/GasStationMa...
Connects LLMs with a "code interpreter" for Lean. Here's a proof Sonnet did in a recent chat session.
December 17, 2024 at 6:13 AM
Once your model reach the top and you attract enough funding to acquire compute, then it is the time to modify your approach to make sure it scales. That is by no means easy, but I believe it is the best response. (2/2)
Here's my hot take on the bitter lesson. It is very likely true, but it only holds in the long term. That's why OpenAI people love to recommend it, because following it ensures whoever has the most compute dominates. And that is why for anyone else, now is the time to try novel ideas. (1/2)
#ai
Thoughts on Richard Sutton's 'bitter lesson'?

"... the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds..."

Okay (maybe?) for engineering but not understanding.

www.incompleteideas.net/IncIdeas/Bit...
December 6, 2024 at 6:24 PM
Here's my hot take on the bitter lesson. It is very likely true, but it only holds in the long term. That's why OpenAI people love to recommend it, because following it ensures whoever has the most compute dominates. And that is why for anyone else, now is the time to try novel ideas. (1/2)
#ai
Thoughts on Richard Sutton's 'bitter lesson'?

"... the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds..."

Okay (maybe?) for engineering but not understanding.

www.incompleteideas.net/IncIdeas/Bit...
The Bitter Lesson
www.incompleteideas.net
December 6, 2024 at 6:18 PM
Just found out about this project. Very cool!
December 2, 2024 at 7:13 AM
Reposted by GasStationManager
GenChess : Google Labs. https://labs.google/genchess (fun to play with) #AI #images #Google
November 30, 2024 at 12:14 AM
github.com/GasStationMa...
My repository with proof of concept and some preliminary results on autoformalization of coding problems.
GitHub - GasStationManager/FormalizeWithTest: Autoformalization of coding problems, verified with test cases
Autoformalization of coding problems, verified with test cases - GasStationManager/FormalizeWithTest
github.com
November 29, 2024 at 3:34 AM
I have created a starter pack! Let me know if you want to be added.

go.bsky.app/GdZ2wqj
November 28, 2024 at 8:36 PM
Reposted by GasStationManager
November 19, 2024 at 9:42 PM
Any Lean enthusiasts here? You might be interested in Code with Proofs: the Arena www.codeproofarena.com:8000
Test your Lean 4 skills with our coding challenges!

#lean
www.codeproofarena.com
November 25, 2024 at 10:15 PM
My essay on making coding AIs safe and hallucination-free gasstationmanager.github.io/ai/2024/11/0...

#LLM #AISafety #blueskAI
A Proposal for Safe and Hallucination-free Coding AI
Update 11/19: First progress report.
gasstationmanager.github.io
November 24, 2024 at 2:07 PM