di
di-web.bsky.social
Python developer, Data and AI engineer
Reposted by di
AI Boosters: Don’t worry, AI will create LOTS of jobs!

The jobs:
September 8, 2025 at 4:56 PM
Since that wasn’t an option here, I just specified the output format directly in the prompt, and it turned out to be easier to validate the output manually in Obsidian based on the filename.

the script source code: github.com/dmitriiweb/r...
GitHub - dmitriiweb/recipes-scanner: Convert cookbook PDFs (one recipe per page) into clean Markdown with a local vision LLM (Ollama + pydantic-ai). uv-powered CLI.
August 17, 2025 at 8:56 PM
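The prompt-only approach from the post above could look roughly like the sketch below. The prompt wording, the section names, and the `missing_sections` helper are all assumptions made for illustration, not taken from the linked script. The helper only flags pages that lack an expected heading, so they can be picked out for the manual review the post mentions.

```python
import re

# Hypothetical section names; the real script may ask for a different layout.
REQUIRED_SECTIONS = ("Ingredients", "Steps")

# Hypothetical prompt: the output format is spelled out in plain text
# because the model cannot be constrained through tool calls.
PROMPT = """Extract the recipe from this page and answer with Markdown only.
Use exactly this layout:

# <recipe title>

## Ingredients

- <one ingredient per bullet>

## Steps

1. <one step per numbered item>
"""


def missing_sections(markdown: str) -> list[str]:
    """Return the required level-2 headings that are absent from the output."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##\s+(.+)$", markdown, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
```

An empty list means the page has every expected heading. It says nothing about whether the recipe text itself is right, so a manual pass is still needed.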
At first, I planned to use gemma3 for image recognition, but it doesn’t support tools, so I had to wrestle with the prompt, and even then some manual cleanup is required. If the model supports tools, you can define a pydantic.BaseModel in pydantic-ai with the required fields and types (!) ->
August 17, 2025 at 8:56 PM
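For a model that does support tools, the typed schema mentioned in the post above might look like the sketch below. It assumes pydantic v2. The `Recipe` fields and both helper functions are hypothetical, not copied from the script. The pydantic-ai agent wiring is left out on purpose, since the parameter for the structured output type has changed name between pydantic-ai releases. Only the schema and its validation are shown.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Recipe(BaseModel):
    """Hypothetical schema: field names are illustrative only."""

    title: str
    ingredients: list[str]
    steps: list[str]
    servings: Optional[int] = None


def parse_recipe(raw_json: str) -> Optional[Recipe]:
    """Validate model output against the schema; return None if it does not fit."""
    try:
        return Recipe.model_validate_json(raw_json)
    except ValidationError:
        return None


def to_markdown(recipe: Recipe) -> str:
    """Render a validated recipe as Markdown."""
    lines = [f"# {recipe.title}", ""]
    if recipe.servings is not None:
        lines += [f"Servings: {recipe.servings}", ""]
    lines += ["## Ingredients", ""]
    lines += [f"- {item}" for item in recipe.ingredients]
    lines += ["", "## Steps", ""]
    lines += [f"{number}. {step}" for number, step in enumerate(recipe.steps, start=1)]
    return "\n".join(lines) + "\n"
```

With a schema like this, type errors surface as a `ValidationError` instead of malformed Markdown that has to be cleaned up by hand.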
So I wrote a small script that converts all PDFs into images, then passes each image through a multimodal model, and saves the result as Markdown with proper formatting ->
August 17, 2025 at 8:55 PM
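The three steps in the post above (PDF to images, image to model, model output to Markdown) can be sketched as a small loop. This is only an outline under stated assumptions: `convert_pdfs`, `render_pages`, and `describe_image` are hypothetical names. The rendering and model calls are passed in as plain callables so the sketch does not guess at the actual library calls. The one-file-per-page naming scheme is likewise an assumption.

```python
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical callables standing in for the real PDF renderer and vision model.
RenderPages = Callable[[Path], Iterable[bytes]]  # PDF path -> one image per page
DescribeImage = Callable[[bytes], str]           # page image -> Markdown text


def convert_pdfs(
    pdf_dir: Path,
    out_dir: Path,
    render_pages: RenderPages,
    describe_image: DescribeImage,
) -> list[Path]:
    """Write one Markdown file per PDF page and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        for page_no, image in enumerate(render_pages(pdf_path), start=1):
            markdown = describe_image(image)
            # Assumed naming scheme: <pdf name>-p<page number>.md
            target = out_dir / f"{pdf_path.stem}-p{page_no:03d}.md"
            target.write_text(markdown.strip() + "\n", encoding="utf-8")
            written.append(target)
    return written
```

Keeping the renderer and the model call behind callables also makes the loop easy to try out with stand-in functions before pointing it at a real model.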