David Gasquez
@davidgasquez.com
Data @ Protocol Labs.

Open Data, Open Source, Open Protocols.

Walks taker. Progressive Metal enjoyer.

davidgasquez.com
I've been using this pattern to "specialize" Codex for vaguely defined tasks like classification, filtering, soft sorting, ...

davidgasquez.com/specializing...

Made more than 10,000 invocations so far (reusing my ChatGPT subscription) and am really happy with the pattern!
October 22, 2025 at 6:38 PM
Asked Claude to move my handbook notes from Obsidian to my Astro website. Very happy with the results!

davidgasquez.com/handbook/dat...

I'll keep the Obsidian Publish version around while I work on making it look as good.
July 23, 2025 at 10:23 AM
The end dataset also serves as a much smoother API!

- No key required
- Batch export friendly
- Explorable in the browser
July 21, 2025 at 1:58 PM
Updated linkweaver (custom notes & research helper) with a couple of new tricks.

1. Saves the URL content locally
2. Accepts any command to preprocess the generated Markdown. E.g: `llm "Clean the following Markdown"`.

github.com/davidgasquez...

Want to try it? `uvx linkweaver` and enjoy! 😀
July 16, 2025 at 3:18 PM
Now making the agents in the tool choose the final name of the tool using the tool itself. 😜

github.com/davidgasquez...
July 7, 2025 at 10:01 AM
Finally had some time to publish a vibecoded tool I (and Claude Code) built to explore PydanticAI.

Replicates a contest where jurors evaluate candidates in pairwise comparisons that get turned into a leaderboard.

github.com/davidgasquez...
July 5, 2025 at 3:34 PM
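A minimal sketch of how pairwise juror decisions could be turned into a leaderboard. The post doesn't say which scoring method the tool uses, so Elo updates are swapped in here as an assumption; `elo_leaderboard`, `k`, and `base` are hypothetical names and not taken from the repository.

```python
from collections import defaultdict


def elo_leaderboard(comparisons, k=32.0, base=1000.0):
    """Rank candidates from (winner, loser) pairs using Elo updates.

    Assumption: Elo is only one possible way to aggregate pairwise
    comparisons; the original tool may use a different method.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in comparisons:
        # Expected score of the winner given the current ratings.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    # Highest rating first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    votes = [("a", "b"), ("a", "c"), ("b", "c")]
    for name, rating in elo_leaderboard(votes):
        print(f"{name}: {rating:.1f}")
```

Elo depends on the order of the comparisons, so shuffling the votes or averaging over several passes might give a more stable ranking.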
Spent some time going through @cameron.pfiffer.org's Comind project and @void.comind.network's public code and had a lot of fun learning about them!

- comind.stream
- tangled.sh/@cameron.pfi...

Love Comind's components idea! Clean and simple.
July 4, 2025 at 9:24 AM
Welcome to the dark side! We have the AUR and, eventually, a few interesting system updates! 😅

This is what my current setup (github.com/davidgasquez...) looks like after 9 years of tinkering.
July 2, 2025 at 10:39 AM
I'm also publishing a Parquet file joining all the datasets' metadata. Use it from your favorite tool! 💃

sql-workbench.com#queries=v0,S...
June 16, 2025 at 7:08 PM
Alright, this is a job for a "research" template + fragments + chain.

Is the `llm prompt` doing the same thing as model.chain() in Python?
June 2, 2025 at 3:17 PM
Ended up writing a small CLI that you can point to a Markdown note and get an "expanded" view into it.

It reads all the links, makes them Markdown and returns all of them under different XML tags for your favorite LLM to consume!

github.com/davidgasquez...
June 2, 2025 at 2:44 PM
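A rough sketch of the "expanded view" idea from the post above: collect the links in a Markdown note, turn each into Markdown, and wrap everything in XML tags. This is not the CLI's actual code. The `<note>` and `<document>` tag names, `expand_note`, and the injected `fetch_markdown` callable are all assumptions made for illustration, and only inline `[text](url)` links are handled.

```python
import re
from typing import Callable
from xml.sax.saxutils import quoteattr

# Matches inline Markdown links like [text](https://example.com).
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^\s)]+)\)")


def expand_note(note: str, fetch_markdown: Callable[[str], str]) -> str:
    """Return the note plus each linked page, wrapped in XML-style tags.

    `fetch_markdown` is a hypothetical callable that downloads a URL and
    returns its content as Markdown; it is injected so the sketch stays
    independent of any particular fetching or conversion library.
    """
    urls = []
    for url in LINK_RE.findall(note):
        if url not in urls:  # keep first-seen order, skip duplicates
            urls.append(url)

    parts = [f"<note>\n{note.strip()}\n</note>"]
    for url in urls:
        body = fetch_markdown(url).strip()
        parts.append(f"<document url={quoteattr(url)}>\n{body}\n</document>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = "Read [this](https://example.com/a) and [that](https://example.com/b)."
    print(expand_note(sample, lambda url: f"# Content of {url}"))
```

Passing the fetcher in as a function keeps the link parsing testable offline.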
Wrote a post about some things I've been doing to make my projects more LLM friendly. Spoiler alert: it makes the projects more human friendly too!

davidgasquez.com/llm-friendly...

Any other ideas or suggestions?
May 24, 2025 at 6:00 PM
It is! Found this plugin that does exactly that.

github.com/jmdaly/llm-g...

Found it with this GitHub query:

github.com/search?q=%22...

Seems they're adding a custom prompt though!
May 24, 2025 at 4:13 PM
Some INE tables are large!

If you want to explore the "Travels, overnight stays and average stay by main features of the trips" dataset, you'd have to download a 37GB CSV!

The compressed Parquet file is less than 10MB and can be queried from your browser! 💃

shell.duckdb.org#queries=v0,s...
March 20, 2025 at 12:56 PM
Exactly! Will try as soon as it's available. 😀
March 18, 2025 at 4:57 PM
Small update! Learned some things about INE's servers.

- Don't really support Range Requests.
- Don't officially support gzip Accept-Encoding, but it works in practice, shrinking the download sizes ~8 times.

That means we can loop through the compressed datasets and save them as Parquet files on GH Actions!
March 18, 2025 at 10:25 AM
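A small sketch of the gzip trick from the post above, using only the standard library: ask for a gzip-encoded response and decompress it only if the server actually sent one. The function names are hypothetical, the URL is left as a parameter rather than guessing an INE endpoint, and the Parquet conversion step is left out.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str) -> urllib.request.Request:
    """Build a request that advertises gzip support to the server."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Decompress the body only when the server says it is gzip-encoded."""
    if content_encoding and "gzip" in content_encoding.lower():
        return gzip.decompress(raw)
    return raw


def download(url: str) -> bytes:
    """Fetch a URL, transparently handling a gzip-compressed response.

    Note: this reads the whole body into memory, so it only suits
    small and medium tables; very large ones would need streaming.
    """
    with urllib.request.urlopen(build_request(url)) as response:
        return decode_body(response.read(), response.headers.get("Content-Encoding"))
```

Checking `Content-Encoding` before decompressing matters here because the server is not documented to support gzip, so some responses may still come back uncompressed.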
Not sure why, it opened for a brief second and then went white.

The URL is still sql-workbench.com#config={%22p... so it doesn't seem to be truncating anything.
March 18, 2025 at 7:54 AM
Wrote a post after spending one week using @zed.dev.

davidgasquez.com/trying-zed-e...

TLDR: Would love to switch once they work on a few extra features (Notebooks, Agent Mode, ...)
February 21, 2025 at 1:03 PM
What a great definition of Data Engineering! 😅

The only thing I'd tweak is removing the "large amounts of data" requirement. Moving small/medium sized datasets can be just as difficult/interesting!

ludic.mataroa.blog/blog/brainwa...
February 13, 2025 at 8:58 AM
A Python browser sandbox, by the @pydantic.dev folks!

github.com/pydantic/pyd...

Great to share small snippets of code (e.g: pydantic.run/store/c544b1...).
January 30, 2025 at 10:43 AM
Been binging that channel a lot lately! It is close to what I was looking for: ML stuff with modern tooling and interesting approaches!
January 29, 2025 at 2:57 PM
Created a small repository as a personal template for Datathons / ML Competitions.

Contains a bunch of interesting resources! 📚

github.com/davidgasquez...
January 27, 2025 at 10:23 AM
DataFusion is now the fastest on ClickBench when querying partitioned parquet files on a c6a.4xlarge!

benchmark.clickhouse.com#eyJzeXN0ZW0i...
January 16, 2025 at 8:46 AM
Rules are very simple. Basically the same things I'd do if I had only access to the duckdb CLI 😅
January 14, 2025 at 8:59 AM
Another that took a bit more.

I need to teach it about duckdb.org/2024/09/09/a....
January 14, 2025 at 8:57 AM