Camilla Montonen
@spimescape.bsky.social
660 followers 610 following 460 posts
Building recommender systems @ Consumer Tech Co
I am biased, because I mainly use Python, but the BigQuery SDK is pretty nice: you can read from SQLite with Python and ingest rows in bulk using the SDK.
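A minimal sketch of that flow, assuming the google-cloud-bigquery package and a destination table that already exists (the database, table, and column names here are made up):

    import sqlite3
    from google.cloud import bigquery

    # Read rows out of a local SQLite database as dicts.
    conn = sqlite3.connect("local.db")
    conn.row_factory = sqlite3.Row
    rows = [dict(r) for r in conn.execute("SELECT id, name FROM events")]

    # Bulk-ingest via the streaming insert API; returns per-row errors.
    client = bigquery.Client()
    errors = client.insert_rows_json("my-project.my_dataset.events", rows)
    if errors:
        raise RuntimeError(f"some rows failed to insert: {errors}")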
This is how I learnt about GPUs with basically no background knowledge.
this is the way to learn with ai. you can start anywhere and backfill. you have to remain curious and careful and not just passively eat up plausible explanations. but if you put in the effort and the model is good, it’s powerful
The old path: Learn Lean (6 weeks), study abstract algebra (8 weeks), understand group theory (4 weeks), finally attempt your proof.

The new path: Start with your exact problem. Generate a tutorial for it. Backfill concepts as you hit them.
Reposted by Camilla Montonen
modern LLM inference engines like vLLM & SGLang are becoming tough to dive into. to learn how these inference engines work, nano-vllm is a fantastic educational project—complete PagedAttention & LLM scheduler in <1k LOC.🤯
flaneur2020.github.io/posts/2025-1...
A Walkthrough of nano-vllm | Flaneur2020
Recently, I've been delving into the architecture of production-grade inference engines. While projects like vLLM and SGLang are crazy sophisticated, …
flaneur2020.github.io
TIL that junk journaling is a thing!
Microblogging like it's the 2010s, with vibes.
I was trying to research how prevalent remote Jupyter kernels are, but could only find a few open source projects (e.g. something called Kernel Gateway - anyone using it?)
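For what it's worth, Kernel Gateway's basic usage looks roughly like this (a sketch, not something I've battle-tested; the host name is made up):

    # on the machine that should host the kernels:
    jupyter kernelgateway --ip=0.0.0.0 --port=8888
    # on your local machine, point the notebook server at it:
    jupyter notebook --gateway-url=http://remote-host:8888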
I have a very poor understanding of concurrency in general and more specifically a poor understanding of the kinds of Heisenbugs that will now be foisted on potentially unsuspecting Python users.
Me after hearing that Python 3.14 ships an officially supported free-threaded build without the GIL:

"ah, finally increased throughput of pulling data from BigQuery into my Jupyter notebooks."

Also me: "ah, a new footgun to add to my repertoire"
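A toy example of the kind of race I mean: an unsynchronized read-modify-write on shared state, which true parallelism makes much easier to actually hit (numbers illustrative):

    import threading

    counter = 0

    def bump(n):
        global counter
        for _ in range(n):
            counter += 1  # not atomic: load, add, store can interleave

    threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # With threads running in parallel, updates get lost and this
    # often prints something less than 400000.
    print(counter)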
This means that if you max out memory, say by loading a dataset that's larger than what you have capacity for, you can crash not just the kernel but the notebook server alongside it, and potentially lose any code changes in your notebook that hadn't been written to disk.
One of the design issues with Jupyter notebooks when it comes to heavy ML workloads is that the notebook server runs by default on the same machine as the kernel that executes the code.
NVIDIA seems to invest a lot of engineering effort into making higher-level libraries for writing efficient GPU code, and yet everyone is flexing by rolling their own CUDA kernels.
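Case in point, NVIDIA's Warp lets you write a GPU kernel in plain Python (a minimal sketch from memory, so treat the details as approximate):

    import warp as wp

    wp.init()

    @wp.kernel
    def scale(a: wp.array(dtype=float), s: float):
        i = wp.tid()          # one thread per array element
        a[i] = a[i] * s

    a = wp.array([1.0, 2.0, 3.0], dtype=float)
    wp.launch(scale, dim=3, inputs=[a, 0.5])
    print(a.numpy())          # [0.5, 1.0, 1.5]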
I was looking for a solution to "migrate a container that is close to OOM" onto another node and found CRIU.

Still a bit unclear if it's supported on Google's GKE or not.
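For reference, bare CRIU checkpoint/restore looks something like this (a sketch; the k8s/GKE integration story is exactly the part I can't vouch for):

    # checkpoint a process tree to disk
    criu dump --tree <pid> --images-dir /tmp/ckpt --shell-job
    # ...copy /tmp/ckpt to the target node, then:
    criu restore --images-dir /tmp/ckpt --shell-job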
Container image experts - is it possible to manually create a new layer by manipulating the files in the tar archive you get after running docker image save?
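My rough understanding of what that would involve, assuming the legacy docker save layout (an untested sketch, happy to be corrected):

    mkdir -p img newlayer
    tar -xf image.tar -C img
    # build a tar of the files the new layer should add
    tar -cf newlayer/layer.tar extra-files/
    mv newlayer img/
    # then edit img/manifest.json to append "newlayer/layer.tar" to the
    # Layers array, and append that tar's sha256 to rootfs.diff_ids in the
    # image config JSON (docker load recomputes and checks these).
    tar -C img -cf patched.tar .
    docker load -i patched.tar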
One of the major design flaws of many notebook environments like Jupyter is that the kernel that does computations is not separate from the machine that runs the notebook server itself.
I am honestly curious to learn what the concrete end product in this vision/plan actually is.
When folks say they are going to build AGI - what exactly does that look like?
After my inevitable rejection, I asked for feedback from the interview and I wish I hadn't, because the letter that was sent stated that I simply did not have the abilities or talent to become a scientist.

I can laugh about it now, but at 15 and aspiring to become a chemist, this was devastating.
I suppose one could argue that if I was really into chemistry, I might have come across this knowledge myself in my extracurricular studying, but the interview questions can be pretty much anything and the field is huuuuuge!
I was baffled, because even though I was at a good local school, this kind of knowledge was in the first-year university curriculum in my country. Only later did I find out that many top schools that send students to top research unis have years of interview prep.
Many years ago I was in this position and got an interview to study a science course at one of the top unis. When I went to the interview, I was given a picture of a molecule and asked to sketch a graph of the signals this molecule would give if run through a particular type of spectroscopy.
Many k8s benefits have to do with horizontal scaling: with many services you can deal with increased load by adding more replicas. The same goes for ML inference, but ML training doesn't scale horizontally in the same way.
I've probably ranted about this elsewhere, but for many ML teams the container image is the wrong unit of abstraction.

ML containers can be truly gargantuan in size, and restarting them is not as cheap as it is for the more lightweight containers of, for example, web services.
For example, a question we can answer with napkin math is how much of the model weights or data we could dump to disk in the x-second grace period that k8s gives to terminating containers.
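The napkin math itself, with illustrative numbers (the 30 s default grace period is real; the write bandwidth and model size are assumptions you'd want to measure for your setup):

    grace_period_s = 30        # k8s default terminationGracePeriodSeconds
    disk_write_gb_s = 1.0      # assume ~1 GB/s sequential write to local SSD
    model_size_gb = 140        # e.g. a 70B-parameter model in fp16

    writable_gb = grace_period_s * disk_write_gb_s
    print(f"can flush ~{writable_gb:.0f} GB of {model_size_gb} GB "
          f"({writable_gb / model_size_gb:.0%}) before SIGKILL")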