Paddy Mullen
@paddymullen.bsky.social
90 followers 550 following 110 posts
Boston/Newport. Python/PyData/Jupyter dev. Building the Buckaroo widget to enhance the DataFrame viewing experience in Jupyter https://github.com/paddymul/buckaroo
@marcogorelli.bsky.social did excellent work on the Rust plugin tutorial. The cookiecutter worked and came with an impressive CI setup that runs against macOS, Windows, Linux, and multiple Python versions.
I wrote my first Rust code and Polars extension - pl_series_hash. It runs xxHash over an entire series to get a single u64 hash. It works on nearly all Polars data types, including nested structs, and it's fast. Will be very useful for caching summary stats. github.com/paddymul/pl_...
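The post doesn't show the plugin's actual API, so here is only a rough pure-Python stand-in for the idea (assuming the xxhash package is available; the function names and the summary-stat cache are illustrative, and the Rust plugin hashes the data directly and much faster):

```python
# Pure-Python stand-in for the idea behind pl_series_hash, not its real API.
# Assumes the xxhash package is installed.
import io

import polars as pl
import xxhash

_summary_cache: dict[int, dict] = {}

def series_digest(s: pl.Series) -> int:
    """Reduce an entire Series (nested structs included) to a single u64."""
    buf = io.BytesIO()
    s.to_frame().write_ipc(buf)  # serialize the column to Arrow IPC bytes
    return xxhash.xxh64(buf.getvalue()).intdigest()

def cached_summary(s: pl.Series) -> dict:
    """Key summary stats on the column's digest so repeat views are free."""
    key = series_digest(s)
    if key not in _summary_cache:
        # compute whatever summary stats are needed; cheap placeholders here
        _summary_cache[key] = {"len": s.len(), "null_count": s.null_count()}
    return _summary_cache[key]
```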
Providence RI still has a large jewelry making industry. Lots of small shops around there. Also, look up CNC tool dealers like Method Machine Tools; give them a call and ask who they would recommend from their customers.
BTW I'm mortified by the preview image that PyData or YouTube chose. It was a live screen recording and I had to tab between multiple windows where Buckaroo runs (VSCode, Jupyter, Google Colab, Marimo).
Getting the script plumbed into consult-mode was a bear. Customizing consult-mode requires returning a builder function that returns another function, which consult then calls. None of the args are documented. #emacs
After a bunch of work with prot, I got my custom find-grep script plumbed into Emacs.

It searches a Python project for matches in preferred files first (py, js, and ts extensions preferred; site-packages and node_modules excluded). Then it runs a much more comprehensive search of those excluded directories.
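Roughly, the two-pass strategy looks like this (a Python sketch of the idea, assuming ripgrep is on the PATH; the flags, file-type list, and excluded directories are stand-ins, not the actual script):

```python
#!/usr/bin/env python
"""Two-pass project search: preferred source files first, then the noisy rest.
Sketch of the idea only -- assumes ripgrep (rg) is installed."""
import subprocess
import sys

EXCLUDED_DIRS = ["site-packages", "node_modules"]
PREFERRED_TYPES = ["py", "js", "ts"]  # rg --type names

def search(pattern: str, root: str = ".") -> None:
    # Pass 1: quick search of preferred file types, skipping vendored dirs.
    fast = ["rg", "--line-number", pattern, root]
    for t in PREFERRED_TYPES:
        fast += ["--type", t]
    for d in EXCLUDED_DIRS:
        fast += ["--glob", f"!**/{d}/**"]
    subprocess.run(fast)

    # Pass 2: slower, comprehensive sweep of only the excluded directories.
    slow = ["rg", "--line-number", "--no-ignore", pattern, root]
    for d in EXCLUDED_DIRS:
        slow += ["--glob", f"**/{d}/**"]
    subprocess.run(slow)

if __name__ == "__main__":
    search(sys.argv[1])
```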
Nope, somehow my new enum_dataframe 3-5x'd the memory usage in Python. Moral of the story is that Polars and Parquet have some seriously impressive engineering behind them.
After some more work, I outputted the entire dataframe, original columns + sparse where necessary. Then I looked at the size: a couple of megs more than the original parquet... No big deal, my Python memory usage should be less without all of those strings, right?
sparse values. When I changed to an enum per column, the file went down to 50 MB. This was suspiciously low. I realized my new dataframe only included sparse columns, not regular columns + sparse columns.
UGH, I thought I was so slick. I had a nice function that used polars to convert a csv to parquet with enum columns for sparsely populated columns. I thought it saved about 30% on a 700 meg parquet file (10G csv). Then I dug some more and found out that I was encoding all possible sparse value
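The fixed version of that conversion looks roughly like this (a sketch, not Buckaroo's actual function; the "sparsely populated" heuristic and threshold are made up), the key points being one enum per column, built only from the values that actually occur in it, and keeping all the other columns of the frame:

```python
# Rough sketch of the csv -> parquet-with-enums conversion (not the actual
# code); the low-cardinality heuristic and threshold are made up.
import polars as pl

def csv_to_parquet_with_enums(csv_path: str, parquet_path: str,
                              max_unique: int = 100) -> pl.DataFrame:
    df = pl.read_csv(csv_path)
    casts = []
    for name in df.columns:
        s = df[name]
        if s.dtype == pl.String and s.n_unique() <= max_unique:
            # One enum per column, built only from values that occur in *this*
            # column -- the earlier bug was one giant enum of every possible
            # sparse value shared across columns.
            categories = s.drop_nulls().unique().sort().to_list()
            casts.append(pl.col(name).cast(pl.Enum(categories)))
    # Keep every original column; only the low-cardinality strings are recast.
    enum_df = df.with_columns(casts) if casts else df
    enum_df.write_parquet(parquet_path)
    return enum_df
```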
Some content is on YouTube and a few other places (heavy equipment, dirt bikes...). But I realized I have generally been searching YouTube for more topics because I trust the results more than SEO spam... And this is after I switched to DDG. Obviously there are shills on YT, but it beats listicles.
Link? What is that site?
I need to write more. I have basically no online following. I have gotten better distribution through Medium. I do try to set my articles to not be behind a paywall.
There is some good stuff on Facebook. Fun post from the mainframers group about an IBM System/370 card for the IBM PS/2. www.facebook.com/share/p/17tx...
Some people may ask, "What is a mainframe?" Well, the answer varies, everything from the multi-ton beasts to the smaller versions. How small? May I present one of the mainframes I used to operate/administer, an IBM P/370. That's a complete IBM S/370 processor, including a full 16M of memory, on a single MCA card, which fit into a PS/2 model 95 system, and which ran VM/SP 5.
Once, just for grins and giggles, I IPLed a second level VM system.  Then, for more grins and giggles, I IPLed a third level VM system.  But, it drove me crazy trying to remember which prompt to use for which system.  Still, it demonstrated how thorough of an architecture implementation the card was. 
https://en.wikipedia.org/.../PC-based_IBM_mainframe...
It was great having my own personal mainframe, which I used for product development and testing.  It allowed me to do test installs of software on a real mainframe.  
Anyway, since some of y'all are posting really great pictures of machine rooms and big mainframes, I thought I'd offer a glimpse of what may be one of the smallest mainframes.  🙂
Oh, yeah, there was a follow-on product, the P/390, which was a S/390 system on a card. I lusted after one for many years, but was never able to justify it. And there were a couple of preceding products: the 7437, which was a full S/370 in a box about the size of a PS/2 model 95. There was also the XT/370, which was a PC/XT with a card-set in it which was a partial implementation of the S/370 instruction set (one of which I happen to personally own).
I was the team leader for the 7437 project and designed most of the CPU. It used an IBM bit slice (similar to the AMD 2900), and an IBM chip called FLAINE which implemented the floating point instructions. The rest of the CPU was built using mostly 7400F and 7400LS chips. The CPU had a writeable control store of 8K words of 96 bits each. Initially, we implemented dynamic address translation in 7400 logic, but then one of our engineers, Tak Ng, designed a CMOS chip to do the translation, and that went into the product-level release. The chip replaced a whole card's worth of 7400 logic.
The 7437 consisted of 6 cards, each approximately 9x12": the Instruction Unit, Execution Unit, Control Store, PS/2 Interface, and two 8 MB memory cards. Below is a photo of the writeable control store card. It's the only piece of hardware I still have.
The XT/370 was based on a Motorola 68000 that IBM paid Motorola to modify to execute 370 instructions. The chip could only run CMS and not CP, as it did not support supervisor state. Instead, CP functions were provided by x86 code running on the PC.
The P/370 and the P/390 chips were also designed by Tak Ng.
The 7437, P/370 and P/390 were all done in Bill Beausoleil's IBM Fellow department by a team of 6 to 10 engineers and programmers.
Bill wanted IBM to sell the 7437 for about $5,000. But that would have seriously broken IBM's pricing model vs. the cost/performance of traditional mainframes like the 4341, which sold for something like $100,000. VM wanted to charge something like $100K for a 7437 license.
Upper level executives were frightened that customers would buy a slew of 7437s instead of a 3090, costing IBM millions in profit. They swore that they would never let the 7437 go out the door if it meant losing the sale of even one 3090. In the end, they severely restricted who could buy a 7437 by bundling it with a 5080 graphics workstation and pricing it at $50,000.
On the toughest, most annoying problems I need to put in a half hour to an hour a day and let ideas percolate. But I have to actually do that work; it's much more effective than hours of grinding. I'm thinking about devops stuff especially.
You hear those stories about how garbage collection on Lisp machines in the 80s didn't really work, and engineers just restarted the machine once a day. Then I realized I do the same thing with Firefox/Chrome. Need to upgrade to a 32+ GB laptop.
I want larger wheels because I think they’ll help with surface roughness a lot. Something about rollerblading bothers my knees in a way that skiing and ice skating don’t.
There are constructions that are very useful that I know I would have avoided because they make the type checker’s life difficult.
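For example (an illustrative case, not actual Buckaroo code): a function whose return type depends on a flag's value is trivial to write untyped, but needs typing.overload and Literal to keep the checker happy.

```python
# Illustrative only -- the kind of construction that's handy untyped but
# takes extra ceremony (overloads + Literal) to satisfy a type checker.
from typing import Literal, overload

import polars as pl

@overload
def load_table(path: str, lazy: Literal[True]) -> pl.LazyFrame: ...
@overload
def load_table(path: str, lazy: Literal[False] = ...) -> pl.DataFrame: ...

def load_table(path: str, lazy: bool = False) -> pl.DataFrame | pl.LazyFrame:
    # The return type depends on the runtime value of `lazy`.
    return pl.scan_parquet(path) if lazy else pl.read_parquet(path)
```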

This is a very bad habit to get into. I don’t know how to generally fix it. Getting more comfortable with typing will help.
I think the typing helps a bit, but I really worry about the code that I would write if I started out writing typed Python. There are things I’m doing which are complex to type. The urge and incentives are to make it typed first and useful second.
It’s more a matter of seeing the syntax highlighting and wanting to fix it than wanting my code typed.
I have started heavily typing my python code. I’m late to the party on this one, I know, but here are some thoughts.

I was prompted by finally getting eglot mode working with basedpyright in Emacs.
All of this will be part of my article "so you want to serialize a dataframe to JS", which is a subsection (probably the largest) of "so you want to write a table viewer".