Kirsten Lum
machsci.bsky.social
🌲🌲 Applied ML/AI, data science, MLOps | Wife of 1, mom of 2 | Co-Founder and CTO of http://storytellers.ai

python 🐍 AI 🤖 cloud ☁️ data 📊

I also talk about Jesus here: @itskirstenlum.bsky.social
Faaaaascinating. Looking forward to your analysis of the results. Crossing my fingers for a Simpson’s paradox
May 25, 2025 at 1:25 PM
Those checks are run in-environment (rather than passing raw data to a service). That part was hard
February 11, 2025 at 6:21 PM
Yes! We developed a way to perform statistical and natural language checks to identify joinable columns in real-world (that is, fubared) data. Where brute-forcing it would cost ~$100ks and weeks of runtime, we do it for ~$10 - $100s in minutes or hours
February 11, 2025 at 6:19 PM
Yes, it is that, and it is beyond that! It works even in cases where your pk/fk are malformed: mismatched types, mismatched column names, prefixes/suffixes, etc.
February 11, 2025 at 6:11 PM
Cube gets it. We SHOULD be able to build a semantic layer in the data platform but we can’t (not one that actually helps the analytics workflow anyway). Thus these tools that fill the gap!
February 11, 2025 at 5:51 PM
I think it’s easy for math folks because math operates logically — but what I think they miss is reality operates logically too!
February 11, 2025 at 5:49 PM
Well, grain of salt: I managed to get out of 100% of math classes in my undergrad. But even among math-heavy degree havers like engineers, when they’d ask me how I was able to understand/convince across disciplines/levels, I’d tell them to read the textbook from my logic class!
February 11, 2025 at 5:48 PM
Ah yes, going the other way! Uhhhh I’m going to make a mental note that we could probably reverse this process 🤔🤔
February 11, 2025 at 5:12 PM
This exact impulse was what inspired this tool: bsky.app/profile/mach...
February 11, 2025 at 3:51 PM
This dashboard could have been a Google sheet
February 11, 2025 at 3:41 PM
Formal logic was the most useful course I took in college. Hard to explain that there is a style of thinking that helps you to quickly and precisely understand and explain what’s going on in any situation.
February 11, 2025 at 2:56 PM
A much more hopeful picture!
February 1, 2025 at 3:17 PM
One of my professional “worry stones” is that the orgs who were able to set up data infrastructure tend to be bigger for-profits. If AI is revolutionary, that means orgs in education, non-profits, etc. are left behind. It still takes way too long to set up a basic DW. Wish I knew the solution
February 1, 2025 at 2:51 PM
*rolls up sleeves* on it!
January 31, 2025 at 7:52 PM
Finally able to prove I haven’t been just complaining this whole time!! 😭
January 31, 2025 at 7:48 PM
And not a dumb question 😊😊
January 31, 2025 at 7:45 PM
What’s the downside to doing both of these in SQL?
January 28, 2025 at 5:12 PM
Reposted by Kirsten Lum
Trying to make each and every transform efficient in the micro leads to inefficiencies in maintenance in the macro
January 27, 2025 at 9:46 PM
That is, by default, do the transform in SQL, and only think about whether to do it in the database or the client if you run into a blocker doing it in SQL.

No need to litigate every transform — scarcity mentality in an era of compute riches
January 27, 2025 at 9:49 PM
Does using the reports the service provides count? Like the GA dashboards inside GA? Or are you thinking more like grabbing some SQL/Tableau templates to run on the data in the DW?
January 22, 2025 at 4:42 PM
Pivots 🥲🥲
January 22, 2025 at 4:43 AM
And isn’t it wild that on the receiving side of the message, appreciation is one of the greatest gifts?

Feels small to give, but immense to receive
January 21, 2025 at 6:34 PM
Sorry, I literally just saw your name got autocorrected @vickiboykis.com. I assume this is a similar experience to when people call me Kristin, and I am proportionately appalled/apologetic.
January 20, 2025 at 3:41 PM