OpenDataAlex
opendataalex.bsky.social
All-around data nerd, building awesome data platforms. Talk data with me! Avid board/video gamer and role player.

He/him
It's that time of year again! All Things Open 2025 has kicked off. Let the nerding begin!
October 13, 2025 at 12:31 PM
Reposted by OpenDataAlex
Elderly and disabled people told Congress that cutting Medicaid could kill people and in response, they were arrested in their wheelchairs.

What the hell are we doing?
People in wheelchairs are getting arrested right now in the Russell Senate Office Building in DC. They showed up to tell Congress not to cut their Medicaid, because they can't afford health care without it. If you look closely you can see the zip ties on their hands. #WeWontGetOverLosingMedicaid
June 25, 2025 at 9:23 PM
Reposted by OpenDataAlex
🚀 Apache Hop 2.13 is out!
84 tickets, 8 contributors, 2 months of work — this release brings big wins for data workflows 💪
🔗 Blog post: www.know.bi/blog/our-blo...
👉 Work with us: www.know.bi

#datasky #databs #apachehop #etl #dataengineering #opensource #gcp #mysql
April 24, 2025 at 8:26 AM
Reposted by OpenDataAlex
Thinking of moving from #Pentaho to Apache Hop? You're modernizing, not just migrating.
🚀 Learn how to switch smart: www.know.bi/blog/our-blo...
Need help? We coach teams too 👉 www.know.bi/pricing

#apachehop #pentaho #migration #coaching #datasky #databs
7 key points to successfully upgrade from Pentaho to Apache Hop
Discover 7 essential tips for a seamless upgrade from Pentaho to Apache Hop, ensuring smooth data transitions, enhanced workflow management, and optimized data engineering processes.
www.know.bi
April 16, 2025 at 7:44 AM
Just because a tool/technology can do something doesn't mean it should. Sure, I can use a hammer on screws, but a screwdriver or drill would perform a lot better. It's definitely like that with a lot of data tooling, both foundational and esoteric. 1/ #databs
March 5, 2025 at 12:57 PM
Building data platforms is not just about tech. It's also not purely about data governance. It has to be a blend optimized for the use case and for the best performance of the data available. Even a basic RDBMS and a data dictionary in a spreadsheet can provide value, as long as it's the right fit. #databs
February 27, 2025 at 3:39 PM
Reposted by OpenDataAlex
The information you include in a data dictionary (a collection of names, definitions, and attributes about variables in a dataset) depends on your data and how you plan to use the document.

Some ideas of fields to consider including. 👇

Data dictionary template and example are here: osf.io/ynqcu
February 21, 2025 at 1:49 PM
It just goes to show: using data is easy, but determining its worth is quite difficult, especially when resources for data governance are not made a priority.

Data Is Very Valuable, Just Don't Ask Us To Measure It, Leaders Say - Slashdot m.slashdot.org/story/439061
Slashdot
m.slashdot.org
February 22, 2025 at 12:45 PM
Reposted by OpenDataAlex
We just launched a 16TB archive of every dataset that has been available on data.gov since November. This will be updated day by day as new datasets appear. It can be freely copied, and we're sharing the code behind it to help others make their own archives of data they depend on.
Announcing the Data.gov Archive | Library Innovation Lab
Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complet...
lil.law.harvard.edu
February 6, 2025 at 10:02 PM
Reposted by OpenDataAlex
The read 📖 that caused this stitch: open.substack.com/pub/dataprod...
The Data-Conscious Software Engineer
The Unicorn That Data Teams Actually Need
open.substack.com
February 6, 2025 at 6:05 PM
Reposted by OpenDataAlex
Not everyone is on Bluesky, so shoutout to @opendataalex.bsky.social for one of my absolute favourite interviews that's full of humour and hard-won insights:

www.datafold.com/data-migrati...
A Data Migration Is Never Just a Data Migration: Lessons from Alex Meadows
Alex Meadows shares insights on lift-and-shift vs. rearchitecting, data quality priorities, and the human factors that can make or break a data migration.
www.datafold.com
February 5, 2025 at 5:28 PM
Reposted by OpenDataAlex
"Who cares about classic ML when you can have your own AI assistant?" says your CEO #databs
February 1, 2025 at 4:05 PM
This is ridiculous and uncalled for. Years of research are going to be stalled because of this. Researchers are going to be unable to publish findings or even perform/complete their work because of politics.

gizmodo.com/cdc-ordered-...
CDC Ordered to Scrub Website of Words Like 'Transgender' and 'LGBT'
A CDC employee spoke with Gizmodo about the
gizmodo.com
February 3, 2025 at 12:56 PM
Reposted by OpenDataAlex
Tomorrow’s the big day! 🎉 Join us for Public Domain Day 2025 as we celebrate works from 1929 entering the #PublicDomain. It’s free for everyone to enjoy! 🎶🎬 🖋️
📅 Jan 22
🕙️ 10 AM PT/1 PM ET
📍 ONLINE
🎟️ REGISTER ➡️ https://www.eventbrite.com/e/1104135491979

#PublicDomainDay @InternetArchive
Singin' in the Public Domain: Public Domain Day 2025
On January 1, 2025, creative works from 1929 and sound recordings from 1924 will enter the public domain in the US. Celebrate with us!
www.eventbrite.com
January 21, 2025 at 3:00 PM
Reposted by OpenDataAlex
Raleigh Low-Key Data Happy Hour is back for more drinks, data, and fun in 2025! #databs
January Low-Key Data Happy Hour (Lynnwood Brewing), Thu, Jan 30, 2025, 6:00 PM | Meetup
**We're back for more drinks, data, and fun in 2025!** No presentations or pitches, just people who love data hanging out, having a drink and eating food. Show up when co
buff.ly
January 7, 2025 at 12:06 PM
Reposted by OpenDataAlex
Do you know how much CEO pay has skyrocketed since 1978?

100%? 500%?

Try 1,085%

Meanwhile, the $7.25/hr fed. minimum wage hasn't budged in 15 years and the tipped min. wage has been $2.13/hr since 1991.

This is what I mean when I say the system is rigged.
CEO Pay Has Risen 1,085% Since 1978, But for Workers? Just 24% | Common Dreams
CEOs at top US companies saw their pay skyrocket by 1,085% since 1978, while typical worker pay only increased by 24%
www.commondreams.org
December 23, 2024 at 11:00 PM
Reposted by OpenDataAlex
This made me think

Credit: Dr. Ordax on LinkedIn
December 19, 2024 at 10:48 PM
Reposted by OpenDataAlex
Data quality is not just a data engineering problem.
Data quality is not just a data engineering problem.
Data quality is not just a data engineering problem.
#dataBS
December 20, 2024 at 1:37 AM
Reposted by OpenDataAlex
🆕 Exciting news 🆕

PydanticAI: A new agent framework to build LLM Apps!

#datasky #mlsky #LLMs
December 7, 2024 at 10:02 PM
Reposted by OpenDataAlex
Excited that my book is on its way to production, with a new name!

The Well-Grounded Data Analyst: Solve real-world problems like a pro

It fits nicely with Manning's "Well-Grounded" series (there are developer versions for Python, Java, and Ruby).

Really looking forward to it being published now!
My book "The Well-Grounded Data Analyst" is out early 2025!

It is a collection of real-world data analysis projects to level up your skills and you can already get it in early access.

Use the code au35asb to get 35% off (on any Manning title, not just mine!)

www.manning.com/books/the-we...
The Well-Grounded Data Analyst
Complete eight data science projects that lock in important real world skills–along with a practical process you can use to learn any new technique quickly and efficiently. The Well-Grounded Data...
www.manning.com
November 27, 2024 at 8:20 AM
Reposted by OpenDataAlex
These incremental changes, week after week, by a relatively small but very motivated team of contributors are what makes open source work.
🚀 This Week at Apache Hop 🚀
- Fixed database schema generation & Regex evaluation.
- Docs: Regex Evaluation, Beam Bigtable Input, Kettle Import.
- Fixed transform selection, file movement (Windows), & dialog resizing.
📢 Join the conversation!
lnkd.in/eUETpiZv
#apachehop #dataengineering #opensource
November 22, 2024 at 9:23 AM
Reposted by OpenDataAlex
According to @chris.blue, analytics engineers and data engineers should be merged into the same team. Data pipelines are commoditized and analytics engineers don't provide enough value. materializedview.io/p/merge-anal...
It's Time to Merge Analytics and Data Engineering (Again)
Data pipelines are commoditized and analytics engineers don't provide enough value.
materializedview.io
November 19, 2024 at 10:31 PM
Distilling clean, quality data should be the default. If it's being used in your business or applications, take the time to do quality analysis and ensure it's fit for purpose. Put checks in place to make sure it remains of good quality. Otherwise it's a ticking time bomb. #databs
November 20, 2024 at 1:42 AM
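One way to picture the "checks" mentioned in the post above is a small set of named rules run against every record, with failures counted per rule. This is only a minimal sketch: the `Check` class, the `run_checks` function, the two rules, and the `id` and `amount` fields are all hypothetical, made up for illustration, and real rules would depend on what "fit for purpose" means for the data in question.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class Check:
    """A named rule that every record is expected to satisfy (hypothetical helper)."""
    name: str
    rule: Callable[[dict[str, Any]], bool]


def run_checks(records: Iterable[dict[str, Any]], checks: list[Check]) -> dict[str, int]:
    """Return the number of failing records for each check name."""
    failures = {check.name: 0 for check in checks}
    for record in records:
        for check in checks:
            if not check.rule(record):
                failures[check.name] += 1
    return failures


# Hypothetical rules for illustration only.
CHECKS = [
    Check("id_present", lambda r: r.get("id") is not None),
    Check(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

# Made-up sample records: one clean, one missing an id, one with a negative amount.
SAMPLE = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
]

if __name__ == "__main__":
    print(run_checks(SAMPLE, CHECKS))
```

Running something like this on a schedule, and alerting when any count rises above zero, is one way to notice quality drifting before it turns into the time bomb the post warns about.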
Reposted by OpenDataAlex
Okay, one of the things Twitter was really good for was sounding the alarm & driving calls to Congress when terrible legislation came up for a vote.

Bluesky, IT'S TIME. H.R. 9495 would allow Trump to shut down any nonprofit by claiming they're "terrorists"

www.fightforthefuture.org/actions/no-o...
November 19, 2024 at 10:21 PM