Stefan Grafberger
@stefan-grafberger.com
PhD Student at BIFOLD & TU Berlin, researching data management for ML. Previously worked with UvA, Microsoft GSL, Amazon Research, Oracle Labs, and others. https://stefan-grafberger.com
stefan-grafberger.com
I'm joining the SQL Data Types team in Berlin.

The VLDB demo paper:
"mlidea: Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'"
PDF: www.vldb.org/pvldb/vol18/...
Video demo: youtu.be/ePGm1J6S2qk
stefan-grafberger.com
Very excited to share that I've started as a Software Engineer at Snowflake! 🥳

I’m also wrapping up my PhD: this week I’m at VLDB in London to present the last demo paper from my time as a PhD student, and on September 17 I’ll defend my PhD in Amsterdam.

Really looking forward to this next chapter!
Reposted by Stefan Grafberger
stefan-grafberger.com
Our demo "mlidea: Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" was accepted at VLDB! 🥳

We demo suggestions for ML pipelines, similar to IntelliJ code inspections or Grammarly suggestions

youtu.be/ePGm1J6S2qk

Joint work w/ @mersault.bsky.social @p-groth.bsky.social
Reposted by Stefan Grafberger
deem-workshop.bsky.social
📢 Deadline extension for DEEM 2025 @sigmod2025.bsky.social!

Following requests, we're extending the submission deadline to April 1, 5pm Pacific Time. More info at: deem-workshop.github.io
DEEM: Workshop on Data Management for End-to-End Machine Learning @ ACM SIGMOD 2025
deem-workshop.github.io
stefan-grafberger.com
Our vision "Towards Regaining Control over Messy ML Pipelines" was accepted for the DAIS workshop at ICDE! 🥳

Initial experiments show LLMs are promising for extracting declarative query plans from messy ML code.

Joint work w/ @guangchen811.bsky.social @oovcharenko.bsky.social @mersault.bsky.social
stefan-grafberger.com
Please help spread the word by reposting!

We've just created the official DEEM Workshop account: @deem-workshop.bsky.social
deem-workshop.bsky.social
The Data Management for End-to-End Machine Learning workshop (@deem-workshop.bsky.social) will be back at #SIGMOD2025! ✨

🔗 Check out the CfP: deem-workshop.github.io
📝 Submission deadline: March 21
📢 Notifications: April 25

Join us for the 9th edition in Berlin!

#DEEM2025
sigmod2025.bsky.social
DEEM - The 9th Workshop on End-to-End Data Management is also co-located with SIGMOD/PODS 2025. The deadline for papers is March 21st. For more details, check out the website.
deem-workshop.github.io
Reposted by Stefan Grafberger
mersault.bsky.social
We have a **Postdoc opening** in Berlin on Responsible Data Engineering!

This is a fully-funded position with salary level E14 at the newly founded DEEM Lab, as part of @bifold.berlin .

Details available at deem.berlin#jobs-57624
Reposted by Stefan Grafberger
bifold.berlin
@stefan-grafberger.com, a Ph.D. student in the DEEM Lab at BIFOLD, is among the authors who presented the paper "Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One QO Rule Them All?" at #CIDR2025.

#QOaaS #CIDR

www.bifold.berlin/news-events/...
Reposted by Stefan Grafberger
mersault.bsky.social
Interested in a *PhD in Data Engineering* in Berlin? Our institute has several openings for PhD positions as part of its graduate school, see the post below!

And check out the following page for details on how to work with the DEEM Lab as part of the graduate school deem.berlin#jobs-189196
stefan-grafberger.com
Our CIDR'25 paper "Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One QO Rule Them All?" is now on ArXiv! Excited to have been a part of this project during my internship at Microsoft GSL!

arxiv.org/pdf/2411.13704
Reposted by Stefan Grafberger
mersault.bsky.social
Pls repost:

We, the DEEM Lab at TU Berlin, are hiring a postdoctoral researcher in data engineering for machine learning. Details available at:

deem.berlin#jobs-57624

This fully-funded position is part of the Berlin Institute for the Foundations of Learning and Data (BIFOLD).

#databs #datasky