Lightnews — Scholar-powered news

Christopher Finlan

@cmfinlan.bsky.social

The Best Thing That Ever Happened to Your Spark Pipeline Is a SQL Database

Here's a counterintuitive claim: the most important announcement for Fabric Spark teams in early 2026 has nothing to do with Spark. It's a SQL database. Specifically, it's the rapid adoption of SQL database in Microsoft Fabric, a fully managed, SaaS-native transactional database that went GA in November 2025 and has been quietly reshaping how production data flows into lakehouse architectures ever since. If you're a data engineer running Spark workloads in Fabric, this changes more than you think.

The ETL Pipeline You Can Delete

Most Spark data engineers have a familiar pain point: getting operational data from transactional systems into the lakehouse.

christopherfinlan.com

February 13, 2026 at 3:15 PM

Christopher Finlan

@cmfinlan.bsky.social

Monitoring Spark Jobs in Real Time in Microsoft Fabric

If Spark performance work is surgery, monitoring is your live telemetry. Microsoft Fabric gives you multiple monitoring entry points for Spark workloads: Monitor hub for cross-item visibility, item Recent runs for focused context, and application detail pages for deep investigation. This post is a practical playbook for using those together.

Why this matters

When a notebook or Spark job definition slows down, "run it again" is the most expensive way to debug. Real-time monitoring helps you: spot bottlenecks while jobs are still running; isolate failures quickly; compare behavior across submitters and workspaces…

christopherfinlan.com

February 12, 2026 at 3:37 PM

Christopher Finlan

@cmfinlan.bsky.social

Running OpenClaw in Production: Reliability, Alerts, and Runbooks That Actually Work

Agents are fun when they’re clever. They’re useful when they’re boring. If you’re running OpenClaw as an always-on assistant (cron jobs, health checks, publishing pipelines, internal dashboards), the failure mode isn’t usually “it breaks once.” It’s that it flakes intermittently and you can’t tell whether the problem is upstream, your network, your config, or the agent. This post is the operational playbook that moved my setup from “cool demo” to “production-ish”: fewer false alarms, faster debugging, clearer artifacts, and tighter cost control.

The production baseline (don’t skip this)

Before you add features, lock the boring stuff:

christopherfinlan.com

February 11, 2026 at 3:15 PM

Christopher Finlan

@cmfinlan.bsky.social

Lakehouse Table Optimization: VACUUM, OPTIMIZE, and Z-ORDER

If your Lakehouse tables are getting slower (or more expensive) over time, it’s often not "Spark is slow." It’s usually table layout drift: too many small files, suboptimal clustering, and old files piling up. In Fabric Lakehouse, the three table-maintenance levers you’ll reach for most are: OPTIMIZE: compacts many small files into fewer, larger files (and can apply clustering)
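To make the "many small files into fewer, larger files" idea concrete, here is a minimal pure-Python sketch of compaction planning as greedy bin packing. It is only an illustration of the concept, not how Fabric or Delta Lake implements OPTIMIZE; the function name `plan_compaction`, the 128 MB target, and the sample file sizes are all assumptions made up for this example.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily group files into bins of at most roughly target_mb each.

    Hypothetical helper for illustration only; target_mb is an arbitrary
    assumption, not a Fabric or Delta Lake default.
    """
    bins, current, current_size = [], [], 0
    # Place the largest files first so big files end up in their own bins.
    for size in sorted(file_sizes_mb, reverse=True):
        # Close the current bin if adding this file would exceed the target.
        if current and current_size + size > target_mb:
            bins.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        bins.append(current)
    return bins


# Eight small files (sizes in MB, invented for the example).
small_files = [8, 16, 4, 64, 32, 100, 12, 20]
plan = plan_compaction(small_files)

for i, group in enumerate(plan):
    print(f"output file {i}: {sum(group)} MB from {len(group)} input files")
```

With these made-up sizes, eight input files collapse into three output files, so a reader would open three files instead of eight while scanning the same amount of data.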

christopherfinlan.com

February 10, 2026 at 8:33 PM

Christopher Finlan

@cmfinlan.bsky.social

OneLake catalog in Microsoft Fabric: Explore, Govern, and Secure

If your Fabric tenant has grown past "a handful of workspaces," the problem isn’t just storage or compute—it’s finding the right items, understanding what they are, and making governance actionable. That’s the motivation behind the OneLake catalog: a central hub to discover and manage Fabric content, with dedicated experiences for discovery (Explore), governance posture (Govern), and security administration (Secure). This post is a practical walk-through of what’s available today, with extra focus on what Fabric admins get in the Govern…

christopherfinlan.com

February 10, 2026 at 3:00 PM

Christopher Finlan

@cmfinlan.bsky.social

Understanding Spark Execution in Microsoft Fabric

Spark performance work is mostly execution work: understanding where the DAG splits into stages, where shuffles happen, and why a handful of tasks can dominate runtime. This post is a quick, practical refresher on the Spark execution model, with Fabric-specific pointers on where to observe jobs, stages, and tasks.

1) The execution hierarchy: Application → Job → Stage → Task

In Spark, your code runs as a Spark application. When you run an action (for example, count(), collect(), or writing a table), Spark submits a job…
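As a rough mental model of how a job's DAG splits into stages, the toy sketch below cuts a linear list of operations at each shuffle-inducing step. It is a deliberate simplification written in plain Python, not Spark's scheduler: the `WIDE` set, the `split_into_stages` helper, and the sample pipeline are assumptions for illustration, and in real Spark some of these operations (a broadcast join, for example) may not shuffle at all.

```python
# Operations treated as shuffle boundaries in this toy model (an assumption).
WIDE = {"groupBy", "join", "repartition", "distinct", "orderBy"}


def split_into_stages(operations):
    """Start a new stage whenever a shuffle-inducing operation appears."""
    stages, current = [], []
    for op in operations:
        # A wide operation needs data from all upstream partitions,
        # so everything before it is closed off as its own stage.
        if op in WIDE and current:
            stages.append(current)
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


pipeline = ["read", "filter", "select", "groupBy", "withColumn", "join", "select"]
stages = split_into_stages(pipeline)

for i, stage in enumerate(stages):
    print(f"stage {i}: {' -> '.join(stage)}")
```

In this model the narrow steps (read, filter, select) are pipelined into one stage, and each wide step opens a new one, which is why the number of shuffles largely determines the number of stages you see for a job.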

christopherfinlan.com

February 9, 2026 at 9:22 PM

Christopher Finlan

@cmfinlan.bsky.social

Fabric Spark Shuffle Tuning: AQE + partitions for Faster Joins

Shuffles are where Spark jobs go to get expensive: a wide join or aggregation forces data to move across the network, materialize shuffle files, and often spill when memory pressure spikes. In Microsoft Fabric Spark workloads, the fastest optimization is usually the boring one: avoid the shuffle when you can, and when you can’t, make it smaller and better balanced. This post lays out a practical, repeatable approach you can apply in Fabric notebooks and Spark job definitions.

1) Start with the simplest win: avoid the shuffle

If one side of your join is genuinely small (think lookup/dimension tables), use a broadcast join so Spark ships the small table to executors and avoids a full shuffle.
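The mechanism behind a broadcast join can be sketched in plain Python: every partition of the large table gets a full copy of the small table as a hash lookup and joins locally, so no large-table row has to move. This is a conceptual sketch, not Spark code; `broadcast_hash_join` and the sample rows are hypothetical. In PySpark itself, the usual way to request this behavior is the `broadcast` hint from `pyspark.sql.functions`.

```python
from collections import defaultdict


def broadcast_hash_join(fact_partitions, dim_rows, key):
    """Inner-join each fact partition against a full copy of the small table.

    Hypothetical helper for illustration; it mimics the idea of a
    broadcast hash join, not Spark's implementation.
    """
    # Build the lookup once; conceptually this is what gets shipped to every executor.
    lookup = defaultdict(list)
    for row in dim_rows:
        lookup[row[key]].append(row)

    joined_partitions = []
    for partition in fact_partitions:
        # Each partition joins locally, so no fact row changes partition.
        joined = [
            {**fact, **dim}
            for fact in partition
            for dim in lookup.get(fact[key], [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Large "fact" table, already split into two partitions (invented sample data).
fact_partitions = [
    [{"order_id": 1, "country": "US"}, {"order_id": 2, "country": "DE"}],
    [{"order_id": 3, "country": "US"}, {"order_id": 4, "country": "FR"}],
]
# Small "dimension" table that is cheap to copy everywhere.
dim_rows = [
    {"country": "US", "region": "Americas"},
    {"country": "DE", "region": "EMEA"},
]

result = broadcast_hash_join(fact_partitions, dim_rows, "country")
print(result)
```

The output keeps the same two partitions the fact table started with, which is the point: the cost is copying the small table to each partition rather than redistributing the large one. That trade only pays off while the small side stays small enough to fit comfortably in memory.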

christopherfinlan.com

February 6, 2026 at 3:03 PM

Christopher Finlan

@cmfinlan.bsky.social

OneLake Shortcuts + Spark: Practical Patterns for a Single Virtual Lakehouse

If you’ve adopted Microsoft Fabric, there’s a good chance you’re trying to reduce the number of ‘copies’ of data that exist just so different teams and engines can access it. OneLake shortcuts are one of the core primitives Fabric provides to unify data across domains, clouds, and accounts by making OneLake a single virtual data lake namespace. For Spark users specifically, the big win is that shortcuts appear as folders in OneLake—so Spark can read them like any other folder—and Delta-format shortcuts in the Lakehouse Tables area can be surfaced as tables.

christopherfinlan.com

February 5, 2026 at 3:02 PM

Christopher Finlan

@cmfinlan.bsky.social

When ‘Native Execution Engine’ Doesn’t Stick: Debugging Fabric Environment Deployments with fabric-cicd

If you’re treating Microsoft Fabric workspaces as source-controlled assets, you’ve probably started leaning on code-first deployment tooling (either Fabric’s built-in Git integration or community tooling layered on top). One popular option is the open-source fabric-cicd Python library, which is designed to help implement CI/CD automations for Fabric workspaces without having to interact directly with the underlying Fabric APIs. For most Fabric items, a ‘deploy what’s in Git’ model works well—until you hit a configuration that looks like it’s in source control, appears in deployment logs, but still doesn’t land in the target workspace.

christopherfinlan.com

February 3, 2026 at 3:00 PM

Christopher Finlan

@cmfinlan.bsky.social

New OSS drop: Sparkwise (PyPI: sparkwise). Built by Santhosh Kumar Ravindran to help teams improve Fabric Spark price/perf with automated diagnostics + profiling. If you run Spark in Fabric, this will save you time and vCores.

Sparkwise: an “automated data engineering specialist” for Fabric Spark tuning

Spark tuning has a way of chewing up time: you start with something that “should be fine,” performance is off, costs creep up, and suddenly you’re deep in configs, Spark UI, and tribal knowledge trying to figure out what actually matters. That’s why I’m excited to highlight sparkwise, an open-source Python package created by Santhosh Kumar Ravindran, one of my direct reports here at Microsoft. Santhosh built sparkwise to make Spark optimization in Microsoft Fabric less like folklore and more like a repeatable workflow: automated diagnostics, session profiling, and actionable recommendations to help teams drive better price-performance without turning every run into an investigation.

christopherfinlan.com

January 5, 2026 at 9:42 PM

Christopher Finlan

@cmfinlan.bsky.social

Gil Gerard, Buck Rogers, and the Kind of Grief That Shows Up in December

Gil Gerard's departure reminds us that some celebrities aren't just actors; they're the comforting echoes of our past. Buck Rogers was more than a show—it was a place that shaped our childhood optimism.

christopherfinlan.com

December 18, 2025 at 2:05 AM

Christopher Finlan

@cmfinlan.bsky.social

Build Your Own Spark Job Doctor in Microsoft Fabric

Microsoft Fabric simplifies Spark workload management, but diagnosing performance issues remains challenging. This post introduces the "Job Doctor," an AI tool that analyzes Spark telemetry to identify problems like skew or excessive shuffles, generates human-readable diagnoses, and suggests fixes. The implementation integrates with Azure AI for optimized Spark job management.

christopherfinlan.com

December 5, 2025 at 7:43 PM

Christopher Finlan

@cmfinlan.bsky.social

Time to Automate: Why Sports Card Grading Needs an AI Revolution

As I head to the National for the first time, this is a topic I have been thinking about for quite some time, and a recent video inspired me to put this together with help from ChatGPT’s o3 model doing deep research. Enjoy!

Introduction: Grading Under the Microscope

Sports card grading is the backbone of the collectibles hobby – a PSA 10 vs PSA 9 on the same card can mean thousands of dollars of difference in value. Yet the process behind those grades has remained stubbornly old-fashioned, relying on human eyes and judgment.

christopherfinlan.com

July 29, 2025 at 11:47 PM

Christopher Finlan

@cmfinlan.bsky.social

Humans + Machines: From Co-Pilots to Convergence — A Friendly Response to Josh Caplan’s “Interview with AI”

1. Setting the Table

Josh, I loved how you framed your conversation with ChatGPT-4o around three crisp horizons: 5, 25 and 100 years. It’s a structure that forces us to check our near-term expectations against our speculative impulses. Below I’ll walk through each horizon, point out where my own analysis aligns or diverges, and defend those positions with the latest data and research.

2. Horizon #1 (≈ 2025-2030): The Co-Pilot Decade

Where we agree

You write that “AI will write drafts, summarize meetings, and surface insights … accelerating workflows without replacing human judgment.” Reality is already catching up:

christopherfinlan.com

July 15, 2025 at 3:12 AM

Christopher Finlan

@cmfinlan.bsky.social

Please - don't use whilst.

🎩 Retire Your Top Hat: Why It’s Time to Say Goodbye to “Whilst”

There’s a word haunting documents, cluttering up chat messages, and lurking in email threads like an uninvited character from Downton Abbey. That word is whilst. Let’s be clear: no one in the United States says this unironically. Not in conversation. Not in writing. Not in corporate life. Not unless they’re also saying “fortnight,” “bespoke,” or “I daresay.”

It’s Not Just Archaic, It’s Distracting

In American English, whilst is the verbal equivalent of someone casually pulling out a monocle in a team meeting. It grabs attention, but not the kind you want.

christopherfinlan.com

July 9, 2025 at 4:12 PM

Christopher Finlan

@cmfinlan.bsky.social

The Rise and Heartbreak of Antonio McDyess: A Superstar’s Path Cut Short

Note: Antonio McDyess is one of my favorite players that no one I know seems to know or remember, so I asked ChatGPT Deep Research to help tell the story of his rise to the cusp of superstardom. Do a YouTube search for McDyess highlights - it’s a blast.

Humble Beginnings and Early Promise

Antonio McDyess hailed from small-town Quitman, Mississippi, and quickly made a name for himself on the basketball court. After starring at the University of Alabama – where he led the Crimson Tide in both scoring and rebounding as a sophomore – McDyess entered the star-studded 1995 NBA Draft.

christopherfinlan.com

June 29, 2025 at 8:11 PM

Christopher Finlan

@cmfinlan.bsky.social

Enjoyed putting this together with ChatGPT's Deep Research capabilities - let me know if you like it!

Microsoft Fabric Capacity Management: A Comprehensive Guide for Administrators (using ChatGPT’s Deep Research)

Author's note - I have enjoyed playing around with the Deep Research capabilities of ChatGPT, and I had it put together what it felt was the definitive whitepaper on Capacity Management for Microsoft Fabric. It basically just used the Microsoft documentation (plus a couple of community posts) to pull it together, so I'm curious what you think. I'll leave a link to download the PDF copy of this at the end of the post.

Executive Summary

Microsoft Fabric capacities provide the foundational compute resources that power the Fabric analytics platform. They are essentially dedicated pools of compute (measured in Capacity Units or CUs) allocated to an organization’s Microsoft Fabric tenant.

christopherfinlan.com

May 21, 2025 at 12:53 AM

Christopher Finlan

@cmfinlan.bsky.social

The team worked a long time to make this a reality for folks - so excited it is finally here!! True serverless billing for Spark in Fabric!

Introducing Autoscale Billing for Spark in Microsoft Fabric - blog.fabric.microsoft.com/en/blog/intr...

Microsoft Fabric Blog

Keep up with the latest Microsoft Fabric updates, announcements, information, & new features on the Microsoft Fabric blog. Search by category or date published.

https://blog.fabric.microsoft.com/en/blog/introd…

March 31, 2025 at 5:30 PM
