If Spark performance work is surgery, monitoring is your live telemetry. Microsoft Fabric gives you multiple monitoring entry points for Spark workloads: Monitor hub for cross-item visibility, item Recent runs for focused context, and…
If your Lakehouse tables are getting slower (or more expensive) over time, it’s often not "Spark is slow." It’s usually table layout drift: too many small files, suboptimal clustering, and old files piling up. In Fabric Lakehouse, the…
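The small-file problem above can be made concrete without Spark at all. This is an illustrative sketch in plain Python of what a compaction pass (like Delta Lake's `OPTIMIZE`) does: bin many tiny files into a few files near a target size. The 1 MB file sizes and the 128 MB target are assumptions chosen for the example, not Fabric defaults.

```python
# Sketch: greedy bin-packing of small files toward a target file size,
# the core idea behind a compaction pass such as Delta Lake's OPTIMIZE.
TARGET_BYTES = 128 * 1024 * 1024  # assumed compaction target for this example

def plan_compaction(file_sizes_bytes):
    """Greedily group files into bins of at most TARGET_BYTES each.
    Each bin is a set of files a rewrite pass would compact together."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes_bytes):
        if current and current_size + size > TARGET_BYTES:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

# A table that drifted into 1,000 tiny 1 MB files means ~1,000 file
# opens per scan; after compaction the same bytes live in a handful.
small_files = [1 * 1024 * 1024] * 1000
plan = plan_compaction(small_files)
print(f"{len(small_files)} files -> {len(plan)} compacted files")
```

Fewer, larger files means fewer file-open and footer-read round trips per scan, which is where most of the "table got slower over time" cost hides.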
If your Fabric tenant has grown past "a handful of workspaces," the problem isn’t just storage or compute—it’s finding the right items, understanding what they are, and making governance actionable. That’s the motivation behind the…
Spark performance work is mostly execution work: understanding where the DAG splits into stages, where shuffles happen, and why a handful of tasks can dominate runtime. This post is a quick, practical refresher on the Spark execution model — with…
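The "handful of tasks dominate runtime" point can be shown with a tiny plain-Python model (no Spark needed): a stage finishes only when its slowest task finishes, so stage wall-clock time tracks the largest partition, not the average. The row counts and throughput below are made-up numbers for illustration.

```python
# Sketch: why skew hurts. A stage's runtime is governed by max(task),
# not mean(task), because every task in the stage must finish before
# the next stage (past a shuffle boundary) can start.

def stage_runtime_seconds(partition_rows, rows_per_second=1_000_000):
    """Model a stage's wall-clock time as its largest task's time."""
    return max(partition_rows) / rows_per_second

# Same total rows (200M) split two ways:
even   = [1_000_000] * 200                    # balanced partitions
skewed = [100_000] * 199 + [180_100_000]      # one hot key

print(stage_runtime_seconds(even))    # all tasks finish together
print(stage_runtime_seconds(skewed))  # one task holds the stage open
```

This is why skew mitigation (salting, AQE skew-join handling) targets the largest partition specifically: shrinking the average does nothing if the maximum stays put.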
Shuffles are where Spark jobs go to get expensive: a wide join or aggregation forces data to move across the network, materialize shuffle files, and often spill when memory pressure spikes. In Microsoft Fabric Spark workloads, the…
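One standard way a wide join avoids the shuffle entirely is a broadcast-hash join: if one side fits under a size threshold, Spark ships the small table to every executor instead of repartitioning both sides across the network. Below is a plain-Python sketch of that planner-style decision; the 10 MB figure mirrors Spark's default `spark.sql.autoBroadcastJoinThreshold`, and the table sizes are hypothetical.

```python
# Sketch: the size-based decision that lets Spark skip a shuffle.
BROADCAST_THRESHOLD = 10 * 1024 * 1024  # Spark's default: 10 MB

def join_strategy(left_bytes, right_bytes):
    """Pick broadcast-hash when either side is small enough;
    otherwise fall back to a shuffle-based sort-merge join."""
    if min(left_bytes, right_bytes) <= BROADCAST_THRESHOLD:
        return "broadcast-hash join (big side stays in place)"
    return "sort-merge join (both sides shuffled over the network)"

print(join_strategy(500 * 1024**3, 2 * 1024**2))   # 500 GB vs 2 MB dim table
print(join_strategy(500 * 1024**3, 40 * 1024**3))  # 500 GB vs 40 GB: must shuffle
```

When neither side is broadcastable, the levers shift to reducing shuffled bytes instead: pre-aggregating, pruning columns before the join, and sizing shuffle partitions so spill stays rare.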
If you’ve adopted Microsoft Fabric, there’s a good chance you’re trying to reduce the number of ‘copies’ of data that exist just so different teams and engines can access it. OneLake shortcuts are one of the core…
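The "fewer copies" idea shows up at the path level: a shortcut surfaces under a lakehouse's `Tables/` or `Files/` folder like local data, so engines address it through the same OneLake URI scheme whether the bytes live there or are linked in. A minimal sketch, assuming the standard OneLake abfss addressing pattern; the workspace, lakehouse, and table names are hypothetical.

```python
# Sketch: one addressing scheme for native tables and shortcuts alike.
def onelake_table_path(workspace, lakehouse, table):
    """Build the abfss URI used to read a lakehouse table; a shortcut
    target resolves through the same path shape as a native table."""
    return (f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
            f"{lakehouse}.Lakehouse/Tables/{table}")

# A native table and a shortcut into another workspace's data are both
# just entries under Tables/ -- no second physical copy to address.
print(onelake_table_path("Sales", "Gold", "orders"))
print(onelake_table_path("Sales", "Gold", "orders_shortcut"))
```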
If you’re treating Microsoft Fabric workspaces as source-controlled assets, you’ve probably started leaning on code-first deployment tooling (either Fabric’s built-in Git integration or…
Microsoft Fabric simplifies Spark workload management, but diagnosing performance issues remains challenging. This post introduces the "Job Doctor," an AI tool that analyzes Spark telemetry to identify problems like skew or excessive shuffles,…
As I head to the National for the first time, this is a topic I have been thinking about for quite some time, and a recent video inspired me to put this together with help from ChatGPT’s o3 model doing deep research. Enjoy!…
1. Setting the Table Josh, I loved how you framed your conversation with ChatGPT-4o around three crisp horizons — 5, 25 and 100 years. It’s a structure that forces us to check our near-term…
Introducing Autoscale Billing for Spark in Microsoft Fabric - blog.fabric.microsoft.com/en/blog/intr...