In this series of posts, we’d like to go through different aspects of Microsoft Fabric capacity management. We start with Microsoft Fabric capacity overloads, one of the biggest headaches when using this service. Why? Because overloads can paralyze a significant part of an organization’s reporting capabilities and, without a proper strategy, can also drive costs up significantly.
Microsoft Fabric uses a capacity-based model. This means that when purchasing a SKU, you’re buying a fixed amount of compute resources, measured in Capacity Units (CUs). For a deeper understanding of this concept, we highly recommend the excellent article series by Matthew Farrow:
🔗 What is the Capacity Model? – by Matthew Farrow
These resources are shared across all workspaces assigned to the capacity, including reports and dashboards used by end users.
In Premium Capacity Gen2, Microsoft introduced the concepts of bursting and smoothing.
- Bursting allows an operation to temporarily use more resources than the capacity provides, so that it finishes faster.
- Smoothing spreads out the consumption of Capacity Units (CUs) over time, rather than consuming them all at the exact moment an operation occurs.
Background operations (mostly data refreshes) are smoothed over 24 hours. Interactive operations (such as users clicking on reports) are smoothed over at least 5 minutes, and the window can extend up to 64 minutes depending on how many CUs the operation consumes. For example, a semantic model refreshed at 1 PM on Monday will keep counting against the capacity until 1 PM on Tuesday.
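To make smoothing more concrete, here is a minimal Python sketch of how the cost of a single operation is spread across its smoothing window. It assumes that usage is evaluated in 30-second timepoints. The `Operation` class, the function name, and the CU-second figures are made up for illustration and do not come from any Fabric API.

```python
from dataclasses import dataclass

# Assumption: capacity usage is evaluated in 30-second timepoints.
TIMEPOINT_SECONDS = 30


@dataclass
class Operation:
    name: str
    cu_seconds: float       # total cost of the operation, in CU-seconds
    smoothing_minutes: int  # 1440 for background, 5 to 64 for interactive


def smoothed_cu_per_timepoint(op: Operation) -> float:
    """CU-seconds charged to each timepoint inside the smoothing window."""
    timepoints = op.smoothing_minutes * 60 // TIMEPOINT_SECONDS
    return op.cu_seconds / timepoints


# Hypothetical operations: a heavy refresh and a much cheaper report query.
refresh = Operation("semantic model refresh", cu_seconds=86_400, smoothing_minutes=24 * 60)
query = Operation("report query", cu_seconds=600, smoothing_minutes=5)

refresh_share = smoothed_cu_per_timepoint(refresh)  # 86,400 / 2,880 timepoints
query_share = smoothed_cu_per_timepoint(query)      # 600 / 10 timepoints

print(f"{refresh.name}: {refresh_share} CU-seconds per timepoint")
print(f"{query.name}: {query_share} CU-seconds per timepoint")
```

With these made-up numbers, the refresh costs 144 times more than the query in total, yet the query puts twice as much load on each timepoint (60 vs. 30 CU-seconds), because its cost is packed into 10 timepoints instead of 2,880.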
What happens when users consume more resources than the capacity allows?
Any usage above the limit creates a kind of CU debt. This debt is repaid using capacity from future time points. If the debt is small (e.g., repaid within 10 minutes), there’s usually no visible effect. But when the debt is larger, throttling is applied, and that’s where the real trouble begins.
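The idea of CU debt can be sketched in a few lines of Python. This is a simplified model, not the actual accounting that Fabric performs: it assumes a fixed CU budget per timepoint, and the function name and usage figures are hypothetical.

```python
def carry_forward(usage_per_timepoint, budget_per_timepoint):
    """Return the outstanding CU debt after each timepoint.

    Usage above the budget adds to the debt; unused budget in later
    timepoints pays it down. The debt never drops below zero.
    """
    debt = 0.0
    history = []
    for used in usage_per_timepoint:
        debt = max(0.0, debt + used - budget_per_timepoint)
        history.append(debt)
    return history


# Hypothetical smoothed usage for six consecutive timepoints (CU-seconds),
# against a budget of 100 CU-seconds per timepoint.
usage = [150, 180, 120, 60, 40, 40]
debt_history = carry_forward(usage, budget_per_timepoint=100)

print(debt_history)  # [50.0, 130.0, 150.0, 110.0, 50.0, 0.0]
```

In this example the capacity is overused for three timepoints, and it takes another three quieter timepoints to pay the debt back.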
Throttling is a mechanism that prevents over-utilization of capacity resources. Because capacity is shared, it impacts all users working with content in that capacity. There are three phases of throttling:
- Interaction Delays – All new interactive operations are delayed by 20 seconds. This is when users begin complaining about reports being slow. It may seem minor, but when someone is presenting to a client or CEO, even this delay can be disruptive.
- Interactive Rejections – Now it gets serious. Reports start failing completely, throwing errors. Users can no longer use them.
- Background Rejections – The worst-case scenario. All background operations are rejected. The capacity is essentially down.
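The phases above can be sketched as a simple classifier. The thresholds (10 minutes, 60 minutes, and 24 hours of future capacity already consumed) are not stated in this post; they follow our reading of Microsoft's public throttling documentation and should be treated as an assumption that may change. The function names and the example figures are hypothetical.

```python
def minutes_of_future_capacity(debt_cu_seconds: float, capacity_cu: int) -> float:
    """How many minutes of the capacity's full output the debt represents."""
    return debt_cu_seconds / capacity_cu / 60


def throttling_phase(future_minutes_consumed: float) -> str:
    """Map consumed future capacity to a throttling phase.

    Assumed thresholds: 10 minutes, 60 minutes, 24 hours.
    """
    if future_minutes_consumed <= 10:
        return "no throttling"
    if future_minutes_consumed <= 60:
        return "interactive delay"
    if future_minutes_consumed <= 24 * 60:
        return "interactive rejection"
    return "background rejection"


# Hypothetical example: a 64-CU capacity carrying 230,400 CU-seconds of debt.
minutes = minutes_of_future_capacity(230_400, capacity_cu=64)
phase = throttling_phase(minutes)

print(f"{minutes} minutes of future capacity consumed: {phase}")
```

A check like this could sit behind an alert, so that administrators hear about an overload while it is still in the delay phase rather than after reports start failing.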
Common causes of Microsoft Fabric capacity overloads:
- User development & testing – Frequent refreshing of datasets and models during testing consumes background capacity, leaving less room for interactive use.
- Non-optimal semantic models – Heavy transformations and large data volumes lead to high background CU consumption.
- Poorly configured incremental refresh – This is often the most dangerous. The initial refresh processes all partitions. If it isn’t configured correctly (e.g., query folding is missing or the query includes merges), it can result in extreme resource usage, impacting the capacity for the next 24 hours.
- Inefficient reports and DAX – Reports that allow large queries or have slow DAX logic can cause interaction spikes. Because interactive operations are smoothed over a short time (usually 5 minutes), even one inefficient report can trigger a CU spike. If it happens multiple times or overlaps with background activity, it can lead to overloads lasting hours.
In the next post, we will explore what actions you can take to handle overloads.
How does your organization manage capacity overloads in Microsoft Fabric? Let us know your experiences or challenges in the comments.