Welcome back to our ongoing series on Microsoft Fabric capacity management. So far, we’ve covered:
- Part 1: Causes and Consequences of Capacity Overloads
- Part 2: Actions to Take During Capacity Throttling
In this third installment, we’ll focus on a crucial aspect — how to alert users effectively when capacity throttling occurs.
The capacity is overloaded and throttling kicks in. Let’s assume we don’t want to restart the capacity, because doing so would bill all the accumulated CU debt at once. It will therefore take some time until the capacity burns down the outstanding CU and returns to normal. In the meantime, let’s think about how to alert users.
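To give users an expected resolution time, it helps to have a rough idea of how long the burn-down will take. The sketch below is a back-of-the-envelope estimate, not how Fabric itself computes it: the function name and parameters are hypothetical, and it assumes the load stays constant and that only the unused share of the capacity pays off the debt.

```python
def estimate_burndown_minutes(debt_cu_seconds: float,
                              capacity_cu: float,
                              utilization: float) -> float:
    """Rough time in minutes to burn down carried-forward CU debt.

    Assumptions (simplifications for illustration):
    - utilization is a constant fraction of the capacity, e.g. 0.5 for 50%
    - the unused share of the capacity is what pays off the debt
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for the debt to shrink")
    spare_cu_per_second = capacity_cu * (1 - utilization)
    return debt_cu_seconds / spare_cu_per_second / 60


# Example: 115,200 CU-seconds of debt on a 64 CU capacity running at 50% load
print(estimate_burndown_minutes(115_200, 64, 0.5))  # 60.0
```

If the load stays near 100%, there is no spare capacity and the debt does not shrink, which is why the function rejects a utilization of 1 or more.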
The first question is: how does the Fabric management team learn that the issue is occurring at all? Of course, they might hear it from users who escalate slow reports. However, it is better if they know beforehand, so they can minimize the impact on the business and demonstrate professionalism. Before complaints from worried users reach them, they can check how severe the throttling is and plan the next actions. Microsoft allows configuring notifications for the capacity. For example, we could have an email sent to capacity admins (or selected users) when capacity utilization reaches 100%. We may also set this threshold lower, e.g., 90%, but that could generate many false alarms.
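One way to keep a lower threshold without drowning in false alarms is to alert only when utilization stays above it for several readings in a row. The following is a minimal sketch of that idea for a custom monitoring script; it is not a feature of the built-in Fabric notification, and the function name, parameters, and sample values are all hypothetical.

```python
def should_alert(samples, threshold=100.0, min_consecutive=3):
    """Return True if the latest `min_consecutive` utilization readings
    (in percent) are all at or above `threshold`."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(samples) < min_consecutive:
        return False  # not enough history yet
    return all(s >= threshold for s in samples[-min_consecutive:])


# A single reading above 90% does not trigger an alert...
print(should_alert([72, 95, 88, 93, 91], threshold=90))   # False
# ...but three consecutive readings above 90% do.
print(should_alert([85, 92, 96, 101], threshold=90))      # True
```

The trade-off is latency: the more consecutive readings you require, the later the alert arrives.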
Now, let’s move to the business users. The first option, just like in the case of taking actions, is to do nothing. It depends on the organization and how severe the throttling is. If it’s minor and will resolve in a few minutes, the issue may be over before you finish writing a message and users read it.
If the throttling is more severe, good practice would require informing the affected users. By now, they are probably starting to realize something is wrong and reports are slow. They may ask their teammates if they have the same issue, wonder if it’s caused by their network or laptop, consider whether it’s a Microsoft service issue, or suspect someone overloaded the capacity again. The Fabric administrator could, for example, write an email to dispel those doubts and inform users about the outage and expected resolution time.
Another type of communication targets the user(s) responsible for the outage. They are most likely unaware of how their actions affect others. After checking the monitoring tool (such as Fabric Capacity Metrics or custom solutions, which we will cover in the next article) to identify the artifacts causing issues, the admins should reach out to the owners and inform them. If they are in heavy development and repeatedly hitting refresh, ask them to pause until capacity stabilizes. If overloads are caused by users running reports with inefficient DAX measures, reach out to them and explain the situation.
The third option is a combination of the two above, plus automation. For example, whenever an email notification about capacity reaching 100% is sent to capacity administrators, business users are alerted automatically via email or a Teams post. Imagine such a message containing a graph or list of the recent top CU consumers, highlighting the artifacts causing the problem. This way, all affected users know to expect slower reports, and the owners of those high-CU items understand that their work is contributing to the issue, which motivates them to optimize and avoid future overloads. We will share more about our solutions in upcoming posts.
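As an illustration of what such an automated message could contain, here is a minimal sketch that builds the text from a list of usage records. Everything in it is an assumption for the example: the `ItemUsage` fields, the function name, and the sample data are hypothetical, and in practice the records would come from your monitoring source. Sending the text by email or to Teams is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class ItemUsage:
    workspace: str
    item: str
    owner: str
    cu_seconds: float


def build_throttling_message(usages, utilization_pct, top_n=3):
    """Build a plain-text alert listing the top CU consumers."""
    total = sum(u.cu_seconds for u in usages)
    # Highest consumers first, limited to the top N
    top = sorted(usages, key=lambda u: u.cu_seconds, reverse=True)[:top_n]
    lines = [
        f"Capacity utilization reached {utilization_pct:.0f}%. "
        "Reports may be slower than usual.",
        "Top CU consumers in the recent window:",
    ]
    for rank, u in enumerate(top, start=1):
        share = 100 * u.cu_seconds / total if total else 0.0
        lines.append(
            f"{rank}. {u.workspace} / {u.item} (owner: {u.owner}): {share:.0f}% of CU"
        )
    return "\n".join(lines)


# Hypothetical sample data
usages = [
    ItemUsage("Sales", "Daily Refresh", "anna", 600),
    ItemUsage("Finance", "Budget Report", "ben", 300),
    ItemUsage("HR", "Headcount", "cara", 100),
]
print(build_throttling_message(usages, utilization_pct=104, top_n=2))
```

Listing the owner next to each item serves both audiences at once: affected users see why reports are slow, and owners see that their items are involved.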
How is this handled in your organization? Do you have a process to inform users about capacity throttling? We’ll explore this further in the next article.
#MicrosoftFabric #Fabric #Optimization #MicrosoftFabricOptimization #SelfService #CitizenDeveloper #CapacityOverload #Throttling #FabricMonitoring #CapacityAlerts #Alert