Netreo Essentials allows you to automatically scale Azure Cloud Roles, Azure Web Apps, SQL Azure Databases and SQL Data Warehouses by adjusting the number of instances/tier-levels to the actual or expected demand. Auto-scaling may be configured in the Scale Adjustments and Scale Ranges tabs in the Resource Configuration dialog.
It’s also possible to automatically scale VMs vertically; however, the mechanism differs from the one described in this article. VMs are scaled UP/DOWN by executing resizing Actions.
Auto-scaling is available in the Ultimate plan or during the initial trial period.
If you have a scenario not covered by this article, please open a support ticket with the Netreo Essentials team at http://support.cloudmonix.com. We frequently assist customers with their specific, nuanced scenarios.
How does auto-scaling work?
Scale Adjustments define how rapidly Netreo Essentials should change the number of instances or tier levels (e.g., by 1 or by 5 at a time). They also specify the conditions under which the number of instances or tier levels should change.
Scale Ranges complement Scale Adjustments by defining scaling boundaries (i.e., the minimum and maximum number of instances, or the lower and upper boundary for tier levels, that can run at once). The minimum and maximum may be equal, which pins the resource to an exact number of instances or to a specific tier.
Often there are multiple Ranges defined for a single resource (e.g., scale boundaries might vary depending on the time of the day, day of the week or current metrics values). Therefore it is also possible to specify under what conditions the specific Scale Range should be applied.
For example, many websites have high traffic during the day and few visitors at night. In this case, two Scale Ranges might be defined, “Day” and “Night.” While the “Day” range is active, Netreo Essentials keeps the instance count between 3 and 7; while the “Night” range is active, it keeps the count between 1 and 3. The exact number of running instances is adjusted automatically by Scale Adjustments based on the actual load conditions, but it always respects the boundaries defined by the active Scale Range.
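The Day/Night example can be sketched in a few lines of Python. This is an illustration only, not Netreo Essentials code: the ScaleRange class, the bounded_instances function, and the 08:00 to 20:00 hours chosen for “Day” are assumptions made for the example. Only the instance limits (3 to 7 and 1 to 3) come from the text above.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ScaleRange:
    """Hypothetical model of a time-based Scale Range."""
    name: str
    start: time          # inclusive
    end: time            # exclusive
    min_instances: int
    max_instances: int

    def is_active(self, now: time) -> bool:
        if self.start <= self.end:
            return self.start <= now < self.end
        # The window wraps past midnight (e.g. 20:00 to 08:00).
        return now >= self.start or now < self.end

    def clamp(self, desired: int) -> int:
        # Keep the desired count inside this range's boundaries.
        return max(self.min_instances, min(self.max_instances, desired))


# Assumed schedule: the hours are illustrative, the limits follow the example.
RANGES = [
    ScaleRange("Day", time(8, 0), time(20, 0), 3, 7),
    ScaleRange("Night", time(20, 0), time(8, 0), 1, 3),
]


def bounded_instances(desired: int, now: time) -> int:
    """Clamp the count requested by an adjustment to the first active range."""
    for scale_range in RANGES:
        if scale_range.is_active(now):
            return scale_range.clamp(desired)
    return desired  # no range is active, so no boundary applies
```

With this schedule, a request for 10 instances at noon is capped at 7, and a request for 0 instances at 03:00 is raised to 1.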
See the Rule Evaluation in Detail section for more information.
What can you auto-scale by?
Scaling adjustments and ranges can be based on expressions that are made up of metrics. Any Metric available in any of the resources tracked by Netreo Essentials can be used for auto-scaling conditions, including:
Standard metrics tracked for the scaled resource including Aggregated and Derived metrics.
Metrics tracked for other resources within the same Account, such as Azure Service Bus Queues and Topics, Storage Queues, and other custom data from SQL Azure, JSON APIs, etc. Metrics tracked by other resources need to be imported into the scaling resource via the Linked Metrics approach.
Auto-scaling can be based on either proactive or reactive indicators, depending on which metrics are used to define the conditions.
Auto-scaling based on reactive indicators
In the case of auto-scaling based on reactive indicators, the number of instances is adjusted to the actual, observed load. The most popular metrics used for determining current load include CPU usage, free memory and number of Requests/sec.
For example, the following configuration is used to automatically spin up new instances when the 15-minute average CPU usage is higher than 50%.
Typically, the rules for adding instances have corresponding rules for removing instances. So there should be another rule responsible for removing the instance (e.g., when the 15-minute average CPU usage gets lower than 25% and stays at that level for 10 minutes).
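The pair of CPU rules described above can be sketched as follows. This is a simplified illustration, not the product's expression syntax: cpu_adjustment is a hypothetical function, it assumes one CPU reading per minute over the last 15 minutes, it changes the count by one instance at a time, and it leaves out the “stays at that level for 10 minutes” requirement.

```python
from statistics import mean


def cpu_adjustment(cpu_samples, up_threshold=50.0, down_threshold=25.0):
    """Return +1, -1 or 0 based on the average of recent CPU readings.

    cpu_samples: non-empty list of CPU percentages, assumed to cover the
    last 15 minutes (one reading per minute).
    """
    average = mean(cpu_samples)
    if average > up_threshold:
        return +1   # load is high: add an instance
    if average < down_threshold:
        return -1   # load is low: remove an instance
    return 0        # between the thresholds: leave the count unchanged
```

Keeping a gap between the two thresholds (50% and 25% here) means that an average hovering around a single value does not trigger alternating add and remove operations.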
Auto-scaling based on proactive indicators
In the case of auto-scaling based on proactive indicators, the number of instances is adjusted to the predicted load. The most popular leading indicators used by Netreo Essentials customers include Azure Service Bus queue lengths and custom data in SQL Azure obtained using SQL scripts.
Refer to the Can I auto-scale my Azure Cloud Services based on Azure Queues, Service Bus queues or topics? article to learn more.
Suitable metrics can be found by analyzing the system and identifying the earliest signal of increasing demand. In many cases it helps to look at historical data from the system (e.g., using the Netreo Essentials Historical Dashboard).
Rule Evaluation in Detail
Rule execution order
During the evaluation of auto-scaling rules, Netreo Essentials first evaluates the auto-scaling ranges (in order from first to last) to determine which auto-scaling range qualifies. The first qualifying auto-scaling range determines the boundaries for instance counts. Afterwards, Netreo Essentials evaluates conditions for auto-scaling adjustments (in order from first to last) to determine which adjustment needs to be applied.
In both cases, Netreo Essentials only picks the first qualifying scaling range and adjustment. The subsequent entries are ignored.
Users can manually adjust the order of their ranges and adjustments by clicking and holding the drag icon.
Once the auto-scaling range and adjustment are found, Netreo Essentials evaluates how many instances the resource should have. If that number differs from the current number of instances, a scaling action occurs and an alert with a defined severity is published.
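The evaluation order described above (first qualifying range, then first qualifying adjustment, then the resulting instance count) can be sketched like this. The Range and Adjustment types, the metric names “hour” and “cpu”, and the sample rules are all hypothetical. The sketch also assumes that the range boundaries are enforced even when no adjustment qualifies, which the text above does not state explicitly.

```python
from typing import Callable, NamedTuple

Metrics = dict  # metric name -> current value


class Range(NamedTuple):
    condition: Callable[[Metrics], bool]
    minimum: int
    maximum: int


class Adjustment(NamedTuple):
    condition: Callable[[Metrics], bool]
    delta: int  # instances to add (positive) or remove (negative)


def first_match(rules, metrics):
    """Return the first rule whose condition holds; later rules are ignored."""
    for rule in rules:
        if rule.condition(metrics):
            return rule
    return None


def target_instances(current, ranges, adjustments, metrics):
    active_range = first_match(ranges, metrics)
    adjustment = first_match(adjustments, metrics)
    target = current + (adjustment.delta if adjustment else 0)
    if active_range:
        # Assumption: boundaries apply whether or not an adjustment matched.
        target = max(active_range.minimum, min(active_range.maximum, target))
    return target


# Illustrative rules; order matters because only the first match is used.
ranges = [
    Range(lambda m: 8 <= m["hour"] < 20, 3, 7),   # "Day"
    Range(lambda m: True, 1, 3),                  # "Night" catch-all
]
adjustments = [
    Adjustment(lambda m: m["cpu"] > 50, +1),
    Adjustment(lambda m: m["cpu"] > 80, +5),      # never reached: rule above wins
    Adjustment(lambda m: m["cpu"] < 25, -1),
]
```

In this setup, a CPU value of 90% adds only one instance because the “above 50%” adjustment is listed first. Moving the “above 80%” adjustment to the top of the list would let it take effect, which is why the ordering of entries matters.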
The user can control how rapidly the number of instances can change by specifying a Cooling period.
The Cooling period may be defined in the Definition tab of the Resource Configuration dialog by providing Scale-DOWN and Scale-UP cooling periods. Those values represent the minimum wait period between successive scaling operations.
With the settings shown below, Netreo Essentials won’t add instances more often than every 15 minutes and won’t remove them more often than every 30 minutes. These are the default cooling periods that Netreo Essentials provides in all resource templates. The number of instances added or removed at once is specified in Scale Adjustments.
The cooling period is honored even when the condition specified by a Scale Adjustment is fulfilled, which allows the system to stabilize after a scaling adjustment before Netreo Essentials decides whether further changes are needed. If both the Scale-DOWN and Scale-UP cooling periods were set to zero, the scaling conditions would be evaluated at each monitoring cycle (every 1 minute for Professional and Ultimate plans). In most scenarios that interval would be too short, as the impact of the previous adjustment would not yet be reflected in the metric values.
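A cooling-period check might look like the sketch below. It uses the default values mentioned above (15 minutes for scale-up, 30 minutes for scale-down). The function name cooling_elapsed is hypothetical, and the sketch assumes the wait is measured from the most recent scaling operation in either direction; the product may track each direction separately.

```python
from datetime import datetime, timedelta
from typing import Optional

SCALE_UP_COOLING = timedelta(minutes=15)
SCALE_DOWN_COOLING = timedelta(minutes=30)


def cooling_elapsed(delta: int,
                    last_scaled_at: Optional[datetime],
                    now: datetime) -> bool:
    """Return True if a scaling operation of the given direction may run.

    delta: positive to add instances, negative to remove them.
    last_scaled_at: time of the previous scaling operation, or None if the
    resource has not been scaled yet (assumption: any direction counts).
    """
    if last_scaled_at is None:
        return True
    cooling = SCALE_UP_COOLING if delta > 0 else SCALE_DOWN_COOLING
    return now - last_scaled_at >= cooling
```

A scale-up requested 10 minutes after the previous operation is therefore postponed, even if the adjustment's condition is already true.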
The Sustained period specifies how long the expression condition must remain true before the action is executed. This prevents premature scaling adjustments caused by temporary spikes in metric values.
Typically, a shorter Sustained period is used for scaling-up operations (as it’s safer to have too many instances running to handle the load) and a longer Sustained period is used for removing instances.
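The Sustained period can be pictured as a timer that restarts whenever the condition stops being true. The SustainedCondition class below is a hypothetical sketch of that idea, assuming the condition is re-checked once per monitoring cycle; it is not how Netreo Essentials is implemented.

```python
from datetime import datetime, timedelta
from typing import Optional


class SustainedCondition:
    """Reports True only after a condition has held for `sustained`."""

    def __init__(self, sustained: timedelta):
        self.sustained = sustained
        self._true_since: Optional[datetime] = None

    def update(self, condition_is_true: bool, now: datetime) -> bool:
        if not condition_is_true:
            # A single false reading resets the timer, so short spikes
            # never accumulate into a scaling action.
            self._true_since = None
            return False
        if self._true_since is None:
            self._true_since = now
        return now - self._true_since >= self.sustained
```

Following the guidance above, a scale-up rule could use a short period such as timedelta(minutes=2) and a scale-down rule a longer one such as timedelta(minutes=10); both values are examples only.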