Alert Basics

 

CloudMonix has a sophisticated alert engine that allows very precise alerts to be published for very specific conditions.  CloudMonix alerts are configured, maintained and raised at the resource level.  The core component of an alert is its Expression, which governs when the alert is turned ON or OFF.  Learn more about expressions here.  An alert's expression evaluates metric values that are captured by CloudMonix.  Learn more about metrics here.


Definition

 

Alerts are defined on the Alerts tab of a particular resource.  A default configuration template is usually applied during the initial setup of a resource, so some alerts are likely to be predefined already when the resource is edited.

When creating new alerts or changing existing ones, the following configuration settings need to be provided so that CloudMonix can evaluate and raise alerts effectively.


Name: The alert's name is important.  It identifies the alert when raised or cleared alerts are communicated to users or integration endpoints.
Enabled: Alerts can be disabled when they are no longer needed but their configuration details should be preserved for future use.
Severity: Severity is another key setting.  Severities identify the importance of an alert, serve as visual indicators of alerts on the dashboard, and are communicated via notifications to users or integration endpoints.
Evaluate by Instance: This setting applies to, and is visible only on, "multi-instance" resources, i.e. resources that aggregate a number of identical instances, such as Azure Cloud Roles.  When this setting is true, CloudMonix evaluates metric-based conditions on a per-instance level.  When the setting is false, CloudMonix averages metrics across all instances before evaluating them.  For some alerts it is important to know when a particular instance within a resource misbehaves; for others it is important to know when all instances across the resource are impacted.

For example, checking for a low memory condition on Cloud Role instances is usually best done by instance, so that one can know which instance is running low.  However, checking Web Requests/sec should likely be done across all instances to determine the overall load on the website.
Expression: Expressions are the core component of alerts.  They define the condition under which an alert is raised and cleared.  Expressions are boolean statements with C#-like syntax that need to evaluate to either TRUE or FALSE.  More information about expressions can be found here.
Sustained Period: Expressed in minutes, the sustained period is the length of time for which an alert's condition must remain true before CloudMonix raises the alert.  It prevents alerts from triggering erroneously on highly volatile, frequently changing metrics or conditions.  When a sustained period is configured, CloudMonix evaluates the expression during every monitoring cycle and confirms that it remains true before finally raising the alert.
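
The difference between the two Evaluate by Instance modes can be illustrated with a small Python sketch.  This is not CloudMonix code: the function name `evaluate`, the instance names and the memory readings are all hypothetical, and the condition is written as a Python callable rather than a CloudMonix expression.

```python
from statistics import mean


def evaluate(condition, instance_values, by_instance):
    """Return {label: bool} for a metric condition.

    by_instance=True  -> one result per instance
    by_instance=False -> one result for the average across all instances
    """
    if by_instance:
        return {name: condition(value) for name, value in instance_values.items()}
    return {"all": condition(mean(instance_values.values()))}


# Hypothetical available-memory readings (MB) for three role instances.
memory_mb = {"web_0": 2100.0, "web_1": 180.0, "web_2": 1900.0}


def low_memory(mb):
    # Illustrative threshold only.
    return mb < 500


per_instance = evaluate(low_memory, memory_mb, by_instance=True)
averaged = evaluate(low_memory, memory_mb, by_instance=False)
```

With these made-up readings, the per-instance evaluation flags only `web_1`, while the averaged evaluation flags nothing because the two healthy instances mask the one running low.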







Raising & Clearing Alerts

 

As mentioned above, an alert is raised when its expression evaluates to TRUE for the specified sustained period.  The alert is cleared when the expression returns FALSE.  The alert's state does not change if the expression evaluates to any other value or to an error.  While an alert is raised, it is tracked under the Active Alerts dropdown on the portal.  Furthermore, whenever an alert is raised or cleared, it is published to people and integration API endpoints as governed by the Notification Management screen.  Learn more about notification management here.
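
The raise/clear rules above can be sketched as a small state machine in Python.  This is only an illustration under stated assumptions, not the actual CloudMonix implementation: the class `AlertState`, one-minute cycles, and the choice to leave the sustained-period timer untouched when an evaluation errors are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    sustained_minutes: float
    active: bool = False
    true_since: Optional[float] = None  # minute at which the expression first became TRUE

    def update(self, result, now_minutes):
        """Apply one cycle's expression result; return 'raised', 'cleared' or None."""
        if result is True:
            if self.true_since is None:
                self.true_since = now_minutes
            held_for = now_minutes - self.true_since
            if not self.active and held_for >= self.sustained_minutes:
                self.active = True
                return "raised"
        elif result is False:
            self.true_since = None
            if self.active:
                self.active = False
                return "cleared"
        # Any other value (an error is represented here as None) changes nothing.
        return None


alert = AlertState(sustained_minutes=2)
# One expression result per cycle; None stands in for an evaluation error.
results = [True, True, True, None, False]
events = [alert.update(result, minute) for minute, result in enumerate(results)]
```

In this run the alert is raised on the third cycle, once the expression has held TRUE for two minutes, stays raised through the errored cycle, and is cleared when the expression returns FALSE.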



Other Comments 


All enabled alerts for a resource are evaluated during every monitoring cycle.  Evaluation does not stop after one alert is found to be TRUE; the remaining alerts are still evaluated.
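
As a rough Python sketch of this behaviour, the loop below visits every enabled alert and never exits early.  The function `evaluate_all`, the alert names, the metric keys and the use of Python callables in place of CloudMonix expressions are all hypothetical.

```python
def evaluate_all(alerts, metrics):
    """Evaluate every enabled alert; never stop early when one is TRUE."""
    results = {}
    for alert in alerts:
        if not alert["enabled"]:
            continue  # disabled alerts keep their configuration but are skipped
        try:
            results[alert["name"]] = bool(alert["expression"](metrics))
        except Exception:
            # Evaluation error: report None so the caller leaves the alert state unchanged.
            results[alert["name"]] = None
    return results


# Hypothetical metric values captured during one cycle.
metrics = {"cpu": 95, "memory_mb": 300}

alerts = [
    {"name": "High CPU", "enabled": True, "expression": lambda m: m["cpu"] > 90},
    {"name": "Low memory", "enabled": True, "expression": lambda m: m["memory_mb"] < 500},
    {"name": "Disk full", "enabled": False, "expression": lambda m: m["disk_pct"] > 90},
    {"name": "Queue backlog", "enabled": True, "expression": lambda m: m["queue_length"] > 100},
]

results = evaluate_all(alerts, metrics)
```

Here both "High CPU" and "Low memory" come back TRUE in the same cycle, the disabled alert is skipped, and the alert whose metric is missing yields an error result instead of interrupting the loop.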