Resource Monitoring Settings


The Resource Configuration screen allows users to add and configure resources for monitoring and automation.  Configuration screens for all monitored resources look similar and are organized into the following tabs:



  • The Definition tab is where users enter general information about how CloudMonix should connect to the monitored resource.
  • The Metrics tab defines which metrics CloudMonix will track for a particular resource.
  • The Actions tab (Universal plan subscribers only) lets users specify which actions CloudMonix should trigger on monitored resources.
  • The Scheduled Actions tab (coming soon) (Universal plan subscribers only) lets users specify scheduled actions that CloudMonix will trigger on monitored resources.
  • The Scaling Ranges and Adjustments tabs (Universal plan subscribers only) let users specify how auto-scaling should work for certain resource types (Azure Cloud Roles only at this time).
  • The Advanced tab allows definition of more nuanced configuration settings.
  • The Test Results tab, together with the Test button, lets users validate their configuration and see what happens when CloudMonix executes a single monitoring cycle against their resource.



Definition Tab


Important Definition fields

Resource Name

Should be unique and make the resource easy to identify on dashboards and in alerts.

Credentials

Some resources may require credentials so that CloudMonix can successfully connect to and monitor them.  Credentials are stored in the Credential Vault.  Learn more about Credential Vault here.

Configuration Template

During initial setup of a new resource, CloudMonix will suggest starting with one of the predefined configuration templates to reduce setup time.  Users can also define their own templates by saving configured resources as templates via the 'Save as Template' button.  Learn more about Configuration Templates here.

Categories

Resources can be grouped together.  Each distinct resource category is displayed on a separate dashboard tab.  It is also possible to set up notification filters by resource category.  Learn more about notification filters here.



Metrics Tab

CloudMonix can track and monitor a variety of performance metrics, uptime statuses, logs, and more.  Types of metrics that CloudMonix can track vary from resource to resource.  Learn more about configuration options for individual resource types here.

CloudMonix can capture numeric, text-based, and complex multi-dimensional metric types, and lets users set up alert rules, healing actions, and auto-scaling adjustments based on the data in these metrics.


Basic Metric Configuration

Defining new metrics is usually straightforward.  Most metrics are resource-specific, with the exception of three categories that apply to all kinds of resources:

  1. AvailabilityMetric is a text-based metric that reports the availability status of a monitored resource.  Possible values for this metric are: Ready, Down, Stopped, and Unknown.  CloudMonix calculates AvailabilityMetric differently for each resource type.  Learn more about how CloudMonix calculates availability status for particular resources here.
  2. AggregateMetrics allow for time-based or set-based aggregations of numeric or complex metrics, and allow alerts, actions, or scaling adjustments to express complex criteria in simple ways.  Aggregate Metrics can, in real time, aggregate numeric data over time (up to 1 hour in the past), calculate counts of certain log entries, evaluate compound aggregate expressions, and more.  Generally speaking, metric aggregates are available only to paid or Trial users of CloudMonix.  Learn more about Aggregate Metrics here.
  3. LinkedMetrics allow importing of metric data from other monitored resources within the CloudMonix account.  In some cases it may be important to evaluate metric values alongside metrics in related resources.  For example, it may be important to evaluate CPU and memory utilization of a worker server alongside the depth of a separately monitored Azure Queue or a particular control value in a separately monitored SQL database.  Linked metrics allow for a holistic approach to monitoring: resources do not have to be monitored in silos.  Learn more about Linked Metrics here.
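
As a rough illustration of what a time-based aggregate does, the sketch below averages numeric samples over a trailing time window.  It is not CloudMonix's implementation or syntax; the function name window_average, the (timestamp, value) sample format, and the 10-minute window are all assumptions made for this example.

```python
from datetime import datetime, timedelta

def window_average(samples, now, window):
    """Average the values of (timestamp, value) samples that fall inside
    the trailing `window` ending at `now`. Returns None if none match."""
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)

# Hypothetical CPU samples: (minutes ago, CPU %).
now = datetime(2024, 1, 1, 12, 0)
samples = [(now - timedelta(minutes=m), cpu)
           for m, cpu in [(0, 90.0), (5, 70.0), (10, 50.0), (30, 10.0)]]

# Only the three samples from the last 10 minutes are averaged.
avg_10m = window_average(samples, now, timedelta(minutes=10))
```

An alert or action expression could then compare the aggregated value, rather than a single raw sample, against a threshold.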


Highlight on Dashboard option

All metrics tracked by CloudMonix have an option to be highlighted on a dashboard.  By default, all numeric metrics are displayed on dashboards as charts.  Up to three important numeric metrics can be given further prominence on the dashboard by being placed in Gauges.  It is helpful to specify a best/worst range for highlighted numeric metrics, so that they appear properly on gauges.

Non-numeric metrics that are highlighted on the dashboard are displayed as separate tabular views.

Learn more about working with Dashboards here.



Alerts Tab

The Alerts tab allows for management of alerts for a particular resource.  In general, alerts in CloudMonix turn ON when the condition specified in the Expression property of an alert evaluates to true.  Alerts are considered to be OFF when that condition is no longer true.  If an expression returns an error or a non-boolean response, it is ignored.


When an alert is triggered either ON or OFF, it is displayed in a dashboard's Alert view or alongside numeric charts.  When an alert changes status from OFF to ON, or vice versa, a notification is sent to all recipients that are set up to receive notifications, via their preferred delivery method.  Side note: CloudMonix allows users to be very specific when configuring notification delivery filters, thresholds, and mechanisms.  Learn more about notification filtering here.  Learn more about alert customizations here.
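
The ON/OFF behavior described above can be sketched as a small state machine.  This is only an illustration: the function evaluate_alert is hypothetical, and keeping the previous state when a result is ignored is an assumption about what "ignored" means here.

```python
def evaluate_alert(previous_on, expression_result):
    """Return (is_on, notify) for one monitoring cycle.

    A result that is not a real boolean is ignored: the previous state
    is kept and no notification is sent (assumed behavior).
    """
    if not isinstance(expression_result, bool):
        return previous_on, False
    # A notification is sent only when the state actually changes.
    notify = expression_result != previous_on
    return expression_result, notify

state = False
notifications = []
for result in [False, True, True, "n/a", False]:
    state, notify = evaluate_alert(state, result)
    notifications.append(notify)
```

In this run a notification is produced on the second cycle (OFF to ON) and on the last cycle (ON to OFF), but not while the alert simply stays ON.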




Important Alert fields

Alert Severity

Governs alert delivery rules.  It also governs the color of the alert on the dashboard and appears in the title of alert notifications.

Alert Expression

Governs the criteria for when the alert is triggered.  The Expression is a free-form boolean formula that compares metric values against user-specified criteria.  Expressions support full boolean logic and can be very sophisticated or very basic.  Expressions can work with one or many metrics, reference sub-properties of complex metrics, apply math to numeric metrics, perform string-based comparisons, and more.  Learn more about Expressions here.

By Instance Evaluation

Certain resources consist of multiple instances of the same resource.  These are usually load-balanced resources such as Azure Cloud Roles, Azure Availability Sets, and Web Farms.  Such resources are managed by CloudMonix together as a single unit.  However, CloudMonix tracks individual metrics coming from distinct instances separately and is able to trigger alerts based on these metrics either by individual instance or in aggregate.  For example, in some cases it may be important to know when the overall total Web Requests/sec metric exceeds a certain number, regardless of how many web servers are serving the website, while in other cases it may be important to know that an individual server within a particular web farm has high CPU utilization or poor disk performance.
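
The difference between the two evaluation modes can be shown with a short sketch.  The instance names, metric values, and thresholds below are made up for illustration, and the two helper functions are hypothetical rather than part of CloudMonix.

```python
def instances_over_threshold(values_by_instance, threshold):
    """By-instance evaluation: names of instances whose own value
    exceeds the threshold."""
    return sorted(name for name, value in values_by_instance.items()
                  if value > threshold)

def total_over_threshold(values_by_instance, threshold):
    """Aggregate evaluation: True if the sum across all instances
    exceeds the threshold."""
    return sum(values_by_instance.values()) > threshold

# Hypothetical three-server web farm.
cpu = {"web-1": 35.0, "web-2": 96.0, "web-3": 40.0}
requests_per_sec = {"web-1": 120, "web-2": 150, "web-3": 110}

hot_instances = instances_over_threshold(cpu, 90.0)
farm_overloaded = total_over_threshold(requests_per_sec, 350)
```

Here only web-2 trips the per-instance CPU check, while the farm-wide request rate is judged on the combined total of all three servers.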

Sustained Period

Allows for ignoring intermittent spikes in performance conditions.  In some cases it may be important to send an alert only when the condition specified in the Alert Expression is true for a sustained period of time.  For example, CPU utilization is a highly "spiky" metric, and short bursts can usually be ignored.  However, prolonged and sustained high CPU utilization is considered a warning sign that should be alerted on.
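
A minimal sketch of the idea, assuming the sustained period is expressed as a number of consecutive monitoring cycles (the real setting may be a duration instead).  The function name sustained is hypothetical.

```python
def sustained(results, required_cycles):
    """True only if the most recent `required_cycles` expression
    results exist and are all True."""
    if required_cycles < 1:
        raise ValueError("required_cycles must be at least 1")
    if len(results) < required_cycles:
        return False
    return all(results[-required_cycles:])

# Expression results from successive cycles, oldest first.
spiky = [False, True, False, True, True]
steady = [False, True, True, True]

spike_alerts = sustained(spiky, 3)    # brief bursts do not qualify
steady_alerts = sustained(steady, 3)  # three true cycles in a row do
```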



Actions Tab

Automated Actions are available to Ultimate or Trial subscribers only.  The Actions tab allows CloudMonix users to automatically heal and maintain their production environments by configuring commands that are triggered automatically when certain conditions arise.  Actions are triggered in ways similar to Alerts: when an Action's Expression evaluates to true, a request to execute the Action's Command is submitted.

Like metrics, only certain actions are supported by certain resources.  For example, CloudMonix can request its Windows Monitoring agent to execute PowerShell scripts on a Windows server as a result of a triggered action, or it can request execution of a SQL query on a monitored SQL database.

Note that actions can be executed against the resource that triggered the action OR against any other monitored resource in the CloudMonix account.  For example, an outage of a monitored SQL database (resource 1) can trigger an IIS pool recycle on an IIS server resource (resource 2).  Actions that operate across resources, together with action Expressions that evaluate metrics from different resources, allow for sophisticated failure-prevention and healing scenarios in distributed environments.  Learn more about actions here.  Learn more about resource-specific actions here.


When an Action is requested, executed, or failed, it is shown on the dashboard alongside numeric charts (similar to how Alerts are displayed) and also in its own separate view.  In each of these cases, a notification is sent to all recipients that are set up to receive notifications, via their preferred delivery method.  Side note: CloudMonix allows users to be very specific when configuring notification delivery filters, thresholds, and mechanisms.  Learn more about notifications here.



Important Action fields

Action Severity

Governs notification delivery rules.  It also governs the color of the action on the dashboard and appears in the title of action notifications.

Action Expression

Governs the criteria for when the action is triggered.  The Expression is a free-form boolean formula that compares metric values against user-specified criteria.  Expressions support full boolean logic and can be very sophisticated or very basic.  Expressions can work with one or many metrics, reference sub-properties of complex metrics, apply math to numeric metrics, perform string-based comparisons, and more.  Learn more about Expressions here.

By Instance Evaluation

This option works the same way as it does for Alerts.  Certain resources consist of multiple instances of the same resource.  These are usually load-balanced resources such as Azure Cloud Roles, Azure Availability Sets, and Web Farms.  Such resources are managed by CloudMonix together as a single unit.  However, CloudMonix tracks individual metrics coming from distinct instances separately and is able to trigger actions based on these metrics either by individual instance or in aggregate.  For example, in some cases it may be important to trigger a reboot action on a specific server that is misbehaving.  In that case, By Instance Evaluation is appropriate.  In other cases, it may be important to trigger a particular job elsewhere when the combined load on all of the servers in a farm exceeds normal levels.  Always use Aggregate Metrics when evaluating the Expression for an action that works across all instances.

Execute On (target)

Actions can be executed against Self (the resource that triggered the action), or against another resource monitored in the CloudMonix account.  Learn more about Actions here.  Learn more about what resources can execute what actions here.

Suspended Period

This option is very important for actions.  Unless it is specified, CloudMonix will try to execute the same Action over and over again, every minute, for as long as the Action's expression evaluates to true.  Action execution and its effect can take some time.  It may be important to prevent CloudMonix from executing subsequent similar actions until the Suspended (cooldown) period has elapsed.
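
The effect of a Suspended Period can be sketched as a simple cooldown check.  This is an illustration under stated assumptions, not CloudMonix's code: the function should_execute, the 10-minute cooldown, and the one-evaluation-per-minute loop are all chosen for the example.

```python
from datetime import datetime, timedelta

def should_execute(expression_true, last_executed, now, suspended_period):
    """Decide whether to request the action in this cycle."""
    if not expression_true:
        return False
    if last_executed is None:
        return True  # never executed before, nothing to wait for
    return now - last_executed >= suspended_period

start = datetime(2024, 1, 1, 9, 0)
cooldown = timedelta(minutes=10)
last = None
executed_at = []

# One evaluation per minute; the expression stays true the whole time.
for minute in range(25):
    now = start + timedelta(minutes=minute)
    if should_execute(True, last, now, cooldown):
        executed_at.append(minute)
        last = now
```

Without the cooldown the action would be requested on all 25 cycles; with it, the action is requested only at minutes 0, 10, and 20.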

Sustained Period

Allows for ignoring intermittent spikes in performance conditions.  In some cases it may be important to execute an action only when the condition specified in the Action Expression is true for a sustained period of time.  For example, CPU utilization is a highly "spiky" metric, and short bursts can usually be ignored.  However, prolonged and sustained high CPU utilization may be a sign of abnormal behavior and may need to trigger a service restart.




Scaling Tabs

Auto-Scaling is available to Ultimate or Trial subscribers only.  CloudMonix's scaling capabilities allow users to dynamically adjust the number of instances/servers according to real-time demand, time of day, or other conditions.  Auto-scaling is currently supported only by Azure Cloud (Web/Worker) Roles.


There are two complementary ways to configure and enforce auto-scaling:

  • Scaling Ranges define minimum/maximum ranges for instance counts that need to be enforced.  For example, an internal company website with active usage times between 9am and 9pm may need one "Daytime" scaling range with instance counts between 10 and 20, and a "Nighttime" scaling range with instance counts between 2 and 5.  A scaling range becomes active when its boolean Expression returns true.  CloudMonix evaluates the scaling range Expressions in order and picks the first one whose Expression returns true.  Such flexibility allows users to build sophisticated scaling scenarios.  For example, extending the Daytime/Nighttime example above, users could add an extra condition to the Nighttime scaling range that evaluates to false if Requests/sec on the website are still too high to go into "night" mode, forcing the resource to remain in Daytime mode.  Since scaling ranges are evaluated from first to last and only the first matching range becomes active, the order of Scaling Ranges is important.  Scaling Ranges can be re-arranged by dragging them via the up/down arrow button.
  • Scaling Adjustments define scale UP or DOWN adjustments that need to happen in order to accommodate fluctuations in real-time demand, within the boundaries of the active Scaling Range.  Extending the Daytime/Nighttime example further, Scaling Adjustments can be configured to increase the instance count (scale UP) when the load on a single server is too great, to accommodate unpredictable load on the website.  Like Scaling Ranges, Adjustment Expressions are evaluated from first to last until one evaluates to true.  Since the order of Scaling Adjustments is important, they can be re-arranged by dragging them via the up/down arrow button.

Learn more about auto-scaling here.
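
The first-match rule for Scaling Ranges can be sketched as follows, using the Daytime/Nighttime example.  The ScalingRange class, the active_range function, the metric names, the 50 requests/sec cutoff, and the use of Daytime as a catch-all are all assumptions made for illustration; real Expressions are entered in CloudMonix's own expression syntax, which is stood in for here by Python functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ScalingRange:
    name: str
    expression: Callable[[dict], bool]  # stands in for a boolean Expression
    minimum: int
    maximum: int

def active_range(ranges, metrics) -> Optional[ScalingRange]:
    """Return the first range, in list order, whose expression is true."""
    for candidate in ranges:
        if candidate.expression(metrics):
            return candidate
    return None

def is_quiet_night(m):
    # Night hours (9pm to 9am) count only if traffic has dropped.
    return (m["hour"] >= 21 or m["hour"] < 9) and m["requests_per_sec"] < 50

ranges = [
    ScalingRange("Nighttime", is_quiet_night, 2, 5),
    ScalingRange("Daytime", lambda m: True, 10, 20),  # fallback range
]

quiet_night = active_range(ranges, {"hour": 23, "requests_per_sec": 12})
busy_night = active_range(ranges, {"hour": 23, "requests_per_sec": 400})
```

Because order matters, swapping the two entries would make the catch-all Daytime range match every time and Nighttime would never become active.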


Important Scale-Range fields

Severity

Governs notification delivery rules.  It also governs the color of the scale action on the dashboard and appears in the title of scale action notifications.

Expression

Governs the criteria for when the scaling range is triggered.  The Expression is a free-form boolean formula that compares metric values against user-specified criteria.  Expressions support full boolean logic and can be very sophisticated or very basic.  Expressions can work with one or many metrics, reference sub-properties of complex metrics, apply math to numeric metrics, perform string-based comparisons, and more.  Learn more about Expressions here.

Minimum Instance Quantity

Specifies the smallest number of instances that CloudMonix will enforce when a particular range is active.  When the range becomes active, the number of instances is raised to the specified Minimum Instance Quantity if CloudMonix detects that fewer instances exist.  If 0 is specified, CloudMonix will not enforce a lower boundary on the range.

Maximum Instance Quantity

Specifies the largest number of instances that CloudMonix will allow for the monitored resource when a particular range is active.  When the range becomes active, the number of instances is reduced to the specified Maximum Instance Quantity if CloudMonix detects that more instances exist.  If 0 is specified, CloudMonix will not enforce an upper boundary on the range.
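
Taken together, the two quantities behave like a clamp in which 0 disables one side.  The sketch below illustrates that rule; the function name enforce_range is hypothetical.

```python
def enforce_range(current, minimum, maximum):
    """Return the instance count after applying the range limits.
    A limit of 0 means that side of the range is not enforced."""
    if minimum > 0 and current < minimum:
        return minimum
    if maximum > 0 and current > maximum:
        return maximum
    return current

raised = enforce_range(3, 10, 20)      # below the minimum
lowered = enforce_range(25, 10, 20)    # above the maximum
unchanged = enforce_range(15, 10, 20)  # already inside the range
no_floor = enforce_range(1, 0, 20)     # 0 disables the lower bound
no_cap = enforce_range(50, 10, 0)      # 0 disables the upper bound
```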



Important Scale-Adjustment fields

Severity

Governs notification delivery rules.  It also governs the color of the scale action on the dashboard and appears in the title of scale action notifications.

Expression

Governs the criteria for when the scaling adjustment is triggered.  The Expression is a free-form boolean formula that compares metric values against user-specified criteria.  Expressions support full boolean logic and can be very sophisticated or very basic.  Expressions can work with one or many metrics, reference sub-properties of complex metrics, apply math to numeric metrics, perform string-based comparisons, and more.  Learn more about Expressions here.

Scale Action & Adjustment

When a particular scaling adjustment is activated, its action and actual adjustment are executed only if they fit within the boundaries of the active Scaling Range.
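
One reading of this rule is that an adjustment is skipped when its result would fall outside the active range.  Whether CloudMonix skips or clips such an adjustment is not stated here, so the sketch below is an assumption, and the function apply_adjustment is hypothetical.

```python
def apply_adjustment(current, delta, minimum, maximum):
    """Return the new instance count, or the unchanged count if the
    adjusted value would fall outside [minimum, maximum]."""
    proposed = current + delta
    if minimum <= proposed <= maximum:
        return proposed
    return current

scaled_up = apply_adjustment(12, 2, 10, 20)      # 14 fits the range
blocked_up = apply_adjustment(19, 2, 10, 20)     # 21 would exceed the maximum
blocked_down = apply_adjustment(10, -1, 10, 20)  # 9 would drop below the minimum
```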

Sustained Period

Allows for ignoring intermittent spikes in performance conditions.  In some cases it may be important to react only when the condition specified in the Expression holds for a sustained period of time.  For example, CPU utilization is a highly "spiky" metric, and short bursts can usually be ignored.  However, prolonged and sustained high CPU utilization may be a sign of high load that requires scaling up.




Test Tab


The Test button, alongside the Test tab, allows users to simulate a monitoring cycle on the resource.  Such simulated cycles do not change any data or execute any actions; they serve primarily to validate the ability to connect, gather metrics, and evaluate Expressions.  When testing resources that CloudMonix can reach only via an agent, test results may not always show a "green" pass, since it takes a small amount of time for configuration changes to propagate to the deployed Agent.