One of CloudMonix's core premises is to let users script automatic responses to known issues in their environments, or run scripts on a schedule. CloudMonix uses an expression evaluation engine to let users specify how problems are detected and, through the Actions interface, how they should be resolved, either when those conditions occur or on a schedule. Common usage patterns include rebooting a server when it is running low on RAM, restarting a failed service or task, recycling the IIS app pools of a misbehaving website, and rebooting a server or running a SQL index update on a daily basis. IT professionals are often already aware of the common problems in their environments and of their solutions. CloudMonix detects such problems using Metrics and Expressions and resolves them automatically using Actions.
Learn more about Metrics here and about Expressions here. Automated actions that run on a schedule must be associated with Schedules. Learn more about defining Schedules here.
An action is executed against a monitored resource in the environment when the action's expression stays TRUE for a sustained amount of time. CloudMonix knows the resources it tracks and which actions can be set up for each resource type. The list of supported actions for each resource is given in the resource-specific topics here.
Actions are defined on the Actions tab of a particular resource. If a default configuration template was used during initial setup, which is likely, some disabled actions may already be predefined.
When creating new actions or changing existing ones, the following configuration settings must be provided so that CloudMonix can evaluate and execute the actions.
|Name||The action's name is important: it identifies the action when users or integration endpoints are notified.|
|Enabled||Actions can be disabled when they are no longer needed but their configuration details should be preserved for future use.|
|Severity||Severity identifies the importance of an action. It serves as a visual indicator of the action on the dashboard and is shown in notifications sent to users or integration endpoints.|
|Evaluate by Instance||This setting applies to, and is visible only on, "multi-instance" resources, i.e. resources that aggregate a number of instances under them, such as Azure Cloud Roles. When this setting is true, CloudMonix evaluates metric-based conditions per instance. When it is false, CloudMonix averages metrics across all instances before evaluating them. For some actions it is important to know when a particular instance within a resource misbehaves; for others, when all instances across the resource are impacted. For Actions, this checkbox, if present, is expected to be ON, since most actions for Cloud Services apply to instances.|
|Expression||Expressions are the core component of Actions. They define the condition under which an action is executed. Expressions are boolean statements with C#-like syntax that must evaluate to either TRUE or FALSE. More information about expressions can be found here.|
|Sustained Period||Expressed in minutes, the sustained period is how long the action's condition must remain true before CloudMonix executes the action. It ensures that actions do not trigger erroneously on highly chaotic or frequently changing metrics and conditions. When a sustained period is configured, CloudMonix evaluates the expression during every monitoring cycle and confirms that it remains true before finally executing the action.|
|Suspended Period||Once an action is executed, time is usually needed for it to take effect. CloudMonix evaluates the condition for action execution every minute, so without a way to throttle execution, action requests would be sent every minute. The Suspended Period tells CloudMonix to hold off executing further actions for the specified amount of time after the first one was executed.|
|Execute On||An action is executed on some resource. Typically the resource that "owns" the definition of the action executes the action on itself, in which case the target is "Self". Actions can also be executed on other resources in the monitored environment. For example, it may be useful to start a failover Runbook in an Azure Automation resource when an outage occurs in an Azure Cloud Service, or to post to a REST API when a SQL Azure database contains invalid data, as a means of data correction.|
|Execute Command||The command is the core logic that an action executes on the monitored resource. Each resource type supports different commands, and the set of available commands also depends on how the resource is monitored. For example, Azure VMs and Cloud Services monitored via the CloudMonix Agent can execute PowerShell commands as Actions, while those monitored with the Azure Diagnostics Extension cannot.|
|Additional Command Context||Certain actions or commands require "scripts" to execute. For example, PowerShell-based Actions execute PowerShell scripts, API-posting Actions send JSON or XML payloads, and SQL-command Actions execute SQL scripts. The content of these scripts is stored in the Script Library so that it can be shared with other commands in other resources. Learn more about the Script Library here.|
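The difference between the two modes of the Evaluate by Instance setting can be illustrated with a small Python sketch. This is not CloudMonix code: the function name, the instance names, and the single-metric condition are all hypothetical, and the sketch only assumes the behavior described in the table (per-instance evaluation when the setting is on, averaging across instances when it is off).

```python
from statistics import mean
from typing import Callable, Dict, List


def instances_matching(
    instance_metrics: Dict[str, float],
    condition: Callable[[float], bool],
    evaluate_by_instance: bool,
) -> List[str]:
    """Return the names of the instances an action would target.

    instance_metrics maps a (hypothetical) instance name to one metric value.
    """
    if evaluate_by_instance:
        # Per-instance mode: test each instance's own value.
        return [name for name, value in instance_metrics.items() if condition(value)]
    # Aggregate mode: average across all instances, then test once.
    average = mean(list(instance_metrics.values()))
    return list(instance_metrics) if condition(average) else []


# Hypothetical CPU readings for three instances of one resource.
cpu_percent = {"web_0": 95.0, "web_1": 40.0, "web_2": 45.0}


def is_overloaded(value: float) -> bool:
    return value > 90.0


print(instances_matching(cpu_percent, is_overloaded, True))   # ['web_0']
print(instances_matching(cpu_percent, is_overloaded, False))  # [] (average is 60)
```

With per-instance evaluation the single overloaded instance is caught; with averaging, the two healthy instances mask it.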
As mentioned above, commands are requested when action expressions remain TRUE for the specified sustained period. When an action is requested and executed, a notification is published to the people and integration API endpoints configured on the Notification Management screen. Learn more about notification management here.
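How the Sustained Period and Suspended Period settings interact can be sketched as a small gate that is consulted once per monitoring cycle. The class and field names below are hypothetical, and the exact semantics are assumptions made for illustration (for instance, that the sustained window is not restarted after an execution); the sketch is not the CloudMonix implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionGate:
    sustained_minutes: float
    suspended_minutes: float
    true_since: Optional[float] = None     # minute the expression first turned TRUE
    last_executed: Optional[float] = None  # minute of the most recent execution

    def should_execute(self, expression_is_true: bool, now: float) -> bool:
        """Decide, for one monitoring cycle, whether to request the action."""
        if not expression_is_true:
            # The condition broke, so the sustained window starts over.
            self.true_since = None
            return False
        if self.true_since is None:
            self.true_since = now
        if now - self.true_since < self.sustained_minutes:
            return False  # not TRUE for long enough yet
        if (self.last_executed is not None
                and now - self.last_executed < self.suspended_minutes):
            return False  # still suspended after the previous execution
        self.last_executed = now
        return True


# Expression stays TRUE for minutes 0..10, evaluated once per minute.
gate = ActionGate(sustained_minutes=3, suspended_minutes=5)
fired_at = [minute for minute in range(11) if gate.should_execute(True, minute)]
print(fired_at)  # [3, 8]
```

In this run the action is first requested after three sustained minutes, then held back for five minutes before it is requested again, instead of being requested on every cycle.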
All enabled actions for a resource are evaluated during every monitoring cycle for potential execution. The evaluation logic does not stop after one of the actions is found to be TRUE.
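As a rough illustration of this rule, the following Python sketch evaluates every enabled action in a cycle and collects all whose expression is TRUE, rather than stopping at the first match. The `Action` type, the metric names, and the action names are hypothetical and stand in for whatever a resource actually defines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    name: str
    enabled: bool
    expression: Callable[[Dict[str, float]], bool]


def actions_to_request(actions: List[Action], metrics: Dict[str, float]) -> List[str]:
    """Return the names of all enabled actions whose expression is TRUE."""
    triggered = []
    for action in actions:
        if not action.enabled:
            continue  # disabled actions keep their configuration but are skipped
        if action.expression(metrics):
            triggered.append(action.name)  # no early exit: keep evaluating the rest
    return triggered


actions = [
    Action("Reboot on low RAM", True, lambda m: m["free_ram_mb"] < 200),
    Action("Restart failed service", True, lambda m: m["service_running"] == 0),
    Action("Recycle app pool", False, lambda m: True),
]
metrics = {"free_ram_mb": 150, "service_running": 0}
print(actions_to_request(actions, metrics))
# ['Reboot on low RAM', 'Restart failed service']
```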