One of the core features of Netreo Essentials is the ability to script automatic responses to known issues in an environment, or to run scripts on a schedule. Netreo Essentials uses its expression evaluation engine to let users specify how problems are detected and, through the Actions interface, how they should be resolved, either when those conditions occur or on a schedule. Common usage patterns include rebooting a server when it is running low on RAM, restarting a failed service or task, recycling the IIS app pools of a misbehaving website, and rebooting a server or running a SQL index update on a daily basis.
IT professionals are frequently already aware of the common problems in their environments and of the solutions to them. Netreo Essentials detects such problems using Metrics and Expressions and resolves them automatically using Actions. Automated actions that run on a schedule must be associated with Schedules.
(Learn more about defining Schedules here.)
An Action is executed against a monitored resource in the environment when that Action's expression evaluates to TRUE for a sustained amount of time. Netreo Essentials has detailed knowledge of the resources it is tracking and understands which actions can be set up for each resource type. (A list of supported actions by resource is specified in resource-specific topics here.)
Actions are defined on the Actions tab of a particular resource. A default configuration template was likely used during initial setup, so some predefined actions may already be present in a disabled state.
When creating new Actions or changing existing ones, configuration settings need to be provided for Netreo Essentials to effectively evaluate and execute them.
|Name||An action's name is important. It is what is used to identify an action when notifying users or integration endpoints.|
|Enabled||Actions can be disabled when they are no longer needed but their configuration details are still worth preserving for future use.|
|Severity||Severities identify the importance of a particular action, serve as visual indicators of actions on the dashboard, and are communicated via notifications to users or integration endpoints.|
|Evaluate by Instance||This setting applies to, and is visible only on, resources that are "multi-instance" (i.e., resources that aggregate a number of equivalent instances, such as Azure Cloud Roles). When set to true, Netreo Essentials evaluates metric-based conditions on a per-instance level. When set to false, Netreo Essentials averages metrics across all instances before evaluating them. For some actions, it is important to know when a particular instance within a resource misbehaves, while for others it is important to know when all instances of the resource are impacted. For Actions, this checkbox, if present, is expected to be set to ON, as most actions for Cloud Services apply to instances.|
|Expression||Expressions are the core component of Actions. They determine the condition when an Action is executed. Expressions are boolean statements with C#-like syntax that need to evaluate to either TRUE or FALSE. More information about expressions can be found here.|
|Sustained Period||Expressed in minutes, the sustained period is the length of time for which the condition must continuously remain true before Netreo Essentials raises the alert. It ensures that alerts do not trigger erroneously for highly volatile or frequently changing metrics and conditions. When a sustained period is configured, Netreo Essentials evaluates the expression during every monitoring cycle and confirms that it continues to remain true before finally raising the alert.|
|Suspended Period||Once an action is executed, it is assumed that time is needed for it to take effect. Netreo Essentials evaluates the condition of an action every minute, so without a way to throttle execution, action requests would be sent every minute. The Suspended Period tells Netreo Essentials to hold off on executing the action again for the specified amount of time after the first execution.|
|Execute On||An action is meant to be executed on some particular resource. When a resource that "owns" the definition of an action needs to execute that action on itself, the target of the action is "Self." However, actions may also be executed on other resources in the monitored environment. (For example, it may be useful to start a failover runbook in an Azure Automation resource if some outage condition has occurred with an Azure Cloud Service, or to post to a REST API when a SQL Azure database contains some invalid data, as a means of data correction.)|
|Execute Command||The Command of an action is the core logic that the Action executes on the monitored resource. Each type of resource supports different commands. Depending on how that resource is monitored, more or fewer commands may be available. For example, Azure VMs and Cloud Services monitored via a Netreo Essentials Agent can execute PowerShell commands as Actions, while those monitored with the Azure Diagnostics Extension cannot.|
|Additional Command Context||Certain commands may require scripts to execute (PowerShell-based Actions execute PowerShell scripts, API-posting Actions execute JSON or XML payloads, SQL-command Actions execute SQL scripts, etc.). The content of these scripts is stored in the Script Library so that it can be shared across other commands in other resources. Learn more about Script Library here.|
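The difference between the two Evaluate by Instance modes described in the table can be illustrated with a short Python sketch. This is not Netreo Essentials code: the function name, the instance names, and the CPU values are hypothetical, and the only behavior taken from the description above is "evaluate each instance separately" versus "average across all instances, then evaluate."

```python
from statistics import mean

def instances_meeting_condition(metric_by_instance, condition, evaluate_by_instance):
    """Hypothetical helper: return the targets for which the condition holds."""
    if evaluate_by_instance:
        # Per-instance mode: each instance's metric is checked on its own.
        return [name for name, value in metric_by_instance.items() if condition(value)]
    # Averaged mode: one check against the mean of all instances.
    average = mean(metric_by_instance.values())
    return ["<all instances>"] if condition(average) else []

# Assumed sample data: one busy instance, two idle ones.
cpu = {"role_0": 95.0, "role_1": 20.0, "role_2": 35.0}

def high_cpu(value):
    return value > 80

per_instance = instances_meeting_condition(cpu, high_cpu, True)
averaged = instances_meeting_condition(cpu, high_cpu, False)
print(per_instance)  # only the busy instance matches
print(averaged)      # the average (50.0) does not exceed 80, so nothing matches
```

With these sample values, the single misbehaving instance is caught only in per-instance mode, which is why the setting is expected to be ON for most instance-level actions.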
As mentioned above, commands are requested when an alert's expression evaluates to TRUE for a specified sustained amount of time. When an action is requested and executed, a notification is published to various people and integration API endpoints as governed by the Notification Management Screen. Learn more about notification management here.
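How Sustained Period and Suspended Period interact can be sketched as a small state machine in Python. This is a minimal illustration under stated assumptions, not the product's implementation: the class and field names are hypothetical, one `tick` represents one evaluation per minute, and it is assumed that the sustained timer resets only when the expression turns FALSE (not after an execution).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActionState:
    """Hypothetical per-action state; all names here are illustrative."""
    sustained_minutes: int
    suspended_minutes: int
    true_since: Optional[int] = None       # minute the expression first became TRUE
    suspended_until: Optional[int] = None  # minute at which execution is allowed again

    def tick(self, minute: int, expression_is_true: bool) -> bool:
        """Return True if the action would be requested at this minute."""
        if not expression_is_true:
            self.true_since = None  # condition broke: restart the sustained timer
            return False
        if self.true_since is None:
            self.true_since = minute
        if self.suspended_until is not None and minute < self.suspended_until:
            return False  # still inside the suspended period
        if minute - self.true_since >= self.sustained_minutes:
            self.suspended_until = minute + self.suspended_minutes
            return True
        return False

# Assumed scenario: the expression stays TRUE for 12 consecutive minutes.
state = ActionState(sustained_minutes=3, suspended_minutes=5)
fired = [minute for minute in range(12) if state.tick(minute, True)]
print(fired)
```

In this scenario the action is requested at minute 3, once the condition has held for the sustained period, and again at minute 8, once the suspended period has elapsed, rather than every minute in between.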
All enabled actions for a resource are evaluated during every monitoring cycle for potential execution. Evaluation does not stop once one action's expression is found to be TRUE; the remaining enabled actions are still evaluated.
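The evaluation rule above can be sketched in Python as a loop without an early exit. The action names, metric names, and dictionary layout are hypothetical and chosen only for illustration; sustained and suspended periods are deliberately left out to keep the focus on the "evaluate every enabled action" behavior.

```python
def actions_to_request(actions, metrics):
    """Return the names of all enabled actions whose expression is TRUE."""
    requested = []
    for action in actions:
        if not action["enabled"]:
            continue  # disabled actions keep their configuration but are skipped
        if action["expression"](metrics):
            requested.append(action["name"])  # no break: keep checking the rest
    return requested

# Assumed sample configuration for a single resource.
actions = [
    {"name": "Restart service", "enabled": True,
     "expression": lambda m: m["service_running"] == 0},
    {"name": "Reboot server", "enabled": False,
     "expression": lambda m: m["free_ram_mb"] < 256},
    {"name": "Recycle app pool", "enabled": True,
     "expression": lambda m: m["response_ms"] > 2000},
]
metrics = {"service_running": 0, "free_ram_mb": 100, "response_ms": 3500}

result = actions_to_request(actions, metrics)
print(result)
```

Here both enabled actions are selected in the same cycle, while the disabled one is ignored even though its condition would be TRUE.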