Overview


This article explains how to use CloudMonix to monitor and automate Azure Windows Virtual Machines. To learn about monitoring Azure Linux Virtual Machines, refer to the dedicated article.


The article covers the following topics:

  • common use cases where CloudMonix can help with monitoring and automation

  • what are monitoring agents, how to select and configure one

  • what happens during the monitoring cycle

  • what is needed to connect to and monitor an Azure Windows VM

  • what metrics CloudMonix tracks, visualizes and monitors

  • what automated actions can be executed by CloudMonix

CloudMonix can monitor both Classic and ARM/v2 VMs; however, there are important differences when it comes to configuration, authorization, and certain metrics that can be tracked.

Refer to the article about differences between ARM and Classic APIs for more details.

Monitoring of Azure Windows VMs can be done through either the Azure Diagnostics API (default choice) or the downloadable CloudMonix Agent.


Why use CloudMonix for Azure Windows VMs?


Popular usages of CloudMonix include the following examples:


  • Track a large number of predefined metrics, or get any other data via PowerShell scripts (requires CloudMonix Agent)

  • Make sure that the VM is up

  • Make sure services are running

  • Retrieve data from event logs

  • Restart Azure VMs on a schedule, e.g. every day, every week, or once per month (requires CloudMonix Agent)

  • Shut down Azure VMs on a schedule in order to minimize costs when traffic is low (requires CloudMonix Agent)

  • Automatically adjust the number of instances or scale the VM based on actual demand or on a schedule (requires CloudMonix Agent)

  • Automatically clean up temporary files when disk space is running low (requires CloudMonix Agent)


Monitoring using Azure Diagnostics vs. CloudMonix Agent


CloudMonix can monitor Azure Windows VMs using either Microsoft’s own Azure Diagnostics Extension or the downloadable CloudMonix Agent. Azure Diagnostics Extension is used by default. CloudMonix Agent supports tracking all of the metrics available under Azure Diagnostics Extension as well as additional data. CloudMonix Agent also supports PowerShell-based automation. The complete list of metrics and automated actions supported by the agent is presented in this article in the Metrics and Actions and Automation sections.


Refer to the What are the differences between Azure Diagnostics Extension and CloudMonix agent? and How do I add CloudMonix agent to Azure Windows Virtual Machines? articles for more information about agents and their installation.



Monitoring Cycle


During each monitoring cycle, CloudMonix attempts to retrieve data from either Diagnostics Extension or CloudMonix Agent running on the monitored VM in order to track the configured metrics.


Both Azure Diagnostics and CloudMonix Agent send their monitored data to shared storage: the Azure Diagnostics storage account in the former case, and CloudMonix blob storage in the latter. CloudMonix retrieves its diagnostic data from this storage. No direct connections are made to the monitored Windows servers. The list of URLs that the agent needs to communicate with is specified in this article.


After each monitoring cycle is completed, CloudMonix will evaluate and possibly execute any automated actions.
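The cycle described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function name, the in-memory dictionary standing in for the shared storage, and the shape of the action conditions are all hypothetical and are not CloudMonix APIs.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the shared storage (Azure Diagnostics storage
# account or CloudMonix blob storage) that the agent writes to.
Storage = Dict[str, float]


def run_monitoring_cycle(
    storage: Storage,
    configured_metrics: List[str],
    actions: Dict[str, Callable[[Dict[str, float]], bool]],
) -> List[str]:
    """Read configured metrics from storage, then evaluate actions.

    Returns the names of actions whose condition held in this cycle.
    """
    # Step 1: retrieve data from storage only; the VM is never contacted.
    metrics = {name: storage[name] for name in configured_metrics if name in storage}

    # Step 2: once the cycle's data is collected, evaluate each action.
    return [name for name, condition in actions.items() if condition(metrics)]


# Hypothetical sample data and action conditions.
storage = {"CPUTime": 85.0, "MemoryFree": 2048.0}
actions = {
    "RebootOnHighCpu": lambda m: m.get("CPUTime", 0.0) > 70,
    "CleanupOnLowMemory": lambda m: m.get("MemoryFree", float("inf")) < 100,
}
triggered = run_monitoring_cycle(storage, ["CPUTime", "MemoryFree"], actions)
print(triggered)
```

The sketch ignores Sustained and Suspended periods, which are covered in the Actions and Automation section.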

Configuration


Azure Windows VMs monitoring can be configured either via Setup Wizard or using the “Add New” button in the dashboard. It’s recommended to use Setup Wizard when configuring permissions for the first time. The Setup Wizard makes certificate and ARM authorization simple and VMs will be automatically visible in the portal. Learn more about authorizing with Setup Wizard here.


ARM authorization can be performed only through the Setup Wizard. The Wizard authorizes CloudMonix to access VMs by creating a new user in Azure AD with Contributor privileges. Contributor privileges can later be downgraded to Reader privileges, but it is important to understand the implications first. Refer to the Can I downgrade CloudMonix principal from Contributor to Reader for ARM authorization? article to learn more.



The templates suitable for the type of resource (both default and previously defined custom templates) can be selected from the Defaults dropdowns.


Setup Wizard should not be used if CloudMonix Agent is intended to be installed on the monitored VMs. If ARM authorization is still needed so that CloudMonix agents can properly register themselves with the Azure context, run through the Setup Wizard, authorize the ARM API, and skip adding any Azure Windows VMs.


When switching between agents in the CloudMonix portal, it’s also necessary to make sure the correct agent is running on the monitored VM. CloudMonix won’t install and configure the agent automatically, so it won’t be able to collect data. Refer to the How do I add CloudMonix agent to Azure Windows Virtual Machines? article to learn more.


When using ARM authorization, it is sometimes necessary to tell the Wizard which Azure Storage account the extension should use for storing diagnostic data.


 

Metrics


Every diagnostic data point that CloudMonix retrieves from the monitored resource is considered a metric in CloudMonix. Refer to the Metrics article to learn more about metrics in general.


CloudMonix provides default templates with popular and useful metrics, alerts and actions recommended for the common configurations. Templates can be applied in the Configuration Template dropdown in the Resource definition dialog.


For Azure VMs there are three default templates:

  • Sample configuration for basic Windows Azure VM,

  • Sample configuration for IIS Server on Azure VM

  • Sample configuration for SQL Server on Azure VM.


The templates for IIS Server and SQL Server extend the basic template with additional IIS or SQL-specific metrics and alerts.



The metrics can be added, removed and customized in the Metrics tab of the Azure Windows VM resource configuration dialog.


Built-in Azure Windows VMs Metrics


ResourceStatus

Identifies the last state of the monitored VM. This is a critical metric that is captured for most types of resources that CloudMonix tracks. It is used for Uptime reports and should not be removed.

  • Available for Classic and ARM VMs.

  • Data Type: string

  • Possible values: Ready, Down, in Azure-aware mode also: Stopped, Unknown

  • Included in sample profile: yes, in all profiles tracked as a metric called Status

  • Included in default alerts: yes, in all profiles:

    • Resource Outage: Status == "Down"

Raises an alert when the monitored server is reported as not Ready by Azure, or when no metrics come through from the diagnostic agents, for a sustained period of time.


This metric is calculated differently depending on whether the Azure VM Diagnostics Extension agent or the CloudMonix agent is used (refer to the What are the differences between Azure Diagnostics Extension and CloudMonix agent? article to learn more). It is also calculated differently when CloudMonix is connected to the Azure context (i.e. when the certificate or ARM API Token credential is set up properly for the VM).



When CloudMonix is Azure-aware then statuses are determined according to the following rules:


  • Ready - Azure Management API reports instance as Ready and there is some diagnostic data found.  

  • Unknown - Azure Management API reports instance as Ready but there is no diagnostic data found, indicating a possible outage with the instance.  

  • Unknown - Azure Management API reports the instance as Unknown, Busy or other non-Ready and non-Shutdown status but there is diagnostic data found.

  • Stopped - Azure Management API reports the instance as Stopped and Deallocated.  

  • Down -  Azure Management API reports instance as non-Ready and there is no diagnostic data found.


When CloudMonix is NOT Azure-aware then statuses are determined according to the following rules:


  • Ready - there is agent data found during a monitoring cycle.  

  • Unknown - there is no agent data found during a monitoring cycle
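The rules above can be read as a small decision function. The following Python sketch is one possible interpretation, not CloudMonix's actual implementation. The literal state names "Ready" and "StoppedDeallocated", the use of None for the non-Azure-aware mode, and the order in which the rules are checked are assumptions made for the example.

```python
from typing import Optional


def resource_status(azure_state: Optional[str], has_diagnostic_data: bool) -> str:
    """Derive the ResourceStatus value from the rules listed above.

    azure_state is the state reported by the Azure Management API, or None
    when CloudMonix is not Azure-aware. The literal state names "Ready" and
    "StoppedDeallocated" are assumptions made for this sketch.
    """
    if azure_state is None:
        # Not Azure-aware: only the presence of agent data matters.
        return "Ready" if has_diagnostic_data else "Unknown"

    if azure_state == "StoppedDeallocated":
        # Checked first so that a deallocated VM is not reported as Down.
        return "Stopped"
    if azure_state == "Ready":
        return "Ready" if has_diagnostic_data else "Unknown"
    # Any other non-Ready state: data present means Unknown, otherwise Down.
    return "Unknown" if has_diagnostic_data else "Down"


print(resource_status("Ready", False))
print(resource_status(None, True))
```

Checking the Stopped case before the generic non-Ready case is a design choice of the sketch: without it, a deallocated VM with no diagnostic data would fall through to Down.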



WindowsEventLogEntry


Tracks entries from the Windows Event Log.

  • Available for Classic and ARM VMs.

  • Tracked by Azure VM Diagnostics Extension agent and CloudMonix agent.

  • Data Type: object with the following properties:

    • EventId (int): ID of the Event Log entry

    • MachineName (string): host name of the server that generated the error

    • Message (string): actual message of the event

    • Source (string): application/service that generated the event

    • UserName (string): user under whose credentials the log entry was generated

    • EntryType (string): Information, Warning or Error

    • Timestamp (datetime): local time when the log entry was generated

  • Can be accessed only through aggregation using Expressions described in the Working with Expressions article in Evaluating data in sets\arrays (advanced) section.

  • Included in sample profiles: yes, in all sample profiles tracked as metrics:

    • ApplicationEventLogs: Tracks entries from the Windows Event Log (Application source)

    • SystemEventLogs: Tracks entries from the Windows Event Log (System source)

  • Included in default alerts: no
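Because this metric returns a set of entries rather than a single value, it has to be aggregated before it can be compared with a threshold. The Python sketch below illustrates the idea only; it does not use CloudMonix expression syntax, which is described in the Working with Expressions article. The field names mirror the properties listed above, while the helper count_errors and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class EventLogEntry:
    # Field names mirror the properties listed above.
    EventId: int
    MachineName: str
    Message: str
    Source: str
    UserName: str
    EntryType: str  # "Information", "Warning" or "Error"
    Timestamp: datetime


def count_errors(entries: List[EventLogEntry], source: str) -> int:
    """Count Error entries produced by the given source."""
    return sum(1 for e in entries if e.EntryType == "Error" and e.Source == source)


# Hypothetical sample entries, invented for this illustration.
entries = [
    EventLogEntry(1000, "vm-01", "Application crashed", "MyApp", "SYSTEM",
                  "Error", datetime(2024, 1, 1, 12, 0)),
    EventLogEntry(1001, "vm-01", "Cache warmed", "MyApp", "SYSTEM",
                  "Information", datetime(2024, 1, 1, 12, 1)),
    EventLogEntry(7031, "vm-01", "Service terminated", "Service Control Manager",
                  "SYSTEM", "Error", datetime(2024, 1, 1, 12, 2)),
]
error_count = count_errors(entries, "MyApp")
print(error_count)
```

An alert could then compare the aggregated count with a threshold, for example raising when it is greater than zero.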


WindowsPerformanceCounter

Windows Performance Counter is one of the most popular metric types. Windows OS and applications running on it publish a large number of performance counters that highlight various aspects of performance indicators, health, uptime, etc. In order to learn more about the most popular counters refer to the Monitor Windows Server with Performance Counters article. The Performance Counter class documentation explains how to consume and define custom counters, should there be a need for CloudMonix to track user-generated diagnostic data.

 

CloudMonix can track any published performance counter. Each performance counter that CloudMonix should track must be defined as an individual metric in the Resource Configuration dialog.


  • Available for Classic and ARM VMs.

  • Tracked by Azure VM Diagnostics Extension agent and CloudMonix agent. Tracked only in the Azure-aware mode.

  • Data Type: double

  • Included in sample profiles: yes

Performance Counter Metrics included in all sample profiles:


    • CPUTime: Processor(_Total)\% Processor Time

    • DiskFreeSpaceTotal: LogicalDisk(_Total)\Free Megabytes

    • DiskIdleTime: PhysicalDisk(_Total)\% Idle Time

    • DiskReadSpeed: PhysicalDisk(_Total)\Avg. Disk sec/Read

    • DiskWriteSpeed: PhysicalDisk(_Total)\Avg. Disk sec/Write

    • MemoryCommittedPct: Memory\% Committed Bytes In Use

    • MemoryFree: Memory\Available MBytes

Metrics included in the Sample configuration for IIS Server on Azure VM template:


    • AspNetApplicationRestarts: ASP.NET\Application Restarts

    • AspNetBytesOut: ASP.NET Applications(__Total__)\Request Bytes Out Total

    • AspNetErrors: ASP.NET Applications(__Total__)\Errors Total/Sec

    • AspNetRequests: ASP.NET Applications(__Total__)\Requests/Sec

    • AspNetRequestsQueued: ASP.NET\Requests Queued

    • AspNetRequestsRejected: ASP.NET\Requests Rejected

    • AspNetRequestWaitTime: ASP.NET\Request Wait Time

Metrics included in the Sample configuration for SQL Server on Azure VM template:


    • SqlBatchRequests: SQLServer:SQL Statistics\Batch Requests/sec

    • SqlBlockedProcesses: SQLServer:General Statistics\Processes blocked

    • SqlBufferHitRatio: SQLServer:Buffer Manager\Buffer cache hit ratio

    • SqlCompilations: SQLServer:SQL Statistics\SQL Compilations/sec

    • SqlLockWaits: SQLServer:Locks(_Total)\Lock Waits/sec

    • SqlPageLifeExpectancy: SQLServer:Buffer Manager\Page life expectancy

    • SqlPageSplits: SQLServer:Access Methods\Page Splits/sec

    • SqlRecompilations: SQLServer:SQL Statistics\SQL Re-Compilations/sec

    • SqlUserConnections: SQLServer:General Statistics\User Connections

  • Included in default alerts: yes

 Alerts included in all sample profiles:


    • High CPU: CpuTime > 70

Raises an alert when CPU utilization stays over 70%, sustained for the last 5 minutes.


    • Low Memory: MemoryFree < 100

Raises an alert if the amount of available physical memory stays below 100 MB, sustained for the last 2 monitoring cycles.


    • Slow Disk: DiskReadSpeed > 0.025 || DiskWriteSpeed > 0.025 || DiskIdleTime < 20

Raises an alert if the average disk read or write time exceeds 25 milliseconds or if the disk is idle for less than 20% of the time, sustained for 5 minutes. For mission-critical servers, disk read and write times should not exceed 10 milliseconds.

Alerts included in the Sample configuration for IIS Server on Azure VM template:


    • Requests are Queueing Up: AspNetRequestsQueued > 10

Raises an alert when the number of queued requests exceeds 10, sustained for 5 minutes. Queued requests indicate that IIS or backend processes are not able to process the requests quickly enough.

Alerts included in the Sample configuration for SQL Server on Azure VM template:


    • Blocked Processes Detected: SqlBlockedProcesses > 0

Raises an alert when any blocked processes have been detected and are present for at least 5 minutes sustained.


    • High Compilations: (SqlCompilations/SqlBatchRequests) > 0.1 || (SqlRecompilations/SqlCompilations) > 0.1

Raises an alert when the total number of compilations per second is over 10% of overall batch requests, or when the total number of re-compilations is greater than 10% of overall compilations, sustained for 5 minutes.


    • High Page Splits: (SqlPageSplits/SqlBatchRequests) > 0.2

Raises an alert when page splits exceed 20% of overall batch requests, sustained for 5 minutes.


    • Lock Waits Detected: SqlLockWaits > 0

Raises an alert when lock waits are detected and are sustained for 5 minutes.


    • Low Buffer Hit Ratio: SqlBufferHitRatio < 90

Raises an alert when Buffer Cache Hit Ratio is under 90, sustained for 5 minutes. When better performance is needed, the minimum acceptable value is 95. Generally, a lower value indicates a memory problem.


    • Low Page Life Expectancy: SqlPageLifeExpectancy < 300

Raises an alert when page life expectancy drops below 5 minutes (300 seconds) sustained for 5 minutes.
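To make the alert conditions above concrete, the following Python sketch evaluates two of them, Slow Disk and High Compilations, against a single set of sample values. It is an illustration only: the sample numbers are invented, the sustained-period requirement is not modelled, and treating a zero denominator as a ratio of 0 is an assumption of the sketch (the source does not say how CloudMonix handles that case).

```python
from typing import Dict


def ratio(numerator: float, denominator: float) -> float:
    """Divide, treating a zero denominator as a ratio of 0 (an assumption)."""
    return numerator / denominator if denominator else 0.0


def slow_disk(m: Dict[str, float]) -> bool:
    # DiskReadSpeed and DiskWriteSpeed are in seconds, so 0.025 is 25 ms.
    return (
        m["DiskReadSpeed"] > 0.025
        or m["DiskWriteSpeed"] > 0.025
        or m["DiskIdleTime"] < 20
    )


def high_compilations(m: Dict[str, float]) -> bool:
    return (
        ratio(m["SqlCompilations"], m["SqlBatchRequests"]) > 0.1
        or ratio(m["SqlRecompilations"], m["SqlCompilations"]) > 0.1
    )


# Hypothetical values for one monitoring cycle.
sample = {
    "DiskReadSpeed": 0.004,
    "DiskWriteSpeed": 0.031,
    "DiskIdleTime": 65.0,
    "SqlBatchRequests": 500.0,
    "SqlCompilations": 40.0,
    "SqlRecompilations": 2.0,
}
print(slow_disk(sample), high_compilations(sample))
```

With these values the write latency of 31 ms trips Slow Disk, while compilations at 8% of batch requests and re-compilations at 5% of compilations stay under the 10% thresholds.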


WindowsPerformanceCounterMultiInstance

Performance counter categories can be either single-instance or multi-instance. A single-instance category has only one machine-wide value for each counter (e.g. the System category in Windows). A multi-instance category can have an unlimited number of values for each counter (e.g. the Process category, which has a counter for each process, or DiskFreeSpace, which has a counter for each disk).


The WindowsPerformanceCounterMultiInstance metric is similar to WindowsPerformanceCounter, but it tracks values for all instances of a counter. It returns an array of PerformanceCounterInstance objects, one for each counter instance.


  • Available for Classic and ARM VMs.

  • Tracked by Azure VM Diagnostics Extension agent and CloudMonix agent. Tracked only in the Azure-aware mode.

  • Data Type: array of objects with the following properties:

    • Instance (string): instance name

    • Value (double): counter value for the given instance

  • Can be accessed only through aggregation using Expressions described in the Working with Expressions article in Evaluating data in sets\arrays (advanced) section

  • Included in sample profile: yes, in all profiles tracked as a metric:

    • DiskFreeSpace: LogicalDisk\Free Megabytes

  • Included in default alerts: yes, in all profiles:

    • Low Disk Space: checking if DiskFreeSpace value is below 1024 MB

Raises an alert when any of the disks has less than 1GB of free space left
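The Low Disk Space check amounts to asking whether any instance in the array falls below the threshold. A minimal Python sketch of that aggregation follows; it is not CloudMonix expression syntax. The property names Instance and Value come from the list above, while the helper low_disk_space and the sample drive values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PerformanceCounterInstance:
    Instance: str   # instance name, e.g. a drive letter
    Value: float    # counter value, here free space in MB


def low_disk_space(
    disks: List[PerformanceCounterInstance], threshold_mb: float = 1024
) -> List[str]:
    """Return names of instances whose free space is below the threshold."""
    return [d.Instance for d in disks if d.Value < threshold_mb]


# Hypothetical sample values for the DiskFreeSpace metric.
disk_free_space = [
    PerformanceCounterInstance("C:", 512.0),
    PerformanceCounterInstance("D:", 20480.0),
]
low = low_disk_space(disk_free_space)
print(low)        # instances below the threshold
print(bool(low))  # whether the alert condition holds
```

Returning the offending instance names, rather than a bare boolean, makes it easy to say which disk triggered the condition.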


AzureVirtualMachineState

Specifies the current status of a role instance. Possible values for this metric are listed in this MSDN article in the RoleInstanceList section.


  • Available for Classic and ARM VMs.

  • Tracked by Azure VM Diagnostics Extension and CloudMonix agent. Tracked only in the Azure-aware mode.

  • Data Type: string

  • Possible values: CreatingVM, StartingVM, StoppingVM, StoppedVM, DeletingVM, FailedStartingVM, CreatingRole, StartingRole, ReadyRole, BusyRole, StoppingRole, RestartingRole, CyclingRole, FailedStartingRole, UnresponsiveRole, StoppedDeallocated, Preparing, Unknown.

  • Included in sample profile: no

  • Included in default alerts: no


AzureVirtualMachineDetails

Tracks metrics related to instance statuses and disks based on information exposed by Azure in the VirtualMachineExtensionInstanceView object.


  • Available for ARM VMs.

  • Tracked by Azure VM Diagnostics Extension and CloudMonix agent. Tracked only in the Azure-aware mode.

  • Data Type: array of objects with the following properties:

    • Provider (string): possible values Instance_instanceName, Disk_diskName, Extension_extName

    • Code (string): status code

    • DisplayStatus (string): short localizable label for status

    • Level (string): possible values Info, Warning, Error

    • Message (string): Optional message used by Azure for storing alerts and error messages.

    • Time (DateTime?): Time when the information was obtained.

  • Included in sample profile: no

  • Included in default alerts: no


ScheduledTaskLastRunInMinutes

Tracks the number of minutes since a particular Windows scheduled task last executed.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: decimal

  • Included in sample profile: no

  • Included in default alerts: no


ScheduledTaskLastStatus

Tracks the last status of a particular Windows scheduled task. Status of 0 indicates a successful run.  For a list of all possible statuses, refer to the Windows Task Scheduler article.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: int

  • Included in sample profile: no

  • Included in default alerts: no


WindowsProcessList

Tracks a list of currently running processes.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: array of objects with the following properties:

    • Name (string): Windows process name

    • IsResponding (boolean): indicator if the process is able to respond or is hung

    • MemorySize (decimal): memory allocated to a particular process in bytes

    • Cpu (decimal): CPU utilization allocated to a particular process in %

  • Can be accessed only through aggregation using Expressions described in the Working with Expressions article in Evaluating data in sets\arrays (advanced) section

  • Included in sample profile: yes, tracked as a metric called ProcessList

  • Included in default alerts: no
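As with other array-valued metrics, the process list is useful only after aggregation. The Python sketch below shows two aggregations one might want: finding hung processes and totalling the memory of all processes with a given name. The property names follow the list above; the helper functions and sample processes are hypothetical, and this is not CloudMonix expression syntax.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessInfo:
    Name: str
    IsResponding: bool
    MemorySize: float  # bytes
    Cpu: float         # percent


def hung_processes(processes: List[ProcessInfo]) -> List[str]:
    """Return names of processes that are not responding."""
    return [p.Name for p in processes if not p.IsResponding]


def memory_mb(processes: List[ProcessInfo], name: str) -> float:
    """Total memory, in MB, of all processes with the given name."""
    total_bytes = sum(p.MemorySize for p in processes if p.Name == name)
    return total_bytes / (1024 * 1024)


# Hypothetical sample data.
processes = [
    ProcessInfo("w3wp", True, 300 * 1024 * 1024, 12.5),
    ProcessInfo("w3wp", True, 300 * 1024 * 1024, 7.0),
    ProcessInfo("backup", False, 50 * 1024 * 1024, 0.0),
]
print(hung_processes(processes))
print(memory_mb(processes, "w3wp"))
```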


WindowsServiceState

Metric that tracks the last known status of a particular Windows service.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: string

  • Possible values: Running, Stopped, Stopping, Paused, Pausing.

  • Included in sample profile: no

  • Included in default alerts: no


WindowsUpdatesDrivers

Tracks available Windows Driver Updates. Used for ensuring all important updates are installed regularly. CloudMonix limits the time spent retrieving these metrics to 10 seconds and will not retrieve them more often than every 6 hours.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: WindowsUpdatesSoftware list, with the following properties:

    • Title - (string)

    • Url - (string)

    • Mandatory - (bool)

    • Priority - (string)

    • Date - (DateTime)

  • Included in sample profile: no

  • Included in default alerts: no


WindowsUpdatesSoftware

Tracks available Windows Updates. Used for ensuring all important updates are installed regularly. CloudMonix limits the time spent retrieving these metrics to 10 seconds and will not retrieve them more often than every 6 hours.


  • Available for Classic and ARM VMs.

  • Tracked by CloudMonix agent.

  • Data Type: WindowsUpdatesSoftware list, with the following properties:

    • Title - (string)

    • Url - (string)

    • Mandatory - (bool)

    • Priority - (string)

    • Date - (DateTime)

  • Included in sample profile: no

  • Included in default alerts: no



Alerts

Users can create alerts based on changes in any value tracked by CloudMonix (including custom metrics). Each resource template includes alerts which are suitable for a given resource. The predefined alerts for Azure Windows VMs are listed in the Metrics section.


Alerts are available during the Trial period or in Professional and Ultimate plans only.


Refer to the Alerts article to learn more.



Actions and Automation

Automation features (Actions) allow users to set up powerful reactive, proactive and scheduled actions. CloudMonix can execute an automated action when a specific monitoring condition occurs or according to a schedule.


As a general rule, every new action and auto-scaling rule should specify appropriate Suspended period and Sustained period values. Those settings can be adjusted in the Action and Auto-scaling rule forms.



The Suspended period defines how long the action will be ignored after it has been executed. If that setting is zero, the action can be executed in each monitoring cycle (i.e. every 1 min. in Professional and Ultimate plans), provided the condition for activation is fulfilled. In most scenarios that is not reasonable, because actions are executed asynchronously and it often takes a while before an action’s effect is noticeable.


The Sustained period defines how long the condition must be satisfied before the action is executed. That prevents executing actions based on temporary changes in the tracked values, which might resolve themselves automatically.
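The interplay of the two periods can be simulated with a small Python sketch. It is a simplified model under stated assumptions, not CloudMonix's implementation: the class ActionGate is hypothetical, time is a plain minute counter, one evaluation happens per minute, and the sustained clock is not reset after the action fires.

```python
from typing import Optional


class ActionGate:
    """Decide when an action may fire, given Sustained and Suspended periods.

    Times are plain minute counters to keep the sketch deterministic.
    """

    def __init__(self, sustained_min: int, suspended_min: int) -> None:
        self.sustained_min = sustained_min
        self.suspended_min = suspended_min
        self.condition_since: Optional[int] = None  # minute the condition first held
        self.last_fired: Optional[int] = None       # minute the action last executed

    def evaluate(self, now_min: int, condition_met: bool) -> bool:
        if not condition_met:
            self.condition_since = None  # a temporary change resets the clock
            return False
        if self.condition_since is None:
            self.condition_since = now_min
        sustained = now_min - self.condition_since >= self.sustained_min
        suspended = (
            self.last_fired is not None
            and now_min - self.last_fired < self.suspended_min
        )
        if sustained and not suspended:
            self.last_fired = now_min
            return True
        return False


# The condition holds for a full hour; evaluate once per minute.
gate = ActionGate(sustained_min=5, suspended_min=30)
fired_at = [minute for minute in range(60) if gate.evaluate(minute, True)]
print(fired_at)
```

With a 5-minute Sustained period and a 30-minute Suspended period, the action in this model fires at minute 5 and again at minute 35, instead of on every cycle in between.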


Actions are available during the Trial period or in the Ultimate plan only.


Sample usages:


Built-in Azure Windows VMs Actions


AzureVMInstanceReboot

CloudMonix will request Azure to reboot the VM. This is the equivalent of restarting a VM from the Azure portal.


If Availability Sets are configured, rebooting a VM will cause no downtime. The restarted VM is removed from the load balancer and then stopped. It is not added back immediately after restarting; there is a configured delay.


  • Available for Classic and ARM VMs.

  • Supported by Azure Diagnostics Extensions and CloudMonix agents. Requires running in the Azure-aware mode.


AzureVmInstanceStartup

CloudMonix will request that a particular VM is started. This is the equivalent of starting a VM from the Azure portal.


  • Available for Classic and ARM VMs.

  • Supported by Azure Diagnostics Extensions and CloudMonix agents.  Requires running in the Azure-aware mode.


AzureVmInstanceShutdown

CloudMonix will request Azure to shut down the VM. This is the equivalent of shutting down a VM from the Azure portal. Shutting down a VM without deallocating it will NOT lower costs: Azure doesn’t charge for deallocated resources, but it still charges for resources that are only shut down.


  • Available for Classic and ARM VMs.

  • Supported by Azure Diagnostics Extensions and CloudMonix agents. Requires running in the Azure-aware mode.


AzureVmInstanceShutdownDeallocate

CloudMonix will request that a particular VM is shut down and deallocated (i.e. resources are released). Deallocating VMs helps to lower costs, as Azure doesn’t charge for deallocated resources.


  • Available for Classic and ARM VMs.

  • Supported by Azure Diagnostics Extensions and CloudMonix agents. Requires running in the Azure-aware mode.


AzureVmInstanceResize

CloudMonix will request Azure to resize the VM. Resizing VMs on a scheduled basis or based on metrics such as current demand helps to lower costs.


  • Available for Classic and ARM VMs.

  • Supported by Azure Diagnostics Extensions and CloudMonix agents. Requires running in the Azure-aware mode.


PowershellReboot


CloudMonix will reboot the target VM. The predefined PowerShell script executes the command: Restart-Computer.


  • Supported by CloudMonix agent.

  • Requires running in the Azure-aware mode.


PowershellRestartService


CloudMonix will restart a specified Windows Service on the target VM. The predefined PowerShell script executes the command: Restart-Service serviceName.


  • Supported by CloudMonix agent.

  • Requires running in the Azure-aware mode.


CustomPowershellScript


CloudMonix will execute on the target VM the custom PowerShell script specified in the action definition. That action is especially useful when used in combination with other CloudMonix features, such as metrics or schedules.


  • Supported by CloudMonix agent.

  • Requires running in the Azure-aware mode.


Sample usages: