Overview

This article covers the following topics:

  • common use cases where CloudMonix can help with monitoring and automation

  • what happens in the monitoring cycle

  • what is needed to connect to and monitor Azure Site Recovery

  • what metrics can be tracked, visualized and monitored

  • what automated actions can be executed by CloudMonix


Why use CloudMonix for Azure Site Recovery?

Common uses of CloudMonix for Azure Site Recovery include:

  • Tracking and alerting on failed and suspended replication jobs.

  • Monitoring replication events.

  • Tracking replication and backup usage.

Configuration

Azure Site Recovery monitoring can be configured either via the Setup Wizard or by using the “Add New” button in the dashboard. Using the Setup Wizard is highly recommended when configuring permissions for the first time, as it simplifies authorization. Learn more about authorizing with the Setup Wizard here.


When configuring Azure Site Recovery monitoring, select the Resource Group and Resource Name from the dropdowns in the configuration dialog.


Metrics

Every diagnostic data point that CloudMonix retrieves from the monitored resource is considered a metric in CloudMonix. Refer to the Metrics article to learn more about metrics in general.


CloudMonix retrieves metrics for Azure Site Recovery via the Azure Management API.


CloudMonix provides a default template for monitoring Azure Site Recovery that contains the most useful Metrics and Alerts.


Metrics can be added, removed, and customized on the Metrics tab of the resource configuration dialog.


Built-in Metrics 


AzureSiteRecoveryBackupMetric

Tracks the value of a single backup metric.

  • Data Type: long

  • Requires selecting a Metric to track

  • Included in default profile: yes, tracked as metrics called BackupItemCount, SizeGRS, and SizeLRS

  • Included in default alerts: no


AzureSiteRecoveryBackupUsage

Tracks the current Backup Vault Usage.

  • Data Type: array of objects with the following properties:

    • Name (string)

    • Value (long)

    • Limit (long)

    • Unit (string)

    • NextResetTime (string)

    • QuotaPeriod (string)

  • Can be accessed only through aggregation using Expressions described in the Working with Expressions article.

  • Included in default profile: yes, tracked as a metric called BackupUsage

  • Included in default alerts: no
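
The shape of this metric and the need to aggregate it can be illustrated with a minimal Python sketch. This is not CloudMonix expression syntax (see the Working with Expressions article for that); the `UsageEntry` class, the helper functions, and the sample entries are hypothetical and only mirror the property list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UsageEntry:
    """One element of the BackupUsage array (fields mirror the list above)."""
    name: str
    value: int
    limit: int
    unit: str
    next_reset_time: str
    quota_period: str


def percent_of_limit(entries: List[UsageEntry], name: str) -> Optional[float]:
    """Return usage as a percentage of its limit, or None if it cannot be computed."""
    for entry in entries:
        if entry.name == name:
            if entry.limit <= 0:  # assumption: a non-positive limit means "no limit"
                return None
            return 100.0 * entry.value / entry.limit
    return None


def total_value(entries: List[UsageEntry]) -> int:
    """Collapse the whole array into a single number by summing Value."""
    return sum(entry.value for entry in entries)


# Hypothetical sample data, not real Azure output.
sample = [
    UsageEntry("ExampleUsageA", 40, 200, "Count", "", "P1D"),
    UsageEntry("ExampleUsageB", 5, -1, "Count", "", "P1D"),
]

print(percent_of_limit(sample, "ExampleUsageA"))
print(total_value(sample))
```

Because the metric is an array rather than a single number, it has to be reduced to a scalar (a sum, a lookup by Name, or a ratio of Value to Limit) before it can be charted or alerted on.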


AzureSiteRecoveryReplicationEvents

Tracks the count of replication events.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called ReplicationEvents

  • Included in default alerts: no


AzureSiteRecoveryReplicationFailedJobs

Tracks the count of failed replication jobs.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called FailedJobCount

  • Included in default alerts: yes:

    • Failed Replication Jobs Detected (Warning): Raises an alert when there is at least 1 failed replication job reported by Azure


AzureSiteRecoveryReplicationInProgressJobs

Tracks the count of replication jobs in progress.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called InProgressJobCount

  • Included in default alerts: no


AzureSiteRecoveryReplicationSuspendedJobs

Tracks the count of suspended replication jobs.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called SuspendedJobCount

  • Included in default alerts: no


AzureSiteRecoveryReplicationUnHealthyProviders

Tracks the count of unhealthy providers.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called UnhealthyProviderCount

  • Included in default alerts: yes:

    • Unhealthy Providers Detected (Warning): Raises an alert when there is at least 1 unhealthy provider reported by Azure


AzureSiteRecoveryReplicationUnHealthyVmCount

Tracks the count of unhealthy VMs.

  • Data Type: int

  • Included in default profile: yes, tracked as a metric called UnhealthyVmCount

  • Included in default alerts: yes:

    • Unhealthy VMs Detected (Warning): Raises an alert when there is at least 1 unhealthy VM reported by Azure


AzureSiteRecoveryReplicationVaultUsages

Tracks the current counts of replication jobs and replicated items.

  • Data Type: array of objects with the following properties:

    • Name (string)

    • Value (int)

  • Can be accessed only through aggregation using Expressions described in the Working with Expressions article.

  • Included in default profile: yes, tracked as a metric called ReplicationUsage

  • Included in default alerts: no
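
As a rough illustration of aggregating this Name/Value array, the Python sketch below looks up one entry by name and sums all entries. It is not CloudMonix expression syntax; the helper names and the sample entry names are hypothetical.

```python
from typing import Dict, List, Union

Entry = Dict[str, Union[str, int]]


def value_by_name(entries: List[Entry], name: str, default: int = 0) -> int:
    """Return the Value of the first entry whose Name matches, else the default."""
    for entry in entries:
        if entry["Name"] == name:
            return int(entry["Value"])
    return default


def sum_values(entries: List[Entry]) -> int:
    """Reduce the array to one number by summing every Value."""
    return sum(int(entry["Value"]) for entry in entries)


# Hypothetical sample data, not real Azure output.
sample_usage = [
    {"Name": "ExampleJobs", "Value": 3},
    {"Name": "ExampleItems", "Value": 12},
]

print(value_by_name(sample_usage, "ExampleItems"))
print(sum_values(sample_usage))
```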

Alerts


Users can create alerts based on changes in any value tracked by CloudMonix, including custom metrics. Each resource template includes alerts suited to that resource type. The predefined alerts for Azure Site Recovery are listed in the Metrics section. Refer to the Alerts article to learn more about alerts in general.
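
The logic of the three predefined alerts can be sketched in Python as follows. This only illustrates the "at least 1" conditions described in the Metrics section; it is not how CloudMonix defines or evaluates alerts internally, and the `triggered_alerts` helper and the sample readings are hypothetical.

```python
from typing import Dict, List, Tuple

# (alert title, metric name, minimum value that raises the alert)
# Titles and metric names are taken from the Metrics section; all three
# alerts have Warning severity.
DEFAULT_ALERTS: List[Tuple[str, str, int]] = [
    ("Failed Replication Jobs Detected", "FailedJobCount", 1),
    ("Unhealthy Providers Detected", "UnhealthyProviderCount", 1),
    ("Unhealthy VMs Detected", "UnhealthyVmCount", 1),
]


def triggered_alerts(metrics: Dict[str, int]) -> List[str]:
    """Return the titles of alerts whose metric meets or exceeds its threshold."""
    return [
        title
        for title, metric, threshold in DEFAULT_ALERTS
        if metrics.get(metric, 0) >= threshold
    ]


# Hypothetical metric readings for one monitoring cycle.
readings = {"FailedJobCount": 2, "UnhealthyProviderCount": 0, "UnhealthyVmCount": 1}
print(triggered_alerts(readings))
```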


Alerts are available only during the Trial period and in the Professional and Ultimate plans.