CloudMonix Monitoring and Automation


Azure WebApps



This article covers monitoring and automating Azure WebApps, Webjobs, and WebApp Backups with CloudMonix. To learn more about the general features of CloudMonix, read the other articles in this section.

This article answers a number of questions:

  • what is needed to connect to and monitor an Azure WebApp

  • what happens during the monitoring cycle

  • what specifically can CloudMonix track, visualize and monitor for Azure WebApps and WebJobs

  • what automation does CloudMonix support for Azure WebApps and WebJobs


Connectivity

The easiest way to connect CloudMonix to your Azure WebApps is to run the Setup Wizard. Alternatively, the “Add New” functionality on the dashboard or the Resource Management screen can be used.


Important: even if your WebApps were deployed through the new portal, CloudMonix uses the Classic API with Azure certificate authentication to connect to and monitor Azure WebApps.

Besides the Azure management certificate used by the Classic API, no other special authorization details are required.




Monitoring Cycle

During each monitoring cycle, CloudMonix attempts to connect to the monitored WebApp and its App Plan and retrieve data for the metrics configured on the Metrics tab.

All WebApp-specific metrics are retrieved by CloudMonix via the Azure Management API, using the previously specified Azure management certificate.

Important1: Certain metrics (specifically, those of type AzureAppPlanMetric) can only be retrieved from Azure if the Application Plan behind the monitored WebApp is either Basic or Standard. This is because Azure does not provide any Plan-level data for WebApps under Free or Shared plans. Furthermore, Azure does not allow CloudMonix to change instance counts (auto-scale) for WebApps under Free or Shared plans.

Important2: Azure often does not return metric/diagnostic data for WebApps that have no active traffic. In such a case, CloudMonix will report Missing values on the dashboards and ignore any alert/scale expressions that use the Missing metrics.
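
As a rough illustration of the behaviour described above, the following Python sketch skips an alert expression whenever one of the metrics it references is Missing. The names MISSING, evaluate_expression, and the metric dictionaries are hypothetical and are not part of CloudMonix; they only model the rule "a Missing metric means the expression is ignored".

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical marker for a metric that Azure returned no data for.
MISSING = None

Metrics = Dict[str, Optional[float]]


def evaluate_expression(
    metrics: Metrics,
    required: Iterable[str],
    expression: Callable[[Metrics], bool],
) -> Optional[bool]:
    """Return the expression result, or None when the expression is ignored."""
    # If any referenced metric is Missing, the expression is not evaluated.
    if any(metrics.get(name) is MISSING for name in required):
        return None
    return expression(metrics)


def has_errors(m: Metrics) -> bool:
    return m["Errors"] > 0


# Usage: an idle WebApp returned no Requests value, so the alert is skipped.
idle = {"Requests": MISSING, "Errors": 0.0}
busy = {"Requests": 120.0, "Errors": 3.0}
print(evaluate_expression(idle, ["Requests", "Errors"], has_errors))  # None
print(evaluate_expression(busy, ["Requests", "Errors"], has_errors))  # True
```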

By default, CloudMonix will attempt to connect to the Management API endpoints of the WebApp up to 4 times, timing out within 30 seconds, with a delay of 500 ms between connection attempts. The frequency of monitoring cycles, the connection timeout, and the number of connection attempts are configurable on the Advanced tab. The wait period between attempts (if failures occur) is currently hardcoded at 500 ms.
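
The retry behaviour described above can be pictured with a minimal Python sketch. It is not CloudMonix's implementation: fetch_with_retries and its parameters are hypothetical names, the fetch callable stands in for whatever performs the Management API call, and the sketch assumes the 30-second timeout applies to each individual attempt, which the text does not state explicitly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[float], T],
    max_attempts: int = 4,        # default number of connection attempts
    timeout_s: float = 30.0,      # default connection timeout (assumed per attempt)
    retry_delay_s: float = 0.5,   # fixed 500 ms wait between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(timeout_s) until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(timeout_s)
        except Exception:
            # Re-raise the last failure once all attempts are used up.
            if attempt == max_attempts:
                raise
            sleep(retry_delay_s)
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter exists only so the delay can be replaced in tests; a real caller would leave it at its default.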


Metrics

When using the provided sample monitoring profile, a number of metrics are already predefined. Adding, removing, and customizing metrics can be done on the Metrics tab of the configuration dialog.

The type of a metric is determined by its Category.




ResourceStatus

This is a critical metric that is captured for most types of resources that CloudMonix tracks.

When CloudMonix detects an “Exceeded” usage status or a “Degraded” or “Not Available” runtime availability status from the monitored WebApp, it will mark the ResourceStatus as Down. Otherwise, the ResourceStatus is Ready.

In addition to alerts and automation, this metric is used to drive the Uptime reports.

  • Data type: string

  • Possible values: Ready, Down

  • Desired state: Ready

  • Included in sample profile: Yes, tracked as metric named Status

  • Included in default alerts: Yes, as alert named Resource Outage
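
The status rule described above can be expressed as a small Python sketch. The function name resource_status and the "Normal" input value are assumptions made for illustration; only the "Exceeded", "Degraded", and "Not Available" values and the Ready/Down results come from this article.

```python
# Status values that mark the resource as Down, as described in this article.
DOWN_USAGE_STATES = {"Exceeded"}
DOWN_AVAILABILITY_STATES = {"Degraded", "Not Available"}


def resource_status(usage_state: str, availability_state: str) -> str:
    """Map the WebApp's usage and availability states to Ready or Down."""
    if usage_state in DOWN_USAGE_STATES:
        return "Down"
    if availability_state in DOWN_AVAILABILITY_STATES:
        return "Down"
    return "Ready"


# "Normal" is a hypothetical healthy value used only for this example.
print(resource_status("Normal", "Normal"))         # Ready
print(resource_status("Exceeded", "Normal"))       # Down
print(resource_status("Normal", "Not Available"))  # Down
```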


AzureAppPlanMetric\CPU Percentage

Tracks overall CPU percentage utilization across all VMs that are hosting the entire Application Plan under which the monitored WebApp is located.

  • Data Type: double

  • Possible values: between 0% and 100%

  • Desired state: Under 70%

  • Included in sample profile: Yes, tracked as metric named AppPlanCpu

  • Included in default alerts: Yes, it is a part of High CPU Utilization in App Plan alert

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureAppPlanMetric\Memory Percentage

Tracks overall Memory percentage utilization across all VMs that are hosting the entire Application Plan under which the monitored WebApp is located.

  • Data Type: double

  • Possible values: between 0% and 100%

  • Desired state: Under 80%

  • Included in sample profile: Yes, tracked as metric named AppPlanMemory

  • Included in default alerts: No

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureAppPlanMetric\Disk Queue Length

Tracks the overall number of disk operations that are queued and not yet executing across all VMs that are hosting the entire Application Plan under which the monitored WebApp is located, collected at one-minute intervals.

  • Data Type: double

  • Possible values: 0 or greater

  • Desired state: The smaller the better

  • Included in sample profile: No

  • Included in default alerts: No

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureAppPlanMetric\Http Queue Length

Tracks the overall number of queued Http requests across all VMs in the entire Application Plan under which the monitored WebApp is located, collected at one-minute intervals. The presence of any queued requests generally indicates that the WebApps cannot keep up with incoming requests.

  • Data Type: double

  • Possible values: between 0 and (5000 * numberOfVMs)

  • Desired state: 0

  • Included in sample profile: Yes, tracked as metric named AppPlanHttpQueue

  • Included in default alerts: No

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureAppPlanMetric\Data In

Tracks the overall number of bytes incoming to all VMs in the entire Application Plan under which the monitored WebApp is located, collected at one-minute intervals.

  • Data Type: double

  • Units: bytes

  • Possible values: 0 or greater

  • Desired state: depends on usage

  • Included in sample profile: No

  • Included in default alerts: No

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureAppPlanMetric\Data Out

Tracks the overall number of bytes outgoing from all VMs in the entire Application Plan under which the monitored WebApp is located, collected at one-minute intervals.

  • Data Type: double

  • Units: bytes

  • Possible values: 0 or greater

  • Desired state: depends on usage

  • Included in sample profile: No

  • Included in default alerts: No

  • This metric is only available for WebApps in Basic or Standard App Plans


AzureWebsiteMetric\*

The AzureWebsiteMetric category can track a number of different metrics, collected at one-minute intervals, for a specific WebApp only. The list of these metrics keeps growing as Microsoft exposes more data through its API. You can also monitor these metrics in the Azure Portal. CloudMonix, however, places no limit on how many metrics can be visualized, allows derivation and aggregation of this data, and supports alerting and auto-scaling based on any of these metrics.




AzureWebsiteMetric\Requests

Metric tracks the number of requests made against the monitored WebApp in the last minute.

  • Data Type: double

  • Possible values: 0 or greater

  • Desired state: depends on usage

  • Included in sample profile: Yes, tracked as the metric named Requests

  • Included in default alerts: No


AzureWebsiteMetric\Http*

Metrics track number of Http requests returning a specific Http code that were made against the monitored WebApp in the last minute (corresponds to Http Client Errors, Http Successes, Http Redirects, and other Http 400-error codes in Azure Management Portal).

  • Data Type: double

  • Possible values: 0 or greater

  • Desired state (Http 2xx metric): depends on usage

  • Desired state (all others): 0

  • Included in sample profile: No

  • Included in default alerts: No


AzureWebsiteMetric\Http Server Errors

Metric tracks the number of requests that were made against the monitored WebApp in the last minute and resulted in an unhandled exception. These are Error 500s. The contents of the actual errors are not exposed by the Azure API and need to be logged, via code, to a logging store of the user’s choice.

  • Data Type: double

  • Possible values: 0 or greater

  • Desired state: 0

  • Included in sample profile: Yes, tracked as the metric named Errors

  • Included in default alerts: Yes, but in a disabled alert Errors Detected


AzureWebsiteMetric\Average Response Time

Metric tracks the average response time of the website hosted by the WebApp.

  • Data Type: double

  • Units: milliseconds

  • Possible values: 0 or greater

  • Desired state: 0

  • Included in sample profile: Yes, tracked as the metric named ResponseTime

  • Included in default alerts: Yes, but in a disabled alert Slow Response


AzureWebsiteMetric\CPU Time

Metric tracks the amount of CPU time used by the WebApp in the last minute. More precisely, if the CPU was fully busy for 5 seconds out of the full minute, the value of CPU Time would be 5000.

  • Data Type: double

  • Units: milliseconds

  • Possible values: between 0 and (60000 multiplied by number of instances/workers)

  • Desired state: 0

  • Included in sample profile: Yes, tracked as the metric named CpuTime

  • Included in default alerts: No
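
To make the arithmetic above concrete, the following Python sketch converts a CPU Time reading into a utilization percentage, using the upper bound of 60000 ms per instance per minute given in this section. The function name cpu_time_to_percent is hypothetical; the conversion itself is ordinary arithmetic and not a CloudMonix feature.

```python
def cpu_time_to_percent(cpu_time_ms: float, instance_count: int = 1) -> float:
    """Express one minute's CPU Time as a percentage of the available CPU time."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    # One minute offers 60000 ms of CPU time per instance/worker.
    available_ms = 60_000 * instance_count
    return 100.0 * cpu_time_ms / available_ms


# 5 seconds of CPU work in one minute on a single instance:
print(round(cpu_time_to_percent(5000), 2))     # 8.33
# The same CPU Time spread over two instances:
print(round(cpu_time_to_percent(5000, 2), 2))  # 4.17
```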


AzureWebsiteMetric\DataIn & DataOut

Metrics track ingress and egress in bytes for the monitored WebApp.

  • Data Type: double

  • Units: bytes

  • Possible values: 0 or greater

  • Desired state: depends on usage

  • Included in sample profile: Yes, tracked as the metrics named DataIn and DataOut respectively

  • Included in default alerts: No


AzureWebsiteMetric\MemoryWorkingSet

Metric tracks the amount of memory, in bytes, used by the WebApp during the last minute.

  • Data Type: double

  • Units: bytes

  • Possible values: 0 or greater

  • Desired state: depends on usage, but lower is better

  • Included in sample profile: Yes, tracked as the metric named MemorySet

  • Included in default alerts: No


AzureWebsiteResponseCode

Metric tracks the HTTP response from a URL that is hosted by the monitored WebApp.


AzureWebsiteTriggeredWebjobLastDuration

Metric tracks the amount of time that the monitored triggered Webjob took the last time it ran. Capturing or alerting on this metric may be useful for certain triggered Webjobs that get “stuck” and do not error out.

  • Data Type: double

  • Units: milliseconds

  • Possible values: 0 or greater

  • Desired state: depends on the job

  • Included in sample profile: No

  • Included in default alerts: No


AzureWebsiteTriggeredWebjobLastEndedMinutesAgo

Metric tracks how many minutes ago the monitored triggered Webjob last finished running.

  • Data Type: double

  • Units: minutes

  • Possible values: 0 or greater

  • Desired state: depends on the job

  • Included in sample profile: No

  • Included in default alerts: No

AzureWebsiteTriggeredWebjobLastStartedMinutesAgo

Metric tracks how many minutes ago the monitored triggered Webjob last started running.

  • Data Type: double

  • Units: minutes

  • Possible values: 0 or greater

  • Desired state: depends on the job

  • Included in sample profile: No

  • Included in default alerts: No


AzureWebsiteTriggeredWebjobLastStatus

Metric tracks the last known status of a specific monitored triggered Webjob

  • Data Type: string

  • Possible values: Running, Completed, Failed

  • Desired state: Completed or Running

  • Included in sample profile: No

  • Included in default alerts: No


AzureWebsiteWebjobList

Metric tracks all Webjobs within monitored WebApp as a collection

  • Data Type: collection

  • Desired state: Webjobs of type Continuous must be “Running” and Webjobs of type “Triggered” must not be “Failed”

  • Included in sample profile: Yes, as metric named Webjobs

  • Included in default alerts: Yes, under alerts Failed Triggered Webjobs Detected and Non-running Continuous Webjobs Detected
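
The desired state above can be checked with a few lines of Python. This is only a sketch: the dictionary layout (name, type, status), the function name find_problem_webjobs, and the "Stopped" status are assumptions made for the example and do not describe how CloudMonix actually represents the collection.

```python
from typing import Dict, List, Tuple

# Hypothetical shape of one entry in the Webjob collection.
Webjob = Dict[str, str]


def find_problem_webjobs(webjobs: List[Webjob]) -> Tuple[List[str], List[str]]:
    """Return (failed triggered Webjobs, non-running continuous Webjobs) by name."""
    failed_triggered = [
        job["name"]
        for job in webjobs
        if job["type"] == "Triggered" and job["status"] == "Failed"
    ]
    non_running_continuous = [
        job["name"]
        for job in webjobs
        if job["type"] == "Continuous" and job["status"] != "Running"
    ]
    return failed_triggered, non_running_continuous


jobs = [
    {"name": "nightly-report", "type": "Triggered", "status": "Failed"},
    {"name": "cleanup", "type": "Triggered", "status": "Completed"},
    {"name": "queue-worker", "type": "Continuous", "status": "Stopped"},
    {"name": "mailer", "type": "Continuous", "status": "Running"},
]
print(find_problem_webjobs(jobs))  # (['nightly-report'], ['queue-worker'])
```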


AzureWebsiteLastBackup

Metric tracks all Backups and their last known statistics

  • Data Type: collection

  • Desired state: Backups should not have a status of Failed or Partially succeeded

  • Included in sample profile: Yes, as metric named Backups

  • Included in default alerts: Yes, under the alert Backups Failed


ResourceInstanceCount

Tracks number of compute instances (VMs) allocated to current WebApp. Applies to Basic and Standard App Plans only.

  • Data Type: int

  • Possible values: 1 or greater (0 for when the WebApp is Free or Shared)

  • Included in sample profile: No

  • Included in default alerts: No


AggregatedMetric

Aggregated metrics allow for averaging, summing, counting, and performing other math calculations on existing metrics over a period of time. Learn more about Aggregated Metrics here.
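
As an illustration of what aggregating over a period of time means, the Python sketch below averages the samples of one metric that fall inside a trailing time window. The function name average_over_window and the sample format are hypothetical; CloudMonix configures aggregation through its own settings rather than through code like this.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Optional, Tuple

# Hypothetical sample format: (time the value was captured, metric value).
Sample = Tuple[datetime, float]


def average_over_window(
    samples: List[Sample], now: datetime, window: timedelta
) -> Optional[float]:
    """Average the values captured within the trailing window, if any."""
    recent = [value for captured, value in samples if now - window <= captured <= now]
    return mean(recent) if recent else None


now = datetime(2024, 1, 1, 12, 0)
response_times = [
    (datetime(2024, 1, 1, 11, 50), 10.0),  # outside the 5-minute window
    (datetime(2024, 1, 1, 11, 57), 20.0),
    (datetime(2024, 1, 1, 11, 59), 40.0),
]
print(average_over_window(response_times, now, timedelta(minutes=5)))  # 30.0
```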


DerivedMetric

It is often necessary to transform existing metrics into new values based on custom calculations. DerivedMetrics make this possible. For example, it may be better to track CpuTime in seconds rather than milliseconds, in which case a new DerivedMetric based on the expression CpuTime/1000 may be needed. Learn more about Derived Metrics here.
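
The CpuTime example above amounts to the following calculation, shown here as a Python sketch. The dictionary of derived metrics and the name CpuTimeSeconds are hypothetical; in CloudMonix the expression would be entered in the metric's configuration rather than written as Python.

```python
from typing import Callable, Dict

# Hypothetical derived metrics, each computed from the captured metric values.
DERIVED: Dict[str, Callable[[Dict[str, float]], float]] = {
    # CpuTime is captured in milliseconds; derive a value in seconds.
    "CpuTimeSeconds": lambda metrics: metrics["CpuTime"] / 1000,
}


def compute_derived(metrics: Dict[str, float]) -> Dict[str, float]:
    """Evaluate every derived metric against the captured values."""
    return {name: formula(metrics) for name, formula in DERIVED.items()}


print(compute_derived({"CpuTime": 5000.0}))  # {'CpuTimeSeconds': 5.0}
```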


LinkedMetric

LinkedMetrics make it possible to evaluate metric values from other resources in the monitored environment alongside the currently configured resource, by linking those metrics to it. Learn more about Linked Metrics here.


Actions, Automation & Auto-scaling

Automation features (Actions) allow users to set up powerful reactive, proactive, and scheduled actions. Learn more about Actions here. Learn more about Auto-Scaling here.

For WebApp resources, CloudMonix is capable of executing a number of actions.


AzureWebsiteRestart

This command restarts the WebApp. Any session state is lost, any requests in progress are terminated. Subsequent hits to the WebApp will endure a slight delay as the WebApp is prepared for first time use. Execution of this command is helpful when the WebApp has run into trouble (perhaps due to memory leaking, or other application-specific issues).

It may be useful to set up scheduled execution of this command during a time when the WebApp has no traffic, in order to keep it “clean”. The Requests metric can help identify when the WebApp has little or no traffic, and a Suspended timeout may be specified so that the command does not run too often.




AzureWebsiteRunWebjob

This command starts a Webjob on demand. This is useful when a job fails and needs to be restarted automatically.


Auto-Scaling

Auto-scaling in CloudMonix is the ability to automatically grow or shrink the number of instances dedicated to an Application Plan. This is known as horizontal auto-scaling (scaling in/out).

Auto-scaling logic in CloudMonix is driven by an internal engine and is in no way connected to Azure’s native auto-scaling. Since only one set of auto-scaling rules should apply to a single Application Plan, it is important to disable Azure’s auto-scaling if CloudMonix is used for this purpose. It is also important to remember that auto-scaling for WebApps works at the Application Plan level. If CloudMonix is monitoring multiple WebApps within the same Application Plan, auto-scaling should only be enabled on one of those WebApps.
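
To give a feel for horizontal scaling at the Application Plan level, here is a minimal Python sketch of a scale in/out decision. It is not CloudMonix's internal engine: the function desired_instance_count, the one-instance step, the 30% scale-in threshold, and the instance limits are all assumptions made for illustration. Only the 70% CPU figure (the desired state of the CPU Percentage metric) and the Free/Shared restriction come from this article.

```python
from typing import Optional


def desired_instance_count(
    current: int,
    app_plan_cpu: Optional[float],   # AppPlanCpu value, or None when Missing
    plan_tier: str,                  # e.g. "Free", "Shared", "Basic", "Standard"
    scale_out_above: float = 70.0,   # taken from the desired state for CPU
    scale_in_below: float = 30.0,    # hypothetical threshold
    min_instances: int = 1,          # hypothetical limit
    max_instances: int = 10,         # hypothetical limit
) -> int:
    """Return the instance count the Application Plan should move to."""
    # Azure does not allow instance counts to change on Free or Shared plans.
    if plan_tier in ("Free", "Shared"):
        return current
    # A Missing metric means the scale expression is ignored.
    if app_plan_cpu is None:
        return current
    if app_plan_cpu > scale_out_above:
        target = current + 1   # scale out by one instance
    elif app_plan_cpu < scale_in_below:
        target = current - 1   # scale in by one instance
    else:
        target = current
    # Keep the result inside the configured limits.
    return max(min_instances, min(max_instances, target))


print(desired_instance_count(2, 85.0, "Standard"))  # 3
print(desired_instance_count(2, 10.0, "Standard"))  # 1
print(desired_instance_count(2, None, "Standard"))  # 2
```

Because the decision is made once per Application Plan, a sketch like this would run against only one of the WebApps that share the plan.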

Learn more about Auto-scaling here.