Summary

 

Continuous monitoring gives enterprises the opportunity to ensure that reliable, performant services are immediately available to their users. Platforms need to keep up with demand in a way that is seamless to the user. Systems that become unreliable or unusable are quickly disregarded or abandoned. One surefire way to ensure users won't use a service is for the service to be unavailable. To account for this, enterprises look to service level agreements and business continuity strategies. Part of this strategy includes testing for availability.

Azure Application Insights provides features that allow organizations to quickly report and take action on current and trending availability metrics. This article will review the various tests that can be configured within the service. From there we will go into how the data is collected and how it can be used for analysis. We will look into a use case involving monitoring the Dataverse API. Finally, we wrap up by implementing a monitoring strategy to assist with notifications and automation.

Azure Application Insights Availability Tests

 

Azure Application Insights availability tests come in three distinct groupings. The first reaches out to a URL from different points around the world. The second allows for the replay of a recorded user interaction with a site or web-based service. Both of these originate from within the Azure Application Insights feature itself and are created in the Azure Portal or through the Azure APIs.

The final type of test is a completely custom test. This custom test allows flexibility in how, what and where we test. Because of this flexibility, this type of test is ideal and will serve as the test strategy implemented below.

Important Note on web tests:

The web test mechanism has been marked as deprecated. As expected, this announcement has drawn a range of feedback. With this in mind, I recommend against implementing new web tests. If web tests are currently being used, look to migrate to custom tests.

Building Ping Tests

 

URL ping tests in Azure Application Insights allow the service to make a request to a user-specified URL. As documented, this test doesn't actually use ICMP but sends an HTTP request, which allows it to capture response headers, duration timings, etc.

For the Power Platform, this can be useful for testing availability of Power Apps Portals or services utilizing Power Virtual Agents or custom connectors. When configuring the test, conditions can be set for the actual request to the URL. These include the ability to include dependent requests (such as images needed for the webpage) or the ability to retry on an initial failure.

The test frequency can be set to run every five, ten or fifteen minutes from various Azure data centers across the globe. It's recommended to test from no fewer than five locations, as this helps diagnose network and latency issues.

Finally, the referenced documentation states that the optimal configuration for working with alerts is to set the number of test locations equal to the alert location threshold plus two.
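To make the "threshold plus two" guidance concrete, here is a minimal Python sketch. The function names are hypothetical, and the firing rule (alert when at least the threshold number of locations report a failure) is my reading of how the location threshold works, so treat it as an assumption rather than the service's exact logic.

```python
def recommended_location_count(alert_location_threshold: int) -> int:
    """Apply the 'alert threshold plus two' guidance for test locations."""
    if alert_location_threshold < 1:
        raise ValueError("alert_location_threshold must be at least 1")
    return alert_location_threshold + 2


def alert_should_fire(failed_locations: int, alert_location_threshold: int) -> bool:
    """Assumed rule: fire once at least `threshold` locations report a failure."""
    return failed_locations >= alert_location_threshold
```

With an alert threshold of three failing locations, this lands on the five test locations recommended earlier, leaving two locations of headroom before a single regional network issue can trigger the alert.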

Building Custom Tests

 

Continuing down the path of availability tests, the need to expand beyond URL ping tests will eventually come up. Situations such as validating not only uptime but authentication, latency, service health, etc. all could benefit from custom availability tests.

Before building custom tests, let's take a closer look at the Availability table within Azure Application Insights.

The Availability Table

 

The availability table is where our test telemetry will reside, whether it comes from URL ping tests or custom testing. The table is designed to capture whether a test was successful, the duration (typically captured in milliseconds with a stopwatch approach), the location of the test and other properties. I'll review this in depth later in the article, but for now keep in mind that at a minimum we want to capture success, timing and location for each test.

Creating and Deploying an Azure Function(s)

Azure Functions offer an ideal service for hosting our availability tests. Because they are deployable worldwide and resilient, we can quickly modify and publish changes to our tests with minimal effort. Azure Functions also offer a real advantage: the ability to use a timer-based trigger similar to a CRON job (it actually uses NCRONTAB expressions) or an HTTP trigger that allows for ad hoc test runs.


Triggers

 

The image above shows two public entry points, one based on the HttpTrigger and one based on the TimerTrigger. The HttpTrigger is relatively straightforward, allowing for GET and POST messages. The advantage of this type of trigger is flexibility: function requests can be sent from practically any pillar in the Power Platform, such as Power Apps, Power Automate flows or Power Virtual Agents.

The TimerTrigger, on the other hand, is set to run on a predefined interval using the CRON (NCRONTAB) expression schema. What I like about this approach is that we get a well-known interface for scheduling tasks combined with the power of Azure Functions. In the image above the schedule is hard-coded, but it can be made configurable using settings contained within the Azure Function or elsewhere.
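As a small illustration of the schedule format, the Python sketch below labels the six fields of an NCRONTAB expression (which, unlike classic five-field CRON, starts with a seconds field) and reads the schedule from a settings dictionary instead of hard-coding it. The setting name AVAILABILITY_TEST_SCHEDULE and the helper names are hypothetical; the sketch only checks the field count and does not validate the individual field values.

```python
# Field order used by NCRONTAB expressions in Azure Functions timer triggers.
NCRONTAB_FIELDS = ("second", "minute", "hour", "day", "month", "day_of_week")

# Runs at second 0 of every fifth minute.
EVERY_FIVE_MINUTES = "0 */5 * * * *"


def describe_ncrontab(expression: str) -> dict:
    """Label each field of a six-field NCRONTAB expression."""
    parts = expression.split()
    if len(parts) != len(NCRONTAB_FIELDS):
        raise ValueError(
            f"expected {len(NCRONTAB_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(NCRONTAB_FIELDS, parts))


def schedule_from_settings(settings: dict,
                           key: str = "AVAILABILITY_TEST_SCHEDULE",  # hypothetical setting name
                           default: str = EVERY_FIVE_MINUTES) -> str:
    """Read the schedule from settings, falling back to a default."""
    expression = settings.get(key, default)
    describe_ncrontab(expression)  # raises ValueError if the field count is wrong
    return expression
```

Passing a plain five-field CRON string such as "*/5 * * * *" raises a ValueError here, which is a cheap way to catch the most common mistake when moving a schedule over from a traditional CRON job.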

The approach laid out here provides another major advantage. This approach is decoupled from the Power Platform and will not be impacted or hindered by the very platform we are testing for availability!

Dataverse Requests

 

The decision on how best to connect to and report on Dataverse availability depends entirely on the business requirements that need to be met. What I would recommend is to authenticate and perform a command that confirms not only availability but also authorization. Below is an example of requesting an OAuth token and performing two requests, the WhoAmI and RetrieveCurrentOrganization actions.

This helps confirm that the service principal used is valid and can return responses from the Dataverse. In my example, I had two main requirements for working with the Dataverse: connect using a service principal and avoid any SDK dependency. The test must reduce any potential blockers such as license modifications or assembly version lock-in. Again, how you implement this is completely up to you.
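For readers who want to see the shape of an SDK-free request, here is a rough Python sketch that builds (but does not send) a client credentials token request and a WhoAmI request using only the standard library. The token endpoint, the ".default" scope, the v9.2 API version and the function names are assumptions for illustration and are not taken from the sample code; only WhoAmI is shown, with RetrieveCurrentOrganization left out.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str, org_url: str) -> urllib.request.Request:
    """Build an OAuth client credentials token request for a service principal."""
    # Assumed endpoint: the Microsoft identity platform v2.0 token endpoint.
    token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumed scope: the organization URL followed by /.default.
        "scope": f"{org_url.rstrip('/')}/.default",
    }).encode("utf-8")
    return urllib.request.Request(token_url, data=form, method="POST")


def build_whoami_request(org_url: str, access_token: str,
                         api_version: str = "v9.2") -> urllib.request.Request:
    """Build a WhoAmI request against the Dataverse Web API."""
    url = f"{org_url.rstrip('/')}/api/data/{api_version}/WhoAmI"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

Each request can then be sent with urllib.request.urlopen inside the timed portion of the test. Keeping the builders separate from the network call makes them easy to unit test without a live environment.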

Location, Location, Location...

 

Once it's been determined what needs to be tested and how to go about testing it, the questions of where to test from and how to collect metrics still need answers. Again, Azure Functions come to the rescue, allowing developers (DevOps engineers cover your ears) to quickly deploy from Visual Studio to locations around the world. That said, ideally proper CI/CD processes are followed.

Microsoft natively collects latency measurements across regions and continents using ThousandEyes. The Region to Region Latency tool, documented here, is a good reference for the average latency between Azure data centers when performing actions across the Azure network backbone. Alternatively, you can collect latency and bandwidth information using Mark Russinovich's popular tool, PsPing.

Building the Availability Telemetry

 

Once the application we want to test for availability has been identified and its endpoint connected to, we need to create the availability telemetry message. As discussed above, the AvailabilityResults table within Azure Application Insights contains columns that can be used to track the success of API actions and the locations from which they originated.

For the Dataverse, or in fact any HTTP request, we can also capture headers from the response. This can be beneficial for gaining insight into current API usage as it pertains to limits (e.g. Entitlement Limits) or into correlation identifiers. These headers work well in the Custom Dimensions column, a JSON-serialized column that provides the flexibility needed to add additional data points.
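One way to turn response headers into custom dimensions is sketched below in Python. The helper names are hypothetical, and the "x-ms-ratelimit-" prefix used in the usage note is only an illustrative assumption about which headers are worth keeping; pass whichever prefixes match the headers your environment actually returns.

```python
import json


def headers_to_dimensions(headers: dict, wanted_prefixes: tuple) -> dict:
    """Keep only headers whose lower-cased name starts with a wanted prefix."""
    prefixes = tuple(prefix.lower() for prefix in wanted_prefixes)
    dimensions = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(prefixes):
            # Custom dimension values are stored as strings.
            dimensions[lowered] = str(value)
    return dimensions


def serialize_dimensions(dimensions: dict) -> str:
    """Serialize the dimensions to JSON with a stable key order."""
    return json.dumps(dimensions, sort_keys=True)
```

For example, calling headers_to_dimensions(response_headers, ("x-ms-ratelimit-",)) would keep only the limit-related headers and drop general ones such as Content-Type, keeping the custom dimensions small and focused.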

That said, at a minimum, what I have found most useful for the Azure Application Insights tooling is to first come up with a name for your tests. Once the test is named, setting the Run Location property is key to grouping the tests regionally. Azure Functions expose an environment variable called "REGION_NAME" that provides the data center location. Finally, setting the Success property along with the Message is needed to ensure we track uptime.

Optionally, duration can be set; your requirements will dictate which call timings are captured. In my example I am executing a simple action call and wrapping it in a timer. Comparing this duration to the latest region to region latency should provide ample timing metrics.
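The pieces described above (name, run location, success, message and a timed call) can be sketched in Python as follows. The AvailabilityRecord class and run_availability_test function are hypothetical names for illustration; the sketch only builds the record and leaves sending it to Azure Application Insights to whichever telemetry client you use. The "unknown" fallback for a missing REGION_NAME is also my own assumption.

```python
import os
import time
from dataclasses import dataclass, field


@dataclass
class AvailabilityRecord:
    """Minimum data points tracked for each availability test run."""
    name: str
    run_location: str
    success: bool
    message: str
    duration_ms: float
    custom_dimensions: dict = field(default_factory=dict)


def run_availability_test(name, action, environ=None) -> AvailabilityRecord:
    """Time a single call and describe the outcome as an availability record."""
    env = os.environ if environ is None else environ
    # REGION_NAME is the environment variable mentioned above; the fallback is an assumption.
    location = env.get("REGION_NAME", "unknown")
    started = time.perf_counter()
    try:
        action()
        success, message = True, "Passed"
    except Exception as exc:  # any failure counts as the service being unavailable
        success, message = False, f"{type(exc).__name__}: {exc}"
    duration_ms = (time.perf_counter() - started) * 1000.0
    return AvailabilityRecord(name, location, success, message, duration_ms)
```

Here action is any zero-argument callable, for example a function that requests a token and calls WhoAmI. Because the timer wraps the whole callable, the captured duration includes authentication unless the token is acquired beforehand.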

Reviewing Tests

 

The messages from our availability tests will reside within the AvailabilityResults table within Azure Application Insights. The messages are summarized and can be drilled into visually using the Availability feature as shown below.

As the image shows, the tests are grouped by name, in this case "Dynamics 365 Availability Test". Expanding the test we can see the various regions. Once a region is selected we can drill into the scatter plot to see how uptime may have been impacted. Consider the gif below, showing how to add filters to expand the time window searched as well as organization version.

Using the technique described above, we can now see not only when an organization version changed but also the beginnings of availability and duration timings.
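The grouping shown in the Availability feature (by test name, then by region) can be approximated offline with a few lines of Python. This is only a sketch of the idea with a hypothetical function name; it assumes each record has already been reduced to a (name, location, success) tuple and it is not how the portal computes its figures.

```python
from collections import defaultdict


def availability_by_location(records) -> dict:
    """Return the percentage of successful runs per (test name, location)."""
    totals = defaultdict(lambda: [0, 0])  # (name, location) -> [passed, total]
    for name, location, success in records:
        bucket = totals[(name, location)]
        if success:
            bucket[0] += 1
        bucket[1] += 1
    return {key: 100.0 * passed / total for key, (passed, total) in totals.items()}
```

A region whose percentage dips while the others hold steady points toward a network or regional issue rather than a problem with the service itself.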

Custom Availability Tests with the Dataverse API Demo

 

Below is a link to my YouTube video including detailed analysis and a demo of working with availability tests within Azure Application Insights. The demo includes setting up the test, deploying to Azure, reviewing logs and creating Azure Monitor alert rules.

Sample Code

 

Sample code can be found here.

Next Steps

 

In this article we have reviewed how to use availability tests within Azure Application Insights. We explored creating URL ping tests as well as building custom tests. From there, we designed and published an Azure Function globally to test Dataverse availability.

As shown, availability tests not only let us know the uptime of an API or service but are also flexible enough to capture data points such as build versions, flow runs, etc. Consider how to extend this type of testing to further utilize Azure Monitor capabilities such as alerting.

If you are interested in learning more about specialized guidance and training for monitoring or other areas of the Power Platform, which includes a monitoring workshop, please contact your Technical Account Manager or Microsoft representative for further details.

Your feedback is extremely valuable so please leave a comment below and I'll be happy to help where I can! Also, if you find any inconsistencies, omissions or have suggestions, please go here to submit a new issue.

Index

 

Monitoring the Power Platform: Introduction and Index