First off, you may be shocked to even see this written about. Surely it's not necessary these days, with Microsoft hosting in Azure and a Software as a Service (SaaS) provision for Production environments, right?

Wrong, say a number of larger organisations who have already gone through the pain; I've helped pick up the pieces in many cases, alongside colleagues, since working on "AX7" (as it was known then) with Microsoft support in the early days, shortly before general availability. And here's the spoiler: it's not all Microsoft's fault.

Why?

Well, there are several things which Microsoft have little or no control over and indeed partners may also have little or no control over after you’ve gone live. Here are some examples of why such performance issues require a combined effort:

  • Partner / ISV: customisation, configuration, integrations code (in addition to supporting our clients with the below).
  • Customer / Client: configuration, batch job schedule, integrations schedule, data, users, browser latency, subscription estimate (essentially, transaction volumes provided to Microsoft for go live).
  • Microsoft: infrastructure, standard code.

This post is primarily aimed at non-technical readers and those responsible for coordinating or managing general performance issues, with links drilling through to more in-depth resources containing further details.

By ‘general performance’, I mean a set of unidentified performance issues across one or more modules, or indeed the entire application.

This is intended to provide a suggested approach drawing on experience, to get you started in what can be a challenging area.

General performance issues can often be highly political and complex, partly due to the combined effort which is required and the common misconception that it’s all on Microsoft, as explained above. Performance can also be subjective; it’s a feeling to some extent. As such, collaboration and user perception are key, hence the importance of understanding the issues and setting the right level of expectation from the start. Generally speaking though, the important thing to remember is that the same relatively straightforward approach applies every time when analysing these performance issues.

In the most successful performance projects I have seen, there has been a spirit of openness and collaboration towards the shared goal of improving performance, without which people may tend to try to defend their own areas and can be less willing to contribute information if they feel there is a risk in doing so. If you haven’t already, depending on the scale of the issue you may wish to consider setting up a regular conference call to share information, agree and assign actions and review progress, for example on a weekly basis.

Key steps

These steps are based on typical approaches I’ve followed and seen others follow in similar situations. For the most part, the management of such issues is technology agnostic and therefore not too dissimilar to my previous blog post. Whilst it may be tempting to immediately log an ‘everything is slow’ support case with Microsoft, the engineers on the other end, lovely people that they are, are only likely to ask the same questions as below initially, because it’s necessary. In my experience, Microsoft cases can be handled much more expediently and effectively if there are specific facts readily available from the start - especially, as it goes in the famous eighties film: “Man who catch standard code running slow with latest update accomplish anything” (or something like that).

Right, so enough of the throwbacks, Glen, what should I do?

  1. Set User Expectations

Setting the right levels of expectation from the start is key to keeping any performance tuning project within scope. I say project here deliberately because general performance issues should be treated as such, including scoping, timelines and allocation of multiple resources. There may be questions like ‘Could [X technical issue] in some way relate to this performance issue?’; bear in mind that while positive contributions should be encouraged, you should be careful how you use the information: stick to your original goals and don’t be sidetracked.

Get a list of processes and validate expected durations for those processes from the end users, i.e. whether they are actually realistic or not based on the underlying logic. If you don’t think it’s realistic, say so; it’s better to have these conversations as early as possible. Ask key users to define in business terms what the requirement is and, if possible, provide supporting information to go with it. Ideally, get the target defined in terms of volume and concurrent users as well. For example: “With 200 concurrent users, we expect process ‘X’ to take a maximum of 30 seconds and an average of 10 seconds, for a 100 line order. We have calculated this based on the order volume a user in that area would typically need to process to meet their targets.” It relates back to the principle of SMART: Specific, Measurable, Achievable, Realistic and Time related.
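If it helps to make a target like that concrete, it can be written down as data and checked against measured timings. The following is only a minimal Python sketch: the names `PerformanceTarget` and `meets_target` and the sample durations are made up for illustration and are not part of any Microsoft tooling.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PerformanceTarget:
    """A user-agreed target for one process (hypothetical structure)."""
    process: str
    concurrent_users: int
    order_lines: int
    max_seconds: float
    avg_seconds: float


def meets_target(target, durations):
    """Return True when the measured durations satisfy both the maximum and the average limit."""
    if not durations:
        raise ValueError("no measurements supplied")
    return max(durations) <= target.max_seconds and mean(durations) <= target.avg_seconds


# The example target from the text: 200 users, 100 line order, max 30s, average 10s.
target = PerformanceTarget("X", concurrent_users=200, order_lines=100,
                           max_seconds=30.0, avg_seconds=10.0)

# Invented timings in seconds, e.g. collected while shadowing a user.
sample = [8.2, 9.5, 12.1, 7.4, 11.0]
target_met = meets_target(target, sample)
```

Writing the target down this way forces the conversation about volume and concurrent users to happen up front, which is really the point.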

Users saying “AX is generally slow” or “‘X’ process is slow” may be valid, but it doesn’t provide enough information on its own to properly analyse the problem;  the tools (below) can help but there is nothing better than first-hand information from the users.  It can help to stress the importance of their role in this process. This brings us on to point 2.

  2. Ask!

Ask for further details and if you can’t get the required information by simply asking, try other approaches such as shadowing users while they are experiencing the issues (ok, more difficult right now, but hopefully you’re getting the idea).  If you don’t get your answers, keep asking – you may well find that a lot of “noise” simply disappears and some specific issues start bubbling to the surface which you can then begin to address.

Once you do start getting the information though, users need to see that their efforts are worthwhile if they are to offer you continued support (i.e. first-hand information), so ensure you at least demonstrate that you are working on it and that they are being kept informed (ideally, directly if you can). If you can also achieve some quick wins, even better.

Some examples of questions you might want to ask the end users:

  • Is there a general performance issue or can specific processes be identified?
  • For each process that is slow: is the issue intermittent, if so is there any pattern to it, e.g. particular users and/or times of day?
  • In some instances, I have seen end users (or someone on the ‘shop floor’) recording details in a spreadsheet as and when they occurred – the more first-hand information the better.
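Where users do keep a log like that, a few lines of script can show whether there is a pattern by time of day. This is a rough sketch in Python, assuming a hypothetical log layout (`process` and `reported_at` fields) and invented sample rows; a real spreadsheet would need exporting and mapping to something similar first.

```python
from collections import Counter
from datetime import datetime


def incidents_by_hour(rows):
    """Count reported slow-downs per (process, hour of day)."""
    counts = Counter()
    for row in rows:
        when = datetime.strptime(row["reported_at"], "%Y-%m-%d %H:%M")
        counts[(row["process"], when.hour)] += 1
    return counts


# Invented examples of what users might record as issues occur.
log = [
    {"process": "Post sales order", "reported_at": "2020-05-04 09:05"},
    {"process": "Post sales order", "reported_at": "2020-05-04 09:40"},
    {"process": "Post sales order", "reported_at": "2020-05-04 14:10"},
    {"process": "Open customer form", "reported_at": "2020-05-04 09:15"},
]

counts = incidents_by_hour(log)
busiest = counts.most_common(1)[0]  # ((process, hour), number of reports)
```

A cluster of reports in one hour would be a prompt to look at what else runs at that time, such as batch jobs or integrations.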

Even if it is described only as a ‘general performance’ issue:

  • Can they provide examples of processes they found to be particularly slow to focus in on? – Consider asking for the top 2 worst performing processes from each business area, or the top 20 worst performing processes overall, for example; it’s subjective and different people from within the business may not entirely agree, but at least they are engaged in the process.
  • How many users are affected and in what areas of the business? – Following on from that, are there users or business departments which are not affected?

Where an issue is identified with a specific process:

  • Can it be recreated on a test environment?
  • If not, can it consistently be recreated on the production environment?
  • What are the steps for the process (clear and detailed enough for anyone to understand, specifying the AX path)?
  • If applicable, what parameters were used?
  • Is there any setup required before running the process, if so what are the steps? (e.g. you might need to ask them to provide a file used for an import – one which would recreate the issue if it depends on the file type/size)
  • What are the expected and actual results? As mentioned in point 1, make this as quantitative as possible, ideally including durations, transaction volumes and concurrent users.

  3. Monitor and Diagnose

Fortunately for us, Microsoft have provided multiple tools out of the box which can help us with this task and, because it’s on Azure and SaaS, there’s no installation required. Happy days (or it will be once these issues are over, I hear you cry)!

Check out this Tech Talk from the awesome Markus Otte and Davy Vliegen in the Microsoft R & D Solution Architect team.

Finance and Operations: Performance Troubleshooting Tools for Dynamics 365 | December 14, 2018

Wait, what? I’ve installed this thing On Premises! Keep calm, the good old DynamicsPerf and tracing can still help.

https://github.com/PFEDynamics/DynamicsPerf

https://cloudblogs.microsoft.com/dynamics365/no-audience/2017/11/15/collect-dynamics-365-for-finance-and-operations-event-traces-with-windows-performance-monitor/

Additionally, if you’re responsible for writing code, you most definitely want to be running the Customisation Analysis Report as it may yield a lot of quick wins before you even start using those tools:

https://docs.microsoft.com/en-us/dynamics365/fin-ops-core/dev-itpro/dev-tools/customization-analysis-report

If you're working with Dynamics 365 Commerce (previously known as Retail), take a look at the following, with thanks to Chris Bittner (also in the Microsoft SA team) for putting me on to it:

https://dynamicsnotes.com/retail-channel-performance-investigations/

  4. Plan

By this point (if not earlier), you should be in a position to formulate an action plan based on:

  • What the issues and priorities are.
  • The resources (organisations and individuals) required to address the issues and their availability.
  • Initial tasks required to address the issues and the start/end dates (including development, quality assurance, testing, deployment, etc.), taking into account the steps below.

Keep in mind that the plan may well change as time goes on and further issues/actions are identified, but obviously the key is to have one!

  5. Validate Application Builds and Setup

There is less to check these days, assuming you’re in the majority who are using the cloud version of Dynamics 365 Finance (if not, that’s for another blog post). However:

  • There is still application setup that can impact performance
  • Regular table clean-ups are still a must
  • Most importantly, ensure you keep your builds up to date, using the Regression Suite Automation Tool (RSAT) to help you; otherwise you’ll quickly find yourselves in an ‘out of service’ position with Microsoft and potentially missing out on hundreds if not thousands of hotfixes, some of which may resolve your performance issues.

You can use the following resources as a guide:

https://community.dynamics.com/365/b/techtalks/posts/finance-and-operations-performance-key-patterns-and-anti-patterns-for-dynamics-365-1-15-19

https://docs.microsoft.com/en-gb/archive/blogs/axsa/cleanup-routines-in-dynamics-365-for-finance-and-operations

 Microsoft updates:

https://docs.microsoft.com/en-us/dynamics365/fin-ops-core/fin-ops/get-started/public-preview-releases

Finance And Operations: Microsoft Managed Continuous Updates | April 2, 2019

Finance and Operations: Regression Suite Automation Tool - Testing Lifecycle Demo | May 29, 2019

Finance and Operations: Regression Suite Automation Tool - Background & Setup | May 28, 2019

  6. Analyse Specific Processes

This next phase can also overlap with the previous ones to a degree; having said that, it’s important to get as much done at the ‘general’ end first when dealing with general performance issues, to avoid costly and potentially unnecessary additional monitoring / analysis time later.

Performance tuning is iterative: the cycle includes analysis, corrective actions/tuning, deployment of changes and then review.

Even when investigating general performance, as mentioned above you should still get some examples from users of where processes are particularly slow. This is for 2 main reasons:

  1. You can measure durations before and after to see the impact of your changes – but bear in mind a lot of those changes don’t target specific processes, so again user expectations should be set accordingly beforehand.
  2. After general performance tuning, you may still need to look at X++ and specific processes in more granular detail.
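On the first of those points, the before and after comparison itself is simple arithmetic. As a minimal sketch in Python (the function name `improvement_percent` and the timings are invented for illustration), it might look like this:

```python
from statistics import mean


def improvement_percent(before, after):
    """Percentage reduction in average duration between two sets of timings."""
    if not before or not after:
        raise ValueError("both sets of timings must contain measurements")
    baseline = mean(before)
    if baseline <= 0:
        raise ValueError("baseline average must be positive")
    return (baseline - mean(after)) / baseline * 100.0


# Invented timings in seconds for the same process, same steps and same data volume.
before_change = [20.0, 20.0]
after_change = [15.0, 15.0]

gain = improvement_percent(before_change, after_change)  # 25.0
```

The comparison is only meaningful if the process is measured in the same way each time, which is why the detailed steps and parameters gathered from users earlier matter so much.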

As with previous versions, you can still analyse traces using the Trace Parser tool:

https://docs.microsoft.com/en-us/dynamics365/fin-ops-core/dev-itpro/perf-test/trace-parser

  7. Iterative Review

Review should be a regular part of the process because as mentioned already, performance tuning is iterative; resolution of some fundamental issues may help, but then more can be identified after changes are deployed and the performance tuning is able to become more focused and in depth.

However, at the same time it’s important to be able to recognise when to stop tuning and move on. There is generally a ‘law of diminishing returns’ to be applied here, meaning that with each iteration of the performance tuning of a specific process, you would expect the potential for improvement to shrink. Some kind of exit criteria should be applied (and predefined as early as possible). For example, you may have simply reached the target duration agreed with the end user, or some kind of cost / benefit decision may have been made, such as:

  • The estimated hours of analysis to improve average duration of opening of form X by 0.2 seconds are too great to justify, or
  • To do this requires changing the design and the end users would prefer to live with the additional 0.2 seconds and keep the existing design.
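Exit criteria like these can be agreed up front and expressed very simply. The following Python sketch is one possible way of doing it, not a standard method: the function name `should_stop_tuning` and the `min_gain_seconds` threshold are assumptions made for illustration, and the real cost / benefit decision will usually involve more than a single number.

```python
def should_stop_tuning(avg_durations, target_seconds, min_gain_seconds):
    """Decide whether to stop tuning a process.

    avg_durations: average duration after each tuning iteration, oldest first.
    Stop when the agreed target is met, or when the latest iteration gained
    less than the agreed minimum (diminishing returns).
    """
    if not avg_durations:
        raise ValueError("at least one measurement is required")
    latest = avg_durations[-1]
    if latest <= target_seconds:
        return True
    if len(avg_durations) >= 2:
        last_gain = avg_durations[-2] - latest
        if last_gain < min_gain_seconds:
            return True
    return False


# Invented figures: target of 10 seconds, and gains under 0.5 seconds judged not worth another pass.
target_reached = should_stop_tuning([12.0, 9.5], target_seconds=10.0, min_gain_seconds=0.5)
still_improving = should_stop_tuning([12.0, 11.0], target_seconds=10.0, min_gain_seconds=0.5)
diminishing = should_stop_tuning([11.0, 10.8], target_seconds=10.0, min_gain_seconds=0.5)
```

The value of something like this is less in the code and more in having the thresholds agreed with the end users before tuning starts.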

As well as reviewing performance fixes that are deployed to address customisation issues, consider QA (quality assurance) processes and having at least best practice checks for performance put in place for every code deployment. One benefit of this is that the customer can feel more reassured that any downturn in performance is not the result of a recent code deployment. Other things I have seen (following a similar principle) are: putting all other code deployments (i.e. anything other than performance fixes) on hold during the period of performance tuning; and being prepared to reverse deployments if there are any doubts over whether or not they caused a performance issue.

Review what could have been done proactively that can be applied in future to avoid the issue happening again, and plan to have it in place on every implementation project. Have another look at the FastTrack checklist and the following Tech Talk, to establish what could have been missed and how.

https://community.dynamics.com/365/b/techtalks/posts/go-live-planning-8-9-18

Finance and Operations: Performance Benchmark for Dynamics 365 | January 30, 2019

And finally:

Don’t lose heart: break your problem down into smaller pieces or, if it's all a bit too overwhelming, you can always find a good partner like QUANTIQ to guide and support you.

Good luck!