Building AI agents in Dynamics 365 is no longer just about functionality—it’s about proving reliability, safety, and performance at scale.
In this TechTalk, Microsoft FastTrack architects walk through how to move from evaluation theory to real-world implementation. You’ll see how to design effective evaluation sets, choose the right metrics, and use an Evaluation Design Document (EDD) to ensure your agents behave as expected in production.
This session includes a real-world Dynamics 365 onboarding agent scenario, showcasing how AI-driven workflows introduce new risks—and how structured evaluation prevents costly failures.
Companion deck: EvalDesign.pdf
What you’ll learn:
- Why traditional testing models don’t work for AI agents
- How to build evaluation scenarios (including edge cases that matter most)
- The role of synthetic vs real-world data in testing
- How to select meaningful metrics for single-turn vs multi-turn agents
- Why an Evaluation Design Document (EDD) is critical for governance, risk, and scale
If you’re working with Copilot, AI agents, or automation in Dynamics 365, this session will help you design evaluations that actually catch issues before users do.
