Discover how Microsoft’s Agent Evaluation Framework addresses the unique challenges of testing AI agents in Dynamics 365. Learn why traditional testing methods fall short for non-deterministic AI, and how structured evaluation ensures quality, safety, and business alignment. This session covers the framework’s principles, lifecycle, key roles, and practical evaluation methods.
Companion file: Foundation of Agent evaluation framework - Part 1.pdf
What's in this video:
- Introduction to Agent Evaluation Framework
- Differences between traditional and agent testing
- Risks and benefits of structured evaluation
- Framework principles and governance
- Evaluation lifecycle and the Evaluation Design Document (EDD)
- Responsible AI checkpoints and safety gates
- Five quality dimensions: correctness, safety, reliability, alignment, efficiency
- Evaluation methods and patterns
- Key takeaways and next steps
