Learn to build trustworthy AI with systematic evaluations in Azure AI Foundry. The session covers quality, safety, agent and custom evaluators

License

MIT (LICENSE), CC-BY-SA-4.0 (LICENSE-DOCS)
microsoft/aitour26-LTG151-build-trustworthy-ai-with-systematic-evaluations-in-azure-ai-foundry

LTG151: Build trustworthy AI with systematic evaluations in Azure AI Foundry

Microsoft Azure AI Foundry Discord Azure AI Foundry Developer Forum


🎙️ | Delivering This Session On The Tour?

  • 1️⃣ | Fork this repo to your personal profile
  • 2️⃣ | Check out the session-delivery-sources for next steps
  • 3️⃣ | Submit an issue if you find bugs or have

Session Description

Building generative AI apps starts with model selection, but earning user trust requires continuous evaluation. In this talk, learn how the Azure AI Evaluation SDK helps you assess models pre- and post-production, analyze results, and improve quality through observability.

Learning Outcomes

By the end of this session, learners will be able to:

  1. Understand the end-to-end (E2E) Observability support in Azure AI Foundry
  2. Explore and use built-in evaluators for quality & safety
  3. Explore and use built-in evaluators for agentic AI
  4. Create and run evaluations on their own models and agents
  5. View and analyze evaluation results in Azure AI Foundry

Technologies Used

  1. GitHub Codespaces
  2. Visual Studio Code
  3. Azure AI Evaluation SDK (Python)
  4. Azure AI Foundry (Portal & SDK)

Session Resources

| Resources | Links | Description |
| --- | --- | --- |
| Documentation | Observability in generative AI | Azure AI Foundry documentation for all observability-related tools and features including evaluations, red teaming, tracing, and continuous monitoring. |
| Samples | Azure AI Evaluation Samples | Azure AI Evaluation SDK samples (Python) showcasing common scenarios for observability in Azure AI Foundry. |
| Breakout | AI and Agent Observability in Azure AI Foundry and Azure Monitor | Learn how evaluation and continuous monitoring can help you iterate quickly and move from pilot to production faster in this hour-long breakout from Microsoft Build 2025 (which inspired this talk). |
| Skilling | Models For Beginners | Collection with links to an evolving set of resources from a new open-source curriculum focused on model development, with a focused track on observability. |

Continued Learning Resources

| Resources | Links | Description |
| --- | --- | --- |
| AI Tour 2026 Resource Center | https://aka.ms/AITour26-Resource-Center | Links to all repos for AI Tour 26 Sessions |
| Azure AI Foundry Community Discord | Microsoft Azure AI Foundry Discord | Connect with the Azure AI Foundry Community! |
| Learn at AI Tour | https://aka.ms/LearnAtAITour | Continue learning on Microsoft Learn |

Multi-Language Support

Additional Languages Coming Soon

Content Owners

  • Nitya Narasimhan, PhD
  • Sydney Lister

Responsible AI

Microsoft is committed to helping our customers use our AI products responsibly, sharing our learnings, and building trust-based partnerships through tools like Transparency Notes and Impact Assessments. Many of these resources can be found at https://aka.ms/RAI. Microsoft’s approach to responsible AI is grounded in our AI principles of fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.

Large-scale natural language, image, and speech models, like the ones used in this sample, can potentially behave in ways that are unfair, unreliable, or offensive, in turn causing harms. Please consult the Azure OpenAI Service Transparency Note to learn about the risks and limitations.

The recommended approach to mitigating these risks is to include a safety system in your architecture that can detect and prevent harmful behavior. Azure AI Content Safety provides an independent layer of protection that can detect harmful user-generated and AI-generated content in applications and services. It includes text and image APIs for detecting harmful material. Within the Azure AI Foundry portal, the Content Safety service lets you view, explore, and try out sample code for detecting harmful content across different modalities. The quickstart documentation guides you through making requests to the service.

Another aspect to take into account is the overall application performance. With multimodal and multi-model applications, we consider performance to mean that the system performs as you and your users expect, including not generating harmful outputs. It's important to assess the performance of your overall application using the Performance and Quality evaluators and the Risk and Safety evaluators. You can also create and evaluate with custom evaluators.
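
To make the idea of a custom evaluator concrete, here is a minimal sketch. It assumes the code-based pattern in which an evaluator is simply a Python callable that accepts keyword arguments and returns a dictionary mapping metric names to values. The class name `AnswerLengthEvaluator`, the `max_words` threshold, and the metric keys are hypothetical and chosen only for illustration; the sketch uses the standard library alone and does not call any Azure service.

```python
class AnswerLengthEvaluator:
    """Hypothetical custom evaluator: scores a response by its word count."""

    def __init__(self, max_words: int = 50):
        # Illustrative threshold, not a recommended value.
        self.max_words = max_words

    def __call__(self, *, response: str, **kwargs) -> dict:
        # Extra fields from a dataset row (query, context, ...) are accepted
        # through **kwargs and ignored by this evaluator.
        word_count = len(response.split())
        return {
            "answer_length": word_count,
            "within_limit": word_count <= self.max_words,
        }


evaluator = AnswerLengthEvaluator(max_words=5)
result = evaluator(response="Azure AI Foundry supports evaluations.")
print(result)
```

Because the evaluator is an ordinary callable, it can be unit-tested locally before it is used in a larger evaluation run.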

You can evaluate your AI application in your development environment using the Azure AI Evaluation SDK. Given either a test dataset or a target, your generative AI application's outputs are quantitatively measured with built-in evaluators or custom evaluators of your choice. To get started with the Azure AI Evaluation SDK, you can follow the quickstart guide. Once you execute an evaluation run, you can visualize the results in the Azure AI Foundry portal.
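
The test-dataset flow described above can be sketched locally as follows. This is an illustration under stated assumptions, not the SDK's implementation: the dataset is stored as JSON Lines (one JSON object per row), the field names `query`, `response`, and `ground_truth` are assumed, and both `keyword_coverage` and `run_local_evaluation` are hypothetical helpers that stand in for an evaluator and for an evaluation run. Nothing here calls the Azure AI Evaluation SDK or uploads results to the portal.

```python
import json
import os
import tempfile
from statistics import mean


def keyword_coverage(*, response: str, ground_truth: str, **kwargs) -> dict:
    """Hypothetical evaluator: fraction of ground-truth words found in the response."""
    expected = set(ground_truth.lower().split())
    found = set(response.lower().split())
    coverage = len(expected & found) / len(expected) if expected else 0.0
    return {"keyword_coverage": coverage}


def run_local_evaluation(data_path: str, evaluators: dict) -> dict:
    """Hypothetical stand-in for an evaluation run: score each row, then average."""
    rows = []
    with open(data_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines in the JSONL file
            record = json.loads(line)
            scores = {}
            for name, evaluator in evaluators.items():
                # Each evaluator receives the row's fields as keyword arguments.
                for metric, value in evaluator(**record).items():
                    scores[f"{name}.{metric}"] = value
            rows.append({"inputs": record, "outputs": scores})
    metric_names = list(rows[0]["outputs"]) if rows else []
    metrics = {m: mean(r["outputs"][m] for r in rows) for m in metric_names}
    return {"rows": rows, "metrics": metrics}


# Tiny illustrative dataset; real test data would come from your application.
dataset = [
    {"query": "What is the capital of France?", "response": "paris", "ground_truth": "paris"},
    {"query": "Name two primary colors.", "response": "red", "ground_truth": "red blue"},
]

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "eval_data.jsonl")
    with open(data_path, "w", encoding="utf-8") as f:
        for record in dataset:
            f.write(json.dumps(record) + "\n")
    summary = run_local_evaluation(data_path, {"coverage": keyword_coverage})

print(summary["metrics"])
```

The per-row results and the aggregate metrics are kept separate so that a low average can be traced back to the specific rows that caused it. For the SDK's actual entry point and parameters, follow the quickstart guide.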
