Azure AI Apps and Agents Developer (AI-103): what the exam tests and how to prepare
Microsoft's AI-103 is a 120-minute proctored Associate exam covering Microsoft Foundry, Foundry Agent Service, RAG on Azure AI Search, computer vision, speech, and document extraction. Domain weights, the Foundry naming maze, scenarios, pricing, plus explainx.ai mock tests and study pathway.
Microsoft's Azure AI Apps and Agents Developer - Associate (AI-103) is the company's bid to standardize production-grade generative AI and agent development on Azure: the practical work of shipping Microsoft Foundry apps with RAG, agents, vision, speech, and document extraction, in Python.
Disclaimer: Exam structure and policies belong to Microsoft and Pearson VUE. Confirm details on the official exam page before you register.
Who it is for
The target candidate is an Azure AI engineer or developer who builds AI apps and agents with Python and the Microsoft Foundry SDKs. Microsoft rates it at the intermediate level. It is delivered as a proctored exam through Pearson VUE, currently in English, with a 24-hour wait before a retake.
You should be comfortable with:
Choosing the right Foundry service for a task: LLMs, small language models, multimodal models, and Foundry Tools
Building RAG and grounding pipelines on Azure AI Search
Building single and multi-agent solutions with Foundry Agent Service: roles, goals, memory, tool schemas
Vision, speech, text analysis, and document extraction using Foundry Tools
Explicitly out of scope: training models from scratch, non-Azure services as primary answers, and data engineering unrelated to AI ingestion.
Exam format (at a glance)
Attribute | Detail
Duration | 120 minutes, proctored via Pearson VUE
Questions | ~40 (estimate); Microsoft does not publish an exact count for AI-103, and 40 is a reasonable pacing estimate for a 120-minute associate exam
Format | Multiple choice and multiple response (select all correct answers)
Pass score | 700 / 1000, officially documented: "a score of 700 or greater is required to pass"
Pricing | $165 USD per attempt (US; varies by country)
Retake | Wait 24 hours after a failed attempt
Be honest with yourself about pacing. The 700/1000 pass score is official and confirmed. The ~40-question count is an estimate we use only to reason about time per question, so do not treat it as gospel. At ~40 questions in 120 minutes you have roughly 3 minutes each, which is generous for MCQs but tight if several are long, multi-part scenario items.
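To see how sensitive that 3-minute figure is to the unconfirmed question count, a quick calculation helps. This is a minimal sketch: the function names are ours, and the alternative counts (35, 50, 60) are hypothetical what-if values, not numbers published by Microsoft.

```python
def minutes_per_question(total_minutes: int, question_count: int) -> float:
    """Average time budget per question for a fixed-length exam."""
    if question_count <= 0:
        raise ValueError("question_count must be positive")
    return total_minutes / question_count


def pacing_table(total_minutes: int, question_counts: list[int]) -> dict[int, float]:
    """Map each hypothetical question count to minutes per question (2 decimals)."""
    return {n: round(minutes_per_question(total_minutes, n), 2) for n in question_counts}


if __name__ == "__main__":
    # 40 is the article's pacing estimate; the other counts are what-if values.
    for count, minutes in pacing_table(120, [35, 40, 50, 60]).items():
        print(f"{count} questions -> {minutes} min each")
```

If the real exam turns out to have more items than the estimate, the per-question budget shrinks quickly, which is the reason to practice under a tighter clock than 3 minutes.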
The Foundry naming maze (read this first)
More candidates lose points to branding confusion than to any single technical gap. Microsoft's platform has been renamed repeatedly, and the docs still carry legacy URLs. Here is the map as of early 2026:
Microsoft Foundry is the current name of the platform. The lineage runs Azure AI Studio → Azure AI Foundry (named at Ignite 2024) → Microsoft Foundry. Docs now live under learn.microsoft.com/azure/foundry/, but the ai.azure.com portal domain persists from the older naming.
Foundry Tools is the current umbrella term for what used to be Azure AI Services / Azure Cognitive Services: the prebuilt, point-solution APIs (Language, Vision, Speech, Document Intelligence, Translator, Content Understanding). These are things agents call.
Foundry Agent Service is the managed agent runtime and orchestration service (successor to Azure AI Agent Service). This is the thing that runs your agents. The classic trap swaps Foundry Tools and Foundry Agent Service. Remember: Foundry Tools are called by agents; Foundry Agent Service runs the agents.
Azure AI Search remains the documented name for the retrieval/indexing engine (semantic, vector, hybrid search). The exam objectives still use plain "vector/hybrid/semantic search."
Foundry IQ is a newer (Ignite 2025) knowledge/retrieval layer on top of Azure AI Search. It is very new and preview-adjacent, so don't over-index on it; the exam objectives anchor on Azure AI Search.
Domain | Weight | Focus
Implement computer vision solutions | 10-15% | Image/video generation and editing, multimodal understanding, responsible AI for visuals
Implement text analysis solutions | 10-15% | Entities, sentiment, translation, and speech as a first-class agent modality
Implement information extraction solutions | 10-15% | Semantic/vector/hybrid search, enrichment skills, OCR + layout + field extraction
The agentic and generative AI domain is the largest at 30-35%, so give it the most study time. Plan and manage is a close second. The three implementation domains (vision, text, extraction) are each smaller but together make up 30-45% and share a lot of grounding/retrieval concepts.
Domain deep dives
Domain 2: Implement generative AI and agentic solutions (the big one)
This is where most points live. Three threads:
Generative apps with Foundry. Deploy and consume LLMs, small language models, code models, and multimodal models. Implement RAG against grounded knowledge. Design tool-augmented, multistep reasoning workflows. Evaluate models for fabrications, relevance, quality, and safety. Integrate through Foundry SDKs and connectors and configure app-to-project connections.
Agents with Foundry Agent Service. Define agent roles, goals, conversation tracking, and tool schemas. Integrate retrieval + function calling + memory. Wire up agent tools: APIs, knowledge stores, Azure AI Search, Content Understanding, custom functions. Build orchestrated multi-agent solutions and autonomous/semiautonomous workflows with safeguards and approval flows.
Optimize and operationalize. Prompt engineering and parameter tuning; reflection / chain-of-thought / self-critique loops; observability (tracing, token analytics, safety signals, latency); and orchestrating multiple models or hybrid LLM + rules-engine designs.
Trap alert. When a scenario says answers are wrong or hallucinated, the exam loves to offer fine-tuning as a distractor. In most AI-103 scenarios the right fix is better grounding (RAG) or better prompting, not fine-tuning. Also distinguish chain-of-thought / self-critique within a single model call from a separate evaluator agent in a multi-agent loop, and remember that a tool-augmented flow with function calling is not automatically an "agent" (agents imply a role, goal, memory, and tool schema). Our multi-agent orchestration patterns guide covers sequential handoff vs planner/router vs parallel, which the exam swaps in distractors.
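To make "tool schema" and "function calling" concrete, here is a minimal, framework-free sketch. It uses the widely seen JSON-Schema style of tool definition; it is not the Foundry Agent Service API, and the tool name `get_order_status`, the fake order data, and the `dispatch_tool_call` helper are all hypothetical. Note what is missing: there is no role, goal, or memory here, which is why a flow like this on its own is tool-augmented rather than an agent.

```python
import json

# Hypothetical tool schema in the common JSON-Schema function-calling style.
# A model is shown this description and replies with a tool name plus JSON arguments.
TOOL_SCHEMAS = {
    "get_order_status": {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
}


def get_order_status(order_id: str) -> dict:
    """Stand-in for a real backend call; the data is made up for illustration."""
    fake_orders = {"A100": "shipped", "A200": "processing"}
    return {"order_id": order_id, "status": fake_orders.get(order_id, "unknown")}


# Maps tool names from the schema to the Python callables that implement them.
TOOL_REGISTRY = {"get_order_status": get_order_status}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Validate a model-issued tool call against its schema, run it, return JSON."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    args = json.loads(arguments_json)
    required = TOOL_SCHEMAS[name]["function"]["parameters"]["required"]
    missing = [param for param in required if param not in args]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")
    return json.dumps(TOOL_REGISTRY[name](**args))


if __name__ == "__main__":
    # Simulates the arguments a model might emit for this tool.
    print(dispatch_tool_call("get_order_status", '{"order_id": "A100"}'))
```

In a managed agent runtime the dispatch loop, conversation state, and retries are handled for you; the sketch only shows the contract between a schema and the function behind it.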
Domain 1: Plan and manage
Service selection (Foundry Tools vs Foundry Agent Service), deployment options, quotas/rate limits (throttling) vs cost management (billing) as distinct controls, security (managed identity, private networking, keyless credentials, role policies), and monitoring (model performance, drift, safety events, grounding quality). Two recurring distinctions:
Keyless credentials ≠ no auth. Keyless means authenticating without stored API keys or secrets, using Microsoft Entra ID (formerly Azure AD) and managed identity. It does not mean the absence of authentication.
Monitoring (production) vs evaluation (pre-deployment). Both feel like "checking the model," but the exam treats production drift/safety monitoring and pre-deployment evaluators as separate activities.
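The keyless distinction is easiest to see in the request itself. The sketch below is an illustration under assumptions: the `api-key` header name varies by service, and `fake_token_provider` is a hypothetical stand-in for a real token source. In a real app the token would typically come from the `azure-identity` package (for example `DefaultAzureCredential`), which is deliberately not imported here so the example stays self-contained.

```python
from typing import Callable


def key_based_headers(api_key: str) -> dict[str, str]:
    """Key-based auth: a long-lived secret travels with every request.

    The header name is illustrative; it differs between Azure services.
    """
    return {"api-key": api_key}


def keyless_headers(get_token: Callable[[], str]) -> dict[str, str]:
    """Keyless auth: no stored secret, but the request is still authenticated.

    A short-lived bearer token is fetched at call time from an identity provider.
    """
    return {"Authorization": f"Bearer {get_token()}"}


def fake_token_provider() -> str:
    """Hypothetical stand-in for a managed-identity token lookup."""
    return "example-short-lived-token"


if __name__ == "__main__":
    print(key_based_headers("example-secret-key"))
    print(keyless_headers(fake_token_provider))
```

Both request shapes carry proof of identity. The keyless variant simply moves the secret out of your code and configuration and into the identity platform.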
Domain 3: Computer vision
Image and video generation from text and reference media. Editing: inpainting (fill masked regions), mask-based targeted edits, and prompt-driven whole-image modifications (these are technically distinct, and the exam exploits candidates who use them interchangeably). Multimodal understanding: captioning, visual Q&A, accessibility alt-text (concise) vs detailed descriptions, and Content Understanding single-task vs pro-mode pipelines. Plus responsible AI for visuals: unsafe-content filters, watermarks/prohibited symbols/brand rules, and indirect prompt injection via text embedded in an image (distinct from direct prompt injection through user input).
Domain 4: Text analysis
Entity/topic/summary/structured JSON extraction, sentiment vs tone vs safety/sensitive-content as three distinct tasks, and translation via Azure Translator (dedicated, good for consistent terminology and glossary control) vs LLM-powered translation (nuanced/contextual). Plus speech: speech-to-text/text-to-speech as an app feature vs speech as a first-class agent modality with custom speech models for domain jargon. Structured-output patterns are worth reviewing; see our structured output / JSON mode guide.
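One way to internalize "three distinct tasks" is to treat sentiment, tone, and safety as three separate fields in a structured output and validate each independently. The following is a minimal sketch using only the standard library. The label sets and the `parse_analysis` helper are hypothetical and do not reflect the output format of any specific Azure service.

```python
import json

# Hypothetical label sets; a real service or prompt would define its own.
ALLOWED_LABELS = {
    "sentiment": {"positive", "neutral", "negative"},
    "tone": {"formal", "casual", "urgent", "sarcastic"},
    "safety": {"safe", "sensitive", "unsafe"},
}


def parse_analysis(raw_json: str) -> dict[str, str]:
    """Parse a model's JSON reply and check each of the three fields separately."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    result = {}
    for field, allowed in ALLOWED_LABELS.items():
        value = data.get(field)
        if value not in allowed:
            raise ValueError(f"Field '{field}' has invalid or missing value: {value!r}")
        result[field] = value
    return result


if __name__ == "__main__":
    # A sarcastic complaint: negative sentiment and sarcastic tone, yet safe content.
    reply = '{"sentiment": "negative", "tone": "sarcastic", "safety": "safe"}'
    print(parse_analysis(reply))
```

The example reply shows why the fields cannot be collapsed into one: a message can be negative and sarcastic while raising no safety concern at all.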
Domain 5: Information extraction
Retrieval/grounding pipelines: ingest and index documents, images, audio, video; semantic/vector/hybrid search; enrichment via built-in skills (native) vs custom skills (a hosted function/API called during indexing); RAG ingestion (chunking, embeddings, grounding metadata, OCR). Document extraction combines OCR + layout analysis + field extraction as three separate stages. Pulling a specific named field from an invoice needs field extraction, not just an OCR text dump.
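The chunking step of RAG ingestion can be sketched in a few lines. This is a simplified illustration, not how Azure AI Search or any Foundry SDK chunks documents: it splits by characters rather than tokens, and the `chunk_text` function, its default sizes, and the metadata field names are all assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50,
               source: str = "unknown") -> list[dict]:
    """Split text into overlapping character windows with grounding metadata.

    Overlap keeps a sentence that straddles a boundary retrievable from
    at least one chunk. Each chunk records where it came from so an answer
    can cite its source.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append({
            "id": len(chunks),
            "source": source,   # grounding metadata: originating document
            "start": start,     # grounding metadata: character offset
            "text": text[start:start + chunk_size],
        })
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=1, source="demo.txt"):
        print(chunk)
```

Each chunk would then be embedded and indexed along with its metadata. The metadata is what lets a grounded answer point back to the passage it used.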
The #1 confusable topic: semantic vs vector vs hybrid search
If you learn one thing cold, make it this. Azure AI Search implements all three, and the exam leans on the distinction constantly:
Vector search ranks by embedding similarity: meaning-based, great for paraphrases and synonyms, weaker on exact keywords/IDs.
Semantic search applies language-understanding reranking on top of results to boost the most relevant passages.
Hybrid search combines keyword (BM25) and vector retrieval, and can then add semantic reranking. It is the best default for grounding.
The tell: a scenario describing "combining keyword precision with meaning-based relevance" is hybrid, not semantic alone. We wrote a full technical explainer on semantic vs vector vs hybrid search because this single distinction shows up across Domains 2 and 5.
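The "combining" in hybrid search can be made concrete with Reciprocal Rank Fusion (RRF), the technique Azure AI Search documents for merging keyword and vector result lists. The sketch below is a standalone illustration, not the service's implementation: the document IDs are invented, and `k = 60` is a commonly used constant that we treat here as an assumption.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both keyword and vector retrieval rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical results for the same query from two retrievers.
    keyword_hits = ["doc_sku", "doc_policy", "doc_faq"]        # BM25: exact-term matches
    vector_hits = ["doc_policy", "doc_paraphrase", "doc_sku"]  # embedding similarity
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

In this example `doc_policy` comes first because both retrievers rank it highly, even though neither list agrees on everything. A semantic reranker, where enabled, would then reorder the top of this fused list; it does not replace the fusion step.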
Practice exam
Azure AI Apps and Agents Developer - Associate (AI-103): Mock Tests
3 timed mock exams with shuffled questions, instant scoring, and per-question explanations. Pass score: 700/1000. The fastest way to find your weak domains before exam day.
Microsoft's AI-103 exam page links four learning paths. Work them in this order:
Develop generative AI apps in Azure: the Foundry SDK workflow, deploying and consuming models, RAG basics.
Develop AI agents on Azure: Foundry Agent Service, tool schemas, memory, multi-agent orchestration.
Develop natural language solutions in Azure: text analysis, translation, and speech.
Extract insights from visual data on Azure: vision, Content Understanding, and document extraction.
Then build one real thing: a RAG app on Azure AI Search where you toggle semantic, vector, and hybrid retrieval and watch the difference. Preview the exam sandbox to see the Pearson VUE interface, and finish with explainx mock tests for scenario-style MCQ practice.
Common mistakes and exam traps
Foundry Tools vs Foundry Agent Service: prebuilt APIs vs the agent runtime.
Fine-tuning as a reflex answer: usually grounding or prompting is the intended fix.
Keyless ≠ no auth: it means authenticating through Microsoft Entra ID (formerly Azure AD) / managed identity instead of stored keys.
Monitoring (prod) vs evaluation (pre-deploy): distinct activities.
Inpainting vs mask-based edits vs prompt-driven modifications: three different vision operations.
Indirect (embedded image text) vs direct prompt injection: a Domain-3 nuance.
Sentiment vs tone vs safety detection: three separate classification tasks.
Azure Translator vs LLM translation: dedicated/glossary-controlled vs nuanced/contextual.
Semantic vs vector vs hybrid: the most confusable topic; "keyword precision + meaning" = hybrid.
Built-in vs custom skills: custom skills require a hosted function/API in the enrichment pipeline.
OCR is not field extraction: extracting a named field needs a field-extraction stage.
Exam day logistics
Proctored through Pearson VUE (test center or online); currently English.
120 minutes; plan for roughly 3 minutes per question against the ~40-question estimate.
No penalty for guessing, so never leave an item blank.
Multiple response items ("select all that apply") are common; read how many answers are required.
Pass at 700/1000. The score is scaled, so it is not a straight percentage of questions correct.
Failed? You must wait 24 hours before re-sitting.
How explainx.ai fits: mock tests and study pathway
AI-103 mock tests: live on explainx.ai
Start mock tests: eight timed practice exams (two Foundations drills, a Plan & Manage focus, an Agentic & Generative AI focus, a Vision/Text/Extraction focus, a Retrieval & Grounding focus, a scenario marathon, and a 40-question / 120-minute simulation), 400+ shuffled multiple-choice questions, instant explanations, $5 lifetime access.
Certification study guide: domain weights, task statements per domain, six scenario narratives, in-scope / out-of-scope topics, and an Azure AI service checklist mapped to the official skills outline.
Learning pathway: articles mapped to exam domains with embedded quiz questions.
AI-103 validates that you can move beyond a prototype to production generative and agentic AI on Azure: Foundry apps, agents, RAG, vision, speech, and extraction. Learn the Foundry naming, master semantic vs vector vs hybrid search, weight your study toward the agentic/generative domain, work Microsoft's four learning paths, and run mock tests before your Pearson VUE sitting.
Exam names, weights, and policies are summarized from Microsoft's public certification messaging; the ~40-question count is an estimate for pacing, while the 700/1000 pass score is officially documented. Verify all details on Microsoft Learn before registering. explainx.ai is not affiliated with Microsoft's certification program.