Overview
The Microsoft Certified: Azure Solutions Architect Expert (AZ-305) is one of the most valuable cloud architecture certifications available. It validates your ability to design secure, resilient, and cost-effective solutions on Microsoft Azure and is a strong differentiator for cloud architects and senior engineers.
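The exam typically has 40–60 questions across multiple choice, multiple select, drag-and-drop, and case study formats, and the passing scaled score is 700/1000. Microsoft's exam duration policy gives 100 minutes of exam time (120 minutes of seat time) for role-based exams without labs; confirm the current figure when you book.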
Prerequisite: To earn the Expert certification you must also hold the Azure Administrator Associate certification (AZ-104), and the exam assumes that level of hands-on experience. AZ-305 is a design exam, not an administration exam: the questions require you to reason about architecture trade-offs, not configure resources step by step.
Exam Domains
| Domain | Weight |
|---|---|
| Design Infrastructure Solutions | 40% |
| Design Identity, Governance, and Monitoring Solutions | 27% |
| Design Data Storage Solutions | 20% |
| Design Business Continuity Solutions | 13% |
Infrastructure and identity/governance together account for 67% of the exam. These domains must be your priority.
Domain 1: Infrastructure Solutions (40%)
Compute Architecture
- Virtual Machines: When to use VM Scale Sets vs individual VMs; proximity placement groups for latency; Spot VMs for cost
- Containers: AKS for orchestrated workloads; Azure Container Apps for serverless containers; ACI for simple short-lived containers
- App Service: When App Service is a better fit than VMs (managed platform, autoscaling, deployment slots)
- Azure Functions: Consumption vs Premium vs Dedicated plan trade-offs; Durable Functions for orchestration
The exam tests which compute option is most appropriate for a given scenario. Key decision factors: state requirements, scale patterns, team expertise, and operational overhead tolerance.
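The decision factors above can be sketched as a small rule-based helper. This is only an illustrative heuristic, not an official Microsoft decision tree: the `Workload` fields and the order of the checks are assumptions chosen to mirror the bullets in this section.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical scenario attributes; the field names are illustrative."""
    containerised: bool
    needs_orchestration: bool   # e.g. many microservices, custom scheduling
    event_driven: bool          # short, trigger-based executions
    needs_os_control: bool      # custom OS configuration or legacy software
    short_lived: bool = False   # container runs to completion and exits


def suggest_compute(w: Workload) -> str:
    """Return a first-guess compute option for the scenario."""
    if w.needs_os_control:
        # Full OS control means accepting the highest operational overhead.
        return "Virtual Machines / VM Scale Sets"
    if w.containerised:
        if w.needs_orchestration:
            return "AKS"
        if w.short_lived:
            return "Azure Container Instances"
        return "Azure Container Apps"
    if w.event_driven:
        return "Azure Functions"
    # Default for web apps and APIs with low overhead tolerance.
    return "App Service"


print(suggest_compute(Workload(True, True, False, False)))    # AKS
print(suggest_compute(Workload(False, False, False, False)))  # App Service
```

In an exam scenario, the stated constraint (for example "least operational overhead") usually decides which branch applies first.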
Networking Design
- VNet architecture: Hub-and-spoke topology is the standard enterprise pattern; understand when to use VNet peering vs Azure Virtual WAN
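- Private connectivity: Private Endpoints vs Service Endpoints (a Private Endpoint gives a specific PaaS resource a private IP in your VNet, which limits data exfiltration because traffic can reach only that resource; a Service Endpoint still targets the service's public endpoint); ExpressRoute vs VPN Gateway (ExpressRoute for dedicated, high-bandwidth, low-latency connectivity)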
- DNS: Azure Private DNS Zones for name resolution across VNets; Azure DNS for public zones
- Load balancing decision tree: Azure Front Door (global HTTP/HTTPS with WAF), Application Gateway (regional HTTP/HTTPS with WAF), Azure Load Balancer (TCP/UDP, regional), Traffic Manager (DNS-based global routing)
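
The load balancing decision tree reduces to two questions: is the traffic HTTP/HTTPS, and is the scope global or regional? A minimal sketch follows, assuming those two inputs are enough for the scenario. The function name and parameters are hypothetical, and real designs often combine services (for example, Front Door in front of Application Gateway).

```python
def suggest_load_balancer(http_traffic: bool, global_scope: bool) -> str:
    """Map the two decision-tree questions to a load balancing service."""
    if http_traffic:
        # Layer 7 options, both of which can include WAF.
        return "Azure Front Door" if global_scope else "Application Gateway"
    # Non-HTTP traffic: DNS-based global routing or regional Layer 4.
    return "Traffic Manager" if global_scope else "Azure Load Balancer"


for http, is_global in [(True, True), (True, False), (False, True), (False, False)]:
    print(http, is_global, "->", suggest_load_balancer(http, is_global))
```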
Migration
- Azure Migrate: Discovery, assessment, and replication for VMs and databases
- Migration strategies: Rehost (lift and shift), refactor, re-architect, replace
- Database migration: Azure Database Migration Service for homogeneous and heterogeneous migrations
Domain 2: Identity, Governance, and Monitoring (27%)
Identity Design
- Entra ID (formerly Azure AD): Hybrid identity with Microsoft Entra Connect Sync; Microsoft Entra Cloud Sync for lightweight scenarios
- Managed identities: System-assigned for single-resource identity; user-assigned when the identity is shared across multiple resources
- Workload identity federation: Eliminate secrets by federating external identity providers (GitHub Actions, Kubernetes)
- Conditional Access: Risk-based access policies; MFA enforcement; device compliance requirements
- PIM (Privileged Identity Management): Just-in-time privileged access; access reviews
Governance Design
- Management hierarchy: Management Groups → Subscriptions → Resource Groups → Resources
- Azure Policy: Enforce standards at scale; DeployIfNotExists for remediation; deny effects for blocking non-compliant resources
- RBAC: Built-in roles vs custom roles; scope assignment; avoid broad Owner assignments
- Blueprints / Landing Zones: Azure Blueprints is being deprecated by Microsoft, so for enterprise-scale governance focus on the Azure landing zone accelerator
Exam tip: Governance questions often describe an organisation with multiple subscriptions and business units. The answer almost always involves Management Groups + Azure Policy applied at the management group level.
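The reason management-group-level assignment is usually the right answer is inheritance: a policy assigned at one scope applies to everything beneath it. The toy model below illustrates that idea only. It does not call any Azure API, and the scope and policy names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scope:
    """A node in the management hierarchy (illustrative model only)."""
    name: str
    parent: Optional["Scope"] = None
    policies: List[str] = field(default_factory=list)

    def effective_policies(self) -> List[str]:
        """Policies assigned here plus everything inherited from ancestors."""
        inherited = self.parent.effective_policies() if self.parent else []
        return inherited + self.policies


# Hypothetical hierarchy: Management Group -> Subscription -> Resource Group
root = Scope("Tenant Root Group", policies=["allowed-locations"])
corp = Scope("Corp management group", parent=root, policies=["require-cost-centre-tag"])
sub = Scope("Payments subscription", parent=corp)
rg = Scope("rg-payments-prod", parent=sub, policies=["deny-public-ip"])

print(rg.effective_policies())
# ['allowed-locations', 'require-cost-centre-tag', 'deny-public-ip']
```

Assigning once at `corp` covers every current and future subscription under it, which is why per-subscription assignment is rarely the best answer for a multi-subscription organisation.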
Monitoring Design
Azure Monitor
├── Metrics → real-time numerical telemetry
├── Logs → Log Analytics workspace (KQL queries)
├── Alerts → action groups (email, webhook, Logic App, Function)
├── Application Insights → APM for applications
└── Diagnostic Settings → route resource logs to Log Analytics / Storage / Event Hubs
Know when to use Application Insights (application performance and availability) vs Log Analytics (infrastructure and cross-resource querying) vs Azure Monitor Metrics (real-time dashboards and alerts).
Domain 3: Data Storage Solutions (20%)
Relational Database Selection
| Scenario | Service |
|---|---|
| Existing SQL Server workload, minimal changes | Azure SQL Database or SQL Managed Instance |
| Need full SQL Server agent, CLR, cross-database queries | SQL Managed Instance |
| Fully managed, very large or fast-growing database needing rapid scaling | Azure SQL Database Hyperscale |
| PostgreSQL or MySQL | Azure Database for PostgreSQL/MySQL Flexible Server |
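| Multi-region reads and failover, mission-critical | Azure SQL Database Business Critical with active geo-replication or failover groups (secondaries are read-only; for multi-region writes use Cosmos DB) |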
Non-Relational Database Selection
| Scenario | Service |
|---|---|
| Document, key-value, graph, or table data with global distribution | Cosmos DB |
| IoT telemetry at scale, time-series | Cosmos DB for NoSQL or Table API |
| Simple key-value with session state | Azure Cache for Redis |
| Analytical queries over semi-structured data | Azure Data Lake Storage + Synapse Analytics |
Cosmos DB consistency levels are a frequent exam topic: Strong, Bounded Staleness, Session (default), Consistent Prefix, Eventual. Know the trade-offs between consistency and latency.
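One way to remember the five levels is as an ordered scale, where you pick the weakest level that still meets the requirement. The sketch below assumes four yes/no requirements as inputs. The enum and function are hypothetical study aids, not part of any Cosmos DB SDK.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """Cosmos DB consistency levels, ordered weakest (1) to strongest (5)."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def weakest_sufficient(
    needs_latest_everywhere: bool = False,  # every read sees the latest committed write
    needs_bounded_lag: bool = False,        # reads may lag, but only by a bounded amount
    needs_read_your_writes: bool = False,   # a client always sees its own writes
    needs_ordered_reads: bool = False,      # writes are never observed out of order
) -> Consistency:
    """Pick the weakest level that satisfies the stated requirement."""
    if needs_latest_everywhere:
        return Consistency.STRONG
    if needs_bounded_lag:
        return Consistency.BOUNDED_STALENESS
    if needs_read_your_writes:
        return Consistency.SESSION
    if needs_ordered_reads:
        return Consistency.CONSISTENT_PREFIX
    return Consistency.EVENTUAL


print(weakest_sufficient(needs_read_your_writes=True).name)  # SESSION
```

Choosing the weakest sufficient level matters because stronger levels trade away latency and throughput.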
Storage Accounts
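- Redundancy options: LRS (three copies in one datacentre), ZRS (copies across three availability zones in one region), GRS (LRS plus asynchronous replication to a secondary region), GZRS (ZRS plus asynchronous replication to a secondary region)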
- Access tiers: Hot (frequent access), Cool (30-day minimum), Cold (90-day minimum), Archive (180-day minimum, rehydration required)
- Lifecycle management policies: Automate tier transitions and deletion based on last modified date
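
The minimum retention periods matter because moving or deleting a blob before the minimum has elapsed incurs an early deletion charge for the remaining days. The sketch below works through that arithmetic. The age thresholds in `suggest_tier` are illustrative assumptions that simply reuse the minimums; a real lifecycle policy would set them from observed access patterns.

```python
# Minimum retention periods per access tier, in days (from the list above).
MIN_RETENTION_DAYS = {"hot": 0, "cool": 30, "cold": 90, "archive": 180}


def early_deletion_days(tier: str, days_stored: int) -> int:
    """Days that would be billed as an early deletion charge."""
    return max(0, MIN_RETENTION_DAYS[tier] - days_stored)


def suggest_tier(days_since_last_modified: int) -> str:
    """Illustrative lifecycle rule: older data moves to a colder tier."""
    if days_since_last_modified < 30:
        return "hot"
    if days_since_last_modified < 90:
        return "cool"
    if days_since_last_modified < 180:
        return "cold"
    return "archive"


print(early_deletion_days("cool", 10))      # 20
print(early_deletion_days("archive", 200))  # 0
print(suggest_tier(45))                     # cool
```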
Domain 4: Business Continuity Solutions (13%)
Backup Design
- Azure Backup: VMs, SQL in VMs, and Azure Files are protected in Recovery Services vaults; Blobs and managed disks are protected in Backup vaults
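- Recovery objectives and retention: Distinguish RPO (how much data you can afford to lose, which drives backup frequency) from RTO (how quickly you must recover, which drives the restore method); retention is how long recovery points are kept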
- Soft delete: Protects backup data from accidental or malicious deletion for 14 days (extendable)
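
RPO and RTO can be checked with simple arithmetic: in the worst case a failure happens just before the next backup, so the data lost equals one full backup interval. A minimal sketch follows, in which the function names and the sample figures are assumptions for illustration.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is one full backup interval."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """Recovery must complete within the RTO."""
    return estimated_restore_time <= rto


# Daily backups cannot satisfy "RPO of less than 1 hour".
print(meets_rpo(timedelta(hours=24), timedelta(hours=1)))    # False
# Replication every 15 minutes can.
print(meets_rpo(timedelta(minutes=15), timedelta(hours=1)))  # True
# A 3-hour restore satisfies a 4-hour RTO.
print(meets_rto(timedelta(hours=3), timedelta(hours=4)))     # True
```

When a question states a tight RPO, this check usually rules out scheduled backup alone and points toward continuous replication.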
High Availability Patterns
| Availability target | Pattern |
|---|---|
| 99.9% | Single VM with Premium SSD on all disks |
| 99.95% | Two or more VMs in an Availability Set (fault + update domains) |
| 99.99% | Two or more VMs across Availability Zones (separate physical datacentres) |
| Multi-region | Active-passive or active-active with Traffic Manager / Front Door |
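
To make the availability targets concrete, it helps to convert them into allowed downtime and to see how SLAs combine. The sketch below is plain arithmetic: components in series multiply their availabilities, while redundant components in parallel fail only if all of them fail. The parallel formula assumes independent failures, which real deployments only approximate.

```python
from math import prod

HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_hours_per_year(availability: float) -> float:
    """Allowed downtime per year for an availability given as a fraction."""
    return (1 - availability) * HOURS_PER_YEAR


def serial(*availabilities: float) -> float:
    """Composite availability when every component must be up."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Composite availability when any one redundant component suffices."""
    return 1 - prod(1 - a for a in availabilities)


for target in (0.999, 0.9995, 0.9999):
    print(f"{target:.2%} -> {downtime_hours_per_year(target):.2f} hours/year")
# 99.90% -> 8.76 hours/year
# 99.95% -> 4.38 hours/year
# 99.99% -> 0.88 hours/year

# A 99.95% web tier in front of a 99.99% database is weaker than either alone.
print(f"{serial(0.9995, 0.9999):.6f}")  # 0.999400
# Two independent 99.9% regions in active-active.
print(f"{parallel(0.999, 0.999):.6f}")  # 0.999999
```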
Disaster Recovery
- Azure Site Recovery: Replication and orchestrated failover for VMs and on-premises workloads
- Geo-replication: Azure SQL Database active geo-replication; Cosmos DB multi-region writes
- Recovery plans: Document and test failover sequences; define RTO/RPO targets before choosing a pattern
Architecture Patterns for the Exam
Well-Architected Framework Pillars
- Reliability: Redundancy, health monitoring, graceful degradation
- Security: Zero trust, least privilege, defence in depth
- Cost Optimisation: Right-sizing, reserved instances, autoscaling
- Operational Excellence: IaC, CI/CD, observability
- Performance Efficiency: Caching, CDN, read replicas, async processing
Common Exam Traps
- Private Endpoint vs Service Endpoint: If the question mentions preventing data exfiltration, the answer is Private Endpoint
- App Service vs AKS: AKS is for containerised microservices; App Service is for web apps and APIs where you don't want to manage orchestration
- ExpressRoute vs VPN Gateway: VPN Gateway is encrypted over the public internet; ExpressRoute is a dedicated private circuit
- Cosmos DB consistency: Strong consistency = higher latency; Eventual = best latency, lowest cost, possible stale reads
- Azure Front Door vs Traffic Manager: Front Door operates at Layer 7 and includes WAF; Traffic Manager is DNS-only
Study Plan (6 Weeks)
| Week | Focus |
|---|---|
| 1 | Identity and Governance: Entra ID, RBAC, Azure Policy, Management Groups |
| 2 | Compute: VMs, AKS, App Service, Azure Functions, migration patterns |
| 3 | Networking: Hub-and-spoke, Private Endpoints, load balancing options |
| 4 | Data: SQL vs NoSQL selection, Cosmos DB, storage redundancy, lifecycles |
| 5 | Business Continuity: backup, geo-replication, Site Recovery, RTO/RPO |
| 6 | Practice exams, case study practice, review weak domains |
Case Study Preparation
The AZ-305 includes case study scenarios: you read a description of a company, its requirements, and its existing environment, then answer 4–6 related questions. Case studies test your ability to apply multiple concepts together and prioritise requirements.
For case studies:
- Read the requirements section carefully before the questions
- Identify the key constraints (regulatory, cost, latency, existing licences)
- The scenario usually has a most-important constraint that eliminates most wrong answers
Practice Exam Strategy
- Read all answer options before choosing
- Eliminate answers that violate stated constraints (if the question says "no infrastructure management," eliminate IaaS answers)
- Look for keywords: "most cost-effective," "least operational overhead," "existing SQL Server licences," "multi-region writes," "RPO of less than 1 hour"
- For case studies, map each question back to the requirements in the scenario description
Use the AZ-305 practice exams to test your readiness across all four domains. Aim for 80%+ consistently before booking your real exam.