Azure API Management (APIM)

The gateway in front of all your APIs: policies, rate limiting, transformations, and the developer portal.

How It Works

Azure API Management is a fully managed gateway that sits between API consumers and your backend services. Every request passes through a policy pipeline with four sections: Inbound, Backend, Outbound, and on-error (which runs only when processing fails). In these sections you apply cross-cutting concerns like authentication, rate limiting, caching, and protocol transformation without changing your backend code.

1
Request enters the Inbound pipeline

Every API call hits APIM before reaching your backend. The Inbound section runs policies top-to-bottom: authentication checks, rate limiting, header manipulation, URL rewriting. Any policy can short-circuit the pipeline and return a response directly.
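
To illustrate the short-circuit behavior, here is a minimal sketch of an inbound section that rejects a request before it reaches the backend. The X-Correlation-Id requirement, the X-Forwarded-By header, and the error body are assumptions chosen for illustration, not part of any configuration described here.

```xml
<inbound>
    <base />
    <!-- Hypothetical rule: reject requests that lack an X-Correlation-Id header -->
    <choose>
        <when condition="@(context.Request.Headers.GetValueOrDefault("X-Correlation-Id", "") == "")">
            <!-- return-response stops the pipeline here; the backend is never called -->
            <return-response>
                <set-status code="400" reason="Bad Request" />
                <set-header name="Content-Type" exists-action="override">
                    <value>application/json</value>
                </set-header>
                <set-body>{"error": "X-Correlation-Id header is required"}</set-body>
            </return-response>
        </when>
    </choose>
    <!-- Policies below run only when the check above passes -->
    <set-header name="X-Forwarded-By" exists-action="override">
        <value>apim</value>
    </set-header>
</inbound>
```

Because policies run top-to-bottom, anything placed above the choose block still executes for rejected requests, so cheap checks like this one usually belong near the top.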

2
Backend call (or mock)

If inbound policies pass, APIM forwards the request to the configured backend. You can configure multiple backends with load balancing, circuit breakers, and health checks. The mock-response policy can skip the backend entirely for testing.
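
As a rough sketch of backend routing, the fragment below points the request at a named backend and retries the forward on a 503. The backend id orders-backend, the retry counts, and the timeout are hypothetical values; the circuit breaker and load-balancing settings themselves live on the backend entity rather than in policy XML.

```xml
<inbound>
    <base />
    <!-- "orders-backend" is a hypothetical backend entity defined in APIM -->
    <set-backend-service backend-id="orders-backend" />
</inbound>
<backend>
    <!-- Retry the forward up to 2 times, 1 second apart, when the backend answers 503 -->
    <retry condition="@(context.Response.StatusCode == 503)" count="2" interval="1" first-fast-retry="false">
        <!-- buffer-request-body keeps the body available so it can be resent on retry -->
        <forward-request timeout="20" buffer-request-body="true" />
    </retry>
</backend>
```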

3
Outbound pipeline transforms the response

After the backend responds, outbound policies process the response before returning to the client: strip internal headers, add CORS headers, transform XML to JSON, or store results in the built-in cache.
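
The outbound steps above might look like the following sketch. The X-Internal-Trace header name and the 60-second cache duration are assumptions; cache-store only has an effect when a matching cache-lookup sits in inbound, which is why both sections are shown.

```xml
<inbound>
    <base />
    <!-- Serve from the built-in cache when a stored response exists -->
    <cache-lookup vary-by-developer="false" vary-by-developer-groups="false" />
</inbound>
<outbound>
    <base />
    <!-- Convert the payload to JSON when the backend returned an XML content type -->
    <xml-to-json kind="direct" apply="content-type-xml" consider-accept-header="false" />
    <!-- Remove a header that exposes backend details (header name is illustrative) -->
    <set-header name="X-Internal-Trace" exists-action="delete" />
    <!-- Store the response for 60 seconds; pairs with cache-lookup above -->
    <cache-store duration="60" />
</outbound>
```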

4
on-error handles failures gracefully

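If a policy fails or the backend cannot be reached (for example, a timeout or connection error), the on-error section runs. An error status returned by the backend, such as a 500, is an ordinary response and flows through outbound instead. In on-error you can map errors to structured JSON responses or log them to Event Hub; retries against a different backend are configured in the backend section.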
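
A sketch of an on-error section that logs the failure and returns a structured JSON body. It assumes an Event Hub logger with the hypothetical id error-logger has already been registered in APIM; the JSON field names are illustrative.

```xml
<on-error>
    <base />
    <!-- Send error details to Event Hub; "error-logger" is a hypothetical logger id -->
    <log-to-eventhub logger-id="error-logger">@(
        new JObject(
            new JProperty("source", context.LastError.Source),
            new JProperty("reason", context.LastError.Reason),
            new JProperty("message", context.LastError.Message)
        ).ToString()
    )</log-to-eventhub>
    <return-response>
        <!-- Keep the status already set by the failing policy (for example 401 or 429) -->
        <set-status code="@(context.Response.StatusCode)" reason="@(context.Response.StatusReason)" />
        <set-header name="Content-Type" exists-action="override">
            <value>application/json</value>
        </set-header>
        <!-- Return a stable error shape instead of the raw internal message -->
        <set-body>@(
            new JObject(
                new JProperty("error", context.LastError.Reason),
                new JProperty("requestId", context.RequestId.ToString())
            ).ToString()
        )</set-body>
    </return-response>
</on-error>
```

Returning the request id rather than the raw error message lets support staff correlate a client report with the logged details without exposing internals to callers.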

5
Subscriptions and Named Values secure secrets

Clients authenticate via subscription keys sent in the Ocp-Apim-Subscription-Key header. Backend secrets (API keys, connection strings) are stored as Named Values, which can reference Azure Key Vault, and are referenced in policy XML as {{my-secret}}.

Key Concepts

Policy Pipeline

Four sequential sections: Inbound (pre-backend), Backend (forward-request), Outbound (post-backend), on-error. Policies within each section run top-to-bottom.

Subscriptions

APIM's built-in authentication. Clients send Ocp-Apim-Subscription-Key. Rate limits and usage tracking are per subscription key, not per IP.
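
A minimal sketch of subscription-scoped throttling, assuming the policy is applied at product scope (the quota policy is only available there). The call counts and periods are placeholder values.

```xml
<inbound>
    <base />
    <!-- Up to 100 calls per 60 seconds for each subscription -->
    <rate-limit calls="100" renewal-period="60" />
    <!-- Up to 10,000 calls per 30 days (2,592,000 seconds) for each subscription -->
    <quota calls="10000" renewal-period="2592000" />
</inbound>
```

The rate limit protects against short bursts, while the quota caps total usage over a billing-style period.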

Named Values

Key-value store inside APIM. Reference them in policies as {{my-value}}. Values can be plain text, secrets, or Key Vault references, which APIM refreshes after the secret is rotated in Key Vault.

Backends

Named backend targets with URL, credentials, and circuit breaker config. Allows A/B routing and blue/green deploys without policy changes.

🎭Mock Response

Policy that returns a static response from the API schema definition without forwarding to the backend. Enables frontend teams to work before the API is ready.
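
For illustration, a mock might be enabled on a single operation like this. The sketch assumes the operation already defines a 200 response with an application/json example or schema.

```xml
<inbound>
    <base />
    <!-- Return the operation's 200 application/json example; the backend is not called -->
    <mock-response status-code="200" content-type="application/json" />
</inbound>
```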

Tiers

Developer (no SLA, single unit) | Basic (99.95%) | Standard (99.95%) | Premium (99.95%, or 99.99% when deployed across multiple availability zones or regions; multi-region, VNet). Developer tier must NEVER be used in production.

🏠Self-hosted Gateway

Docker container that runs the APIM gateway on-premises or in any cloud. Config is pulled from APIM in Azure. Enables consistent policy enforcement in hybrid environments.

APIM Policy: JWT validation + rate limiting + header stripping
<!-- APIM policy: JWT validation + rate limiting + header transform -->
<policies>
    <inbound>
        <base />
        <!-- 1. Validate JWT first; short-circuits on failure (401) -->
        <validate-jwt header-name="Authorization"
                      failed-validation-httpcode="401"
                      failed-validation-error-message="Unauthorized">
            <openid-config url="https://login.microsoftonline.com/{tenantId}/v2.0/.well-known/openid-configuration" />
            <required-claims>
                <!-- scp is a space-separated list of scopes, so split it before matching -->
                <claim name="scp" match="any" separator=" ">
                    <value>api.read</value>
                </claim>
            </required-claims>
        </validate-jwt>

        <!-- 2. Rate limit per subscription key (NOT per IP!) -->
        <!-- increment-condition counts only calls that end with a status below 400 -->
        <rate-limit-by-key calls="100" renewal-period="60"
                           counter-key="@(context.Subscription.Id)"
                           increment-condition="@(context.Response.StatusCode < 400)" />

        <!-- 3. Named Value: secret stored as a Key Vault reference -->
        <set-header name="X-Backend-Key" exists-action="override">
            <value>{{backend-api-key}}</value>
        </set-header>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Strip internal headers from the response -->
        <set-header name="X-Powered-By" exists-action="delete" />
        <set-header name="X-AspNet-Version" exists-action="delete" />
    </outbound>
    <on-error>
        <base />
        <!-- Return the status set by the failing policy along with the error message -->
        <return-response>
            <set-status code="@(context.Response.StatusCode)" reason="@(context.Response.StatusReason)" />
            <set-body>@(context.LastError.Message)</set-body>
        </return-response>
    </on-error>
</policies>
Why This Matters

APIM decouples API consumers from backend implementations. You can version APIs, throttle abusive callers, enforce security policies, mock responses for development, and migrate backends, all without touching consumer code. It's the single enforcement point for governance across dozens of microservices.

Common Pitfalls

⚠ validate-jwt placed in outbound or backend sections does nothing to protect your backend: by the time it runs, the request has already been processed. Always place it first in inbound.
⚠ Rate limits are per subscription key by default, not per IP. Behind corporate proxies or NAT, all clients share one IP, making IP-based counters unreliable.
⚠ Developer tier has zero SLA and is explicitly for non-production. It can be taken offline during Azure maintenance with no compensation.
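⚠ on-error does not run for error status codes returned by the backend; a backend 500 flows through outbound like any other response. To change the status code there, use set-status or return-response in outbound, and keep on-error for failures raised during policy processing.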
Real-World Use Cases

1
Production Rate Limit Misconfiguration: 50k Unexpected Backend Calls

Scenario

An e-commerce API on APIM Standard tier suffered a traffic spike. The rate-limit policy was configured but the counter-key was set to the caller IP from a proxy, not the subscription key.

Problem

All requests from behind a corporate proxy shared one IP, so the rate limit applied to the entire company as one 'client'. With 10 developers sharing 100 req/min, each effectively got 10 req/min. More critically, unauthenticated callers from the same IP exhausted the legitimate users' quota.

Solution

Changed counter-key to @(context.Subscription.Id) to track limits per subscription. Added a separate rate-limit-by-key based on the authenticated user claim for fine-grained per-user throttling. Moved unauthenticated endpoints to a separate APIM product with stricter limits.
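
The two-level throttling described in this fix could be sketched as follows. The per-user limit of 20 calls and the variable name jwt are assumptions; the sketch relies on validate-jwt storing the parsed token in a context variable so the second counter can key on its subject claim.

```xml
<inbound>
    <base />
    <!-- Keep the validated token in a context variable named "jwt" -->
    <validate-jwt header-name="Authorization"
                  failed-validation-httpcode="401"
                  failed-validation-error-message="Unauthorized"
                  output-token-variable-name="jwt">
        <openid-config url="https://login.microsoftonline.com/{tenantId}/v2.0/.well-known/openid-configuration" />
    </validate-jwt>
    <!-- Limit 1: per subscription -->
    <rate-limit-by-key calls="100" renewal-period="60"
                       counter-key="@(context.Subscription.Id)" />
    <!-- Limit 2: per authenticated user, keyed on the token's subject claim -->
    <rate-limit-by-key calls="20" renewal-period="60"
                       counter-key="@(((Jwt)context.Variables["jwt"]).Subject)" />
</inbound>
```

Neither counter depends on the caller IP, so clients behind a shared proxy or NAT are throttled independently.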

Takeaway: APIM rate limits are per counter-key, not per IP by default. Always audit which context expression you use as the counter key; IP-based limiting breaks behind proxies and NATs.

2
validate-jwt Placed in Outbound: Auth Bypass in Production

Scenario

A team migrated their API to APIM and placed validate-jwt in the outbound section (copy-paste error). The policy appeared to work in testing because the JWT was structurally valid.

Problem

JWT validation in outbound runs AFTER the backend has already processed the request and returned data. The policy rejected invalid tokens in the response phase, but the backend had already executed business logic, mutated database state, and the validation failure was invisible to monitoring.

Solution

Moved validate-jwt to the top of the inbound section. Added a policy unit test using APIM's test console. Enabled APIM diagnostic logging to Azure Monitor to verify the pipeline execution order on every request.

Takeaway: validate-jwt must always be in inbound, as early as possible. Auth policies in outbound or backend sections execute after damage is done. The APIM portal highlights this with a warning but teams miss it during copy-paste migrations.

3
Developer Tier in Production Causes Outage During Update

Scenario

A startup deployed their MVP on APIM Developer tier to save costs. During an Azure maintenance window, the single APIM unit was unavailable for 22 minutes, taking down all customer-facing APIs.

Problem

Developer tier has no SLA, no redundancy, and is explicitly documented for non-production use. When Azure performed routine maintenance on the underlying infrastructure, there was no failover unit available.

Solution

Migrated to Basic tier (~$140/month vs Developer's ~$50/month), which provides a 99.95% SLA. For higher availability requirements, Premium tier is the option to plan for, since it is the tier that supports multi-region deployment.

Takeaway: Developer tier costs ~$50/month vs Basic at ~$140/month, a false saving if your API serves real customers. The cost of a 22-minute outage during business hours far exceeded a year's tier upgrade cost.