Most Azure networks are designed like a pub with a strict door policy:
check IDs, keep trouble out, assume everyone inside behaves.
It works right up until it doesn’t, and when it fails, the network usually has nothing useful left to say.
Assume‑breach changes the job of networking entirely.
Your network isn’t there to stop the first compromise. It’s there to stop the second, third, and fourth.
The Mental Model
The common assumption
If we get prevention right (private endpoints, no public IPs, hardened ingress), then lateral movement becomes unlikely.
Why it breaks
Once an attacker gains a foothold anywhere inside the VNet, most Azure networks quietly revert to trust‑by‑proximity:
- Large, shared address spaces
- East‑west traffic implicitly allowed
- Shared “infrastructure” services reachable from everywhere
- Few boundaries that can be tightened quickly
At that point, prevention has already failed.
Containment is what decides whether the blast radius stays small or becomes organisational.
How It Really Works
Azure networking enforces reachability, not intent.
If a packet is routable and allowed by rules, Azure will move it consistently, quickly, and without judgement. There is no native concept of:
- “This workload is untrusted”
- “This subnet is compromised”
- “This path should only exist during normal operations”
Containment emerges only when you design friction into the graph:
- Boundaries that limit who can talk to whom
- Paths that are explicit, narrow, and easy to revoke
- Shared services that don’t silently collapse isolation
Think less about moats.
Think about fire doors that actually close.
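The “routable and allowed” behaviour can be modelled as first-match evaluation over priority-ordered rules. A minimal sketch in Python; the rule shape and names are illustrative, not an Azure API:

```python
# Minimal model of NSG-style evaluation: rules are checked in priority
# order (lowest number first) and the first match decides the outcome.
# Rule shape is illustrative, not an Azure API.

def evaluate(rules, src_subnet, dst_subnet, port):
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if (rule["src"] in (src_subnet, "*")
                and rule["dst"] in (dst_subnet, "*")
                and rule["port"] in (port, "*")):
            return rule["access"]
    return "Deny"  # nothing matched: treat as implicitly denied

rules = [
    {"priority": 100,  "src": "app", "dst": "db", "port": 5432, "access": "Allow"},
    {"priority": 4096, "src": "*",   "dst": "*",  "port": "*",  "access": "Deny"},
]

print(evaluate(rules, "app", "db", 5432))  # Allow: explicit, narrow path
print(evaluate(rules, "app", "db", 22))    # Deny: no intent, no reachability
```

In real NSGs, the default AllowVnetInBound rule plays the role of a low-priority allow-everything entry, which is exactly why east‑west traffic is implicitly allowed until you out‑prioritise it.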
Containment as a Measurable Outcome
If containment matters, it has to be testable without an incident.
Here’s a blunt, design‑time metric:
From any compromised workload, how many distinct network trust boundaries can it reach without a change request?
Not “in theory”.
Not “with enough effort”.
Right now, as deployed.
- If the answer is most subnets in the VNet → containment is weak
- If the answer is only its own role and one or two justified dependencies → containment exists
- If answering requires trawling NSGs, routes, and diagrams → containment is unknowable (also a failure)
This isn’t a SOC metric.
It’s an architectural smell test.
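The question is answerable mechanically if allowed paths are captured as a graph. A minimal design-time sketch, assuming a hand-maintained adjacency map of which trust boundaries can reach which; all names are hypothetical:

```python
from collections import deque

# Allowed east-west edges between trust boundaries, as deployed today.
# A flat network makes this map dense; a contained one keeps it sparse.
reachable_from = {
    "web": {"app"},
    "app": {"db", "dns"},
    "db":  {"dns"},
    "dns": {"web", "app", "db"},  # shared resolver reachable everywhere
}

def blast_radius(start):
    """Boundaries reachable from a compromised workload, excluding itself."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in reachable_from.get(queue.popleft(), set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

print(sorted(blast_radius("web")))  # the shared DNS hop reopens everything
```

If removing the shared resolver’s edges shrinks every answer to one or two entries, you have found the bridge.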
Real‑World Impact
Designing for containment changes what “good” looks like.
Blast radius becomes explicit
A compromised workload should expose:
- One role
- One isolation unit
- One small set of outbound paths
If compromise meaningfully increases reachability, your network is amplifying the incident.
Time becomes a defensive control
Every boundary:
- Slows reconnaissance
- Forces noisier behaviour
- Buys response time without relying on alerts
Containment isn’t binary; it’s cumulative friction.
Defenders regain leverage
Good containment designs provide fast levers:
- Subnets you can isolate without redeploying
- Routes you can blackhole surgically
- Rules you can tighten without breaking unrelated systems
If isolation requires redesign, it won’t happen under pressure.
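A surgical blackhole, for instance, is just a user-defined route whose next hop type is None. A sketch of that resource as it might appear in an ARM template, built in Python; the name and prefix are illustrative:

```python
import json

def blackhole_route(name, prefix):
    # A user-defined route with nextHopType "None" drops matching traffic.
    # Shape follows ARM's route resource; values here are illustrative.
    return {
        "name": name,
        "properties": {
            "addressPrefix": prefix,
            "nextHopType": "None",
        },
    }

# Drop everything destined for the compromised subnet, nothing else.
print(json.dumps(blackhole_route("isolate-appA", "10.1.4.0/24"), indent=2))
```

Because the route is a single additive resource scoped to one prefix, it can be created and deleted without touching unrelated systems, which is what makes it usable under pressure.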
Implementation Examples
Segmentation designed for incidents, not diagrams
This isn’t about tier purity.
It’s about what happens after AppA is compromised.
In a design that gives each app its own subnet and NSG:
- AppA compromise does not grant visibility or reachability into AppB
- Isolation of AppA:
- Touches one subnet or NSG
- Does not affect AppB traffic paths
- Defender action is local, not global
The topology isn’t clever.
The failure mode is.
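The locality claim can be checked in code. A minimal sketch, assuming each app maps to exactly one subnet and one NSG; all names are hypothetical:

```python
# Each app gets its own subnet and its own NSG; nothing is shared.
topology = {
    "AppA": {"subnet": "snet-appA", "nsg": "nsg-appA"},
    "AppB": {"subnet": "snet-appB", "nsg": "nsg-appB"},
}

def isolate(app):
    """Return the resources an isolation action must touch."""
    return {topology[app]["nsg"]}  # one NSG: the action is local

touched = isolate("AppA")
print(touched)  # AppB's resources and traffic paths are untouched
```

If isolating AppA ever returns more than one resource, something is shared, and the defender action has stopped being local.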
NSGs as containment brakes, not hygiene
An NSG excerpt enforcing explicit east‑west denial:
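A sketch of such a rule, as it might appear in an ARM template’s securityRules array; the priority and name are illustrative:

```json
{
  "name": "DenyVnetEastWest",
  "properties": {
    "priority": 4000,
    "direction": "Inbound",
    "access": "Deny",
    "protocol": "*",
    "sourcePortRange": "*",
    "destinationPortRange": "*",
    "sourceAddressPrefix": "VirtualNetwork",
    "destinationAddressPrefix": "VirtualNetwork",
    "description": "Assume breach: no implicit east-west. Explicit allows sit above this."
  }
}
```

Custom rules always out-prioritise the default AllowVnetInBound rule, so this single entry flips the VNet from implicit allow to explicit allow.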
This rule doesn’t assume workloads behave.
It assumes one of them eventually won’t.
Shared Services: Where Containment Quietly Dies
Most containment failures aren’t caused by app tiers; they’re caused by shared infrastructure.
DNS is the usual culprit.
A single, shared DNS resolver reachable from every subnet becomes:
- A universal discovery service
- An implicit trust bridge
- Often, a stepping stone toward management planes or legacy systems
During incidents, DNS is rarely locked down first.
Which means once a workload is compromised, name resolution collapses isolation even if IP‑level rules look sound.
If every subnet can talk to the same DNS endpoint:
- You don’t have independent containment zones
- You have one network with polite boundaries
Containment requires shared services to be deliberately constrained, not just centrally managed.
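Whether the resolver is a bridge is checkable from design data. A minimal sketch, assuming a map of which subnets can reach which resolvers; all names are hypothetical:

```python
# Which DNS resolvers each subnet can reach, as deployed.
dns_reachability = {
    "snet-web":  {"dns-shared"},
    "snet-app":  {"dns-shared"},
    "snet-db":   {"dns-shared"},
    "snet-mgmt": {"dns-mgmt"},
}

def shared_bridges(reach):
    """Resolvers reachable from more than one subnet: each is a trust bridge."""
    subnets_per_resolver = {}
    for subnet, resolvers in reach.items():
        for r in resolvers:
            subnets_per_resolver.setdefault(r, set()).add(subnet)
    return {r: s for r, s in subnets_per_resolver.items() if len(s) > 1}

print(shared_bridges(dns_reachability))
# dns-shared links web, app, and db: three "zones", one network
```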
Gotchas & Edge Cases
Service tags hide scope creep
The VirtualNetwork tag expands as you add subnets. Containment erodes quietly.
Operational convenience pressures boundaries
If every exception weakens isolation, the design is already too brittle.
Management paths matter
Backup agents, patching, and diagnostics are often the first lateral paths attackers exploit.
Best Practices (Containment‑Biased)
- Design subnets as discrete isolation units, not convenience buckets
- Prefer explicit allow paths over clever deny logic
- Make isolation actions fast, local, and reversible
- Treat route tables as incident controls, not static plumbing
- Answer the containment question continuously, not just once
Containment is what turns inevitable failure into survivable failure.