Designing for Containment Over Prevention

Why slowing an attacker down matters more than keeping them out, and how Azure networks can actually help once prevention fails.

Most Azure networks are designed like a pub with a strict door policy:
check IDs, keep trouble out, assume everyone inside behaves.

It works right up until it doesn’t. And when it fails, the network usually has nothing useful left to say.

Assume‑breach changes the job of networking entirely.
Your network isn’t there to stop the first compromise. It’s there to stop the second, third, and fourth.

The Mental Model

The common assumption

If we get prevention right, with private endpoints, no public IPs, and hardened ingress, then lateral movement becomes unlikely.

Why it breaks

Once an attacker gains a foothold anywhere inside the VNet, most Azure networks quietly revert to trust‑by‑proximity:

  • Large, shared address spaces
  • East‑west traffic implicitly allowed
  • Shared “infrastructure” services reachable from everywhere
  • Few boundaries that can be tightened quickly

At that point, prevention has already failed.
Containment is what decides whether the blast radius stays small or becomes organisational.
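Trust-by-proximity is usually nobody’s decision; it is what a VNet does by default. A minimal sketch of the anti-pattern, with illustrative names and ranges: no NSGs are attached, so Azure’s built-in AllowVnetInBound rule leaves every subnet reachable from every other.

```bicep
// Hypothetical "flat" VNet. With no NSGs attached, the platform's
// default AllowVnetInBound rule permits all east-west traffic.
resource flatVnet 'Microsoft.Network/virtualNetworks@2023-11-01' = {
  name: 'flat-vnet'
  location: resourceGroup().location
  properties: {
    addressSpace: {
      addressPrefixes: ['10.0.0.0/16'] // one large shared space
    }
    subnets: [
      { name: 'apps', properties: { addressPrefix: '10.0.1.0/24' } }
      { name: 'data', properties: { addressPrefix: '10.0.2.0/24' } }
      { name: 'infra', properties: { addressPrefix: '10.0.3.0/24' } }
    ]
  }
}
```

Nothing here is misconfigured. It is simply a network that has never been asked to contain anything.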

How It Really Works

Azure networking enforces reachability, not intent.

If a packet is routable and allowed by rules, Azure will move it consistently, quickly, and without judgement. There is no native concept of:

  • “This workload is untrusted”
  • “This subnet is compromised”
  • “This path should only exist during normal operations”

Containment emerges only when you design friction into the graph:

  • Boundaries that limit who can talk to whom
  • Paths that are explicit, narrow, and easy to revoke
  • Shared services that don’t silently collapse isolation

Think less about moats.
Think about fire doors that actually close.

Containment as a Measurable Outcome

If containment matters, it has to be testable without an incident.

Here’s a blunt, design‑time metric:

From any compromised workload, how many distinct network trust boundaries can it reach without a change request?

Not “in theory”.
Not “with enough effort”.
Right now, as deployed.

  • If the answer is most subnets in the VNet → containment is weak
  • If the answer is only its own role and one or two justified dependencies → containment exists
  • If answering requires trawling NSGs, routes, and diagrams → containment is unknowable (also a failure)

This isn’t a SOC metric.
It’s an architectural smell test.

Real‑World Impact

Designing for containment changes what “good” looks like.

Blast radius becomes explicit

A compromised workload should expose:

  • One role
  • One isolation unit
  • One small set of outbound paths

If compromise meaningfully increases reachability, your network is amplifying the incident.

Time becomes a defensive control

Every boundary:

  • Slows reconnaissance
  • Forces noisier behaviour
  • Buys response time without relying on alerts

Containment isn’t binary, it’s cumulative friction.

Defenders regain leverage

Good containment designs provide fast levers:

  • Subnets you can isolate without redeploying
  • Routes you can blackhole surgically
  • Rules you can tighten without breaking unrelated systems

If isolation requires redesign, it won’t happen under pressure.
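One such lever can be kept pre-built. A sketch of a quarantine route table, with hypothetical names: associating it with a single subnet blackholes that subnet’s outbound traffic, and disassociating it restores normal paths, so the action is local and reversible.

```bicep
// Hypothetical pre-staged quarantine route table. Attaching it to one
// subnet's routeTable property drops all traffic leaving that subnet,
// without redeploying workloads or touching other subnets.
resource quarantineRoutes 'Microsoft.Network/routeTables@2023-11-01' = {
  name: 'rt-quarantine'
  location: resourceGroup().location
  properties: {
    routes: [
      {
        name: 'blackhole-all'
        properties: {
          addressPrefix: '0.0.0.0/0'
          nextHopType: 'None' // silently drop everything
        }
      }
    ]
  }
}
```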

Implementation Examples

Segmentation designed for incidents, not diagrams

This isn’t about tier purity.
It’s about what happens after AppA is compromised.

```mermaid
flowchart LR
    Internet -->|Inbound| Edge
    Edge --> AppA
    Edge --> AppB
    AppA --> DataA
    AppB --> DataB
    AppA -. blocked .-> AppB
    AppB -. blocked .-> AppA
    style AppA fill:#e3f2fd
    style AppB fill:#e8f5e9
```

In this design:

  • AppA compromise does not grant visibility or reachability into AppB
  • Isolation of AppA:
    • Touches one subnet or NSG
    • Does not affect AppB traffic paths
  • Defender action is local, not global

The topology isn’t clever.
The failure mode is.
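A sketch of how that topology might be laid down, assuming the NSGs already exist; names and prefixes are illustrative. The point is the shape: each role owns its subnet and its NSG, so isolating AppA means changing one association.

```bicep
// Hypothetical layout matching the diagram: one subnet and one NSG
// per role, so defender actions stay local.
resource appANsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' existing = {
  name: 'appA-nsg'
}
resource appBNsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' existing = {
  name: 'appB-nsg'
}

resource workloadVnet 'Microsoft.Network/virtualNetworks@2023-11-01' = {
  name: 'workload-vnet'
  location: resourceGroup().location
  properties: {
    addressSpace: { addressPrefixes: ['10.10.0.0/16'] }
    subnets: [
      {
        name: 'edge'
        properties: { addressPrefix: '10.10.0.0/24' }
      }
      {
        name: 'appA'
        properties: {
          addressPrefix: '10.10.1.0/24'
          networkSecurityGroup: { id: appANsg.id } // AppA's own fire door
        }
      }
      {
        name: 'appB'
        properties: {
          addressPrefix: '10.10.2.0/24'
          networkSecurityGroup: { id: appBNsg.id }
        }
      }
    ]
  }
}
```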

NSGs as containment brakes, not hygiene

An NSG excerpt enforcing explicit east‑west denial:

```bicep
resource nsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: 'appA-nsg'
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'Allow-Edge-In'
        properties: {
          priority: 100
          direction: 'Inbound'
          access: 'Allow'
          protocol: 'Tcp'
          sourcePortRange: '*'
          sourceAddressPrefix: '10.10.0.0/24' // Edge subnet
          destinationAddressPrefix: '*'
          destinationPortRange: '443'
        }
      }
      {
        name: 'Deny-VNet-In'
        properties: {
          priority: 4096
          direction: 'Inbound'
          access: 'Deny'
          protocol: '*'
          sourcePortRange: '*'
          sourceAddressPrefix: 'VirtualNetwork'
          destinationAddressPrefix: '*'
          destinationPortRange: '*'
        }
      }
    ]
  }
}
```

This rule doesn’t assume workloads behave.
It assumes one of them eventually won’t.

Shared Services: Where Containment Quietly Dies

Most containment failures aren’t caused by app tiers; they’re caused by shared infrastructure.

DNS is the usual culprit.

A single, shared DNS resolver reachable from every subnet becomes:

  • A universal discovery service
  • An implicit trust bridge
  • Often, a stepping stone toward management planes or legacy systems

During incidents, DNS is rarely locked down first.
Which means once a workload is compromised, name resolution collapses isolation even if IP‑level rules look sound.

If every subnet can talk to the same DNS endpoint:

  • You don’t have independent containment zones
  • You have one network with polite boundaries

Containment requires shared services to be deliberately constrained, not just centrally managed.
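One way to express that constraint is at the workload edge rather than at the resolver. A sketch, assuming a dedicated resolver at 10.10.8.4 (a hypothetical address): workloads may resolve names through that one endpoint, and nothing else in the VNet is reachable outbound by default.

```bicep
// Hypothetical outbound NSG: DNS is allowed only to one resolver
// endpoint, and all other east-west traffic is explicitly denied.
resource appAOutbound 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: 'appA-outbound-nsg'
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'Allow-DNS-To-Resolver'
        properties: {
          priority: 100
          direction: 'Outbound'
          access: 'Allow'
          protocol: '*' // DNS uses both UDP and TCP on 53
          sourcePortRange: '*'
          sourceAddressPrefix: '*'
          destinationAddressPrefix: '10.10.8.4/32' // resolver only
          destinationPortRange: '53'
        }
      }
      {
        name: 'Deny-VNet-Out'
        properties: {
          priority: 4096
          direction: 'Outbound'
          access: 'Deny'
          protocol: '*'
          sourcePortRange: '*'
          sourceAddressPrefix: '*'
          destinationAddressPrefix: 'VirtualNetwork'
          destinationPortRange: '*'
        }
      }
    ]
  }
}
```

The resolver itself still needs its own onward restrictions, but at least a compromised workload can no longer use DNS reachability as a map of the estate.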

Gotchas & Edge Cases

  • Service tags hide scope creep
    VirtualNetwork expands as you add subnets. Containment erodes quietly.

  • Operational convenience pressures boundaries
    If every exception weakens isolation, the design is already too brittle.

  • Management paths matter
    Backup agents, patching, and diagnostics are often the first lateral paths attackers exploit.
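The service-tag gotcha has a direct mitigation: pin rules to the subnets they were written for instead of the elastic VirtualNetwork tag. A rule excerpt as a sketch, with an illustrative prefix and a hypothetical backup subnet:

```bicep
// Instead of sourceAddressPrefix: 'VirtualNetwork', which silently
// grows as subnets are added, name the intended sources explicitly.
{
  name: 'Allow-Backup-In'
  properties: {
    priority: 200
    direction: 'Inbound'
    access: 'Allow'
    protocol: 'Tcp'
    sourcePortRange: '*'
    sourceAddressPrefixes: ['10.10.5.0/26'] // backup subnet only
    destinationAddressPrefix: '*'
    destinationPortRange: '443'
  }
}
```

When the VNet grows, this rule does not grow with it. That is the point.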

Best Practices (Containment‑Biased)

  • Design subnets as discrete isolation units, not convenience buckets
  • Prefer explicit allow paths over clever deny logic
  • Make isolation actions fast, local, and reversible
  • Treat route tables as incident controls, not static plumbing
  • Answer the containment question continuously, not once
🍺 Brewed Insight: If your network only works when everything behaves, it’s not defensive, it’s optimistic.
Containment is what turns inevitable failure into survivable failure.

Learn More