Designing for Containment Over Prevention

Why slowing an attacker down matters more than keeping them out, and how Azure networks can actually help once prevention fails.

Most Azure networks are designed like a pub with a strict door policy:
check IDs, keep trouble out, assume everyone inside behaves.

It works right up until it doesn’t. And when it fails, the network usually has nothing useful left to say.

Assume‑breach changes the job of networking entirely.
Your network isn’t there to stop the first compromise. It’s there to stop the second, third, and fourth.

The Mental Model

The common assumption

If we get prevention right, with private endpoints, no public IPs, and hardened ingress, then lateral movement becomes unlikely.

Why it breaks

Once an attacker gains a foothold anywhere inside the VNet, most Azure networks quietly revert to trust‑by‑proximity:

  • Large, shared address spaces
  • East‑west traffic implicitly allowed
  • Shared “infrastructure” services reachable from everywhere
  • Few boundaries that can be tightened quickly

At that point, prevention has already failed.
Containment is what decides whether the blast radius stays small or becomes organisational.
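Trust-by-proximity is usually nobody’s decision; it is what a VNet does by default. A minimal sketch of the anti-pattern, with illustrative names and ranges: no NSGs are attached, so Azure’s built-in AllowVnetInBound rule leaves every subnet reachable from every other.

```bicep
// Hypothetical "flat" VNet. With no NSGs attached, the platform's
// default AllowVnetInBound rule permits all east-west traffic.
resource flatVnet 'Microsoft.Network/virtualNetworks@2023-11-01' = {
  name: 'flat-vnet'
  location: resourceGroup().location
  properties: {
    addressSpace: {
      addressPrefixes: ['10.0.0.0/16'] // one large shared space
    }
    subnets: [
      { name: 'apps', properties: { addressPrefix: '10.0.1.0/24' } }
      { name: 'data', properties: { addressPrefix: '10.0.2.0/24' } }
      { name: 'infra', properties: { addressPrefix: '10.0.3.0/24' } }
    ]
  }
}
```

Nothing here is misconfigured. It is simply a network that has never been asked to contain anything.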

How It Really Works

Azure networking enforces reachability, not intent.

If a packet is routable and allowed by rules, Azure will move it consistently, quickly, and without judgement. There is no native concept of:

  • “This workload is untrusted”
  • “This subnet is compromised”
  • “This path should only exist during normal operations”

Containment emerges only when you design friction into the graph:

  • Boundaries that limit who can talk to whom
  • Paths that are explicit, narrow, and easy to revoke
  • Shared services that don’t silently collapse isolation

Think less about moats.
Think about fire doors that actually close.

Containment as a Measurable Outcome

If containment matters, it has to be testable without an incident.

Here’s a blunt, design‑time metric:

From any compromised workload, how many distinct network trust boundaries can it reach without a change request?

Not “in theory”.
Not “with enough effort”.
Right now, as deployed.

  • If the answer is most subnets in the VNet → containment is weak
  • If the answer is only its own role and one or two justified dependencies → containment exists
  • If answering requires trawling NSGs, routes, and diagrams → containment is unknowable (also a failure)

This isn’t a SOC metric.
It’s an architectural smell test.

Real‑World Impact

Designing for containment changes what “good” looks like.

Blast radius becomes explicit

A compromised workload should expose:

  • One role
  • One isolation unit
  • One small set of outbound paths

If compromise meaningfully increases reachability, your network is amplifying the incident.

Time becomes a defensive control

Every boundary:

  • Slows reconnaissance
  • Forces noisier behaviour
  • Buys response time without relying on alerts

Containment isn’t binary, it’s cumulative friction.

Defenders regain leverage

Good containment designs provide fast levers:

  • Subnets you can isolate without redeploying
  • Routes you can blackhole surgically
  • Rules you can tighten without breaking unrelated systems

If isolation requires redesign, it won’t happen under pressure.
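One such lever can be kept pre-built. A sketch of a quarantine route table, with hypothetical names: associating it with a single subnet blackholes that subnet’s outbound traffic, and disassociating it restores normal paths, so the action is local and reversible.

```bicep
// Hypothetical pre-staged quarantine route table. Attaching it to one
// subnet's routeTable property drops all traffic leaving that subnet,
// without redeploying workloads or touching other subnets.
resource quarantineRoutes 'Microsoft.Network/routeTables@2023-11-01' = {
  name: 'rt-quarantine'
  location: resourceGroup().location
  properties: {
    routes: [
      {
        name: 'blackhole-all'
        properties: {
          addressPrefix: '0.0.0.0/0'
          nextHopType: 'None' // silently drop everything
        }
      }
    ]
  }
}
```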

Implementation Examples

Segmentation designed for incidents, not diagrams

This isn’t about tier purity.
It’s about what happens after AppA is compromised.

```mermaid
flowchart LR
    Internet -->|Inbound| Edge
    Edge --> AppA
    Edge --> AppB
    AppA --> DataA
    AppB --> DataB
    AppA -. blocked .-> AppB
    AppB -. blocked .-> AppA
    style AppA fill:#e3f2fd
    style AppB fill:#e8f5e9
```

In this design:

  • AppA compromise does not grant visibility or reachability into AppB
  • Isolation of AppA:
    • Touches one subnet or NSG
    • Does not affect AppB traffic paths
  • Defender action is local, not global

The topology isn’t clever.
The failure mode is.
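A sketch of how that topology might be laid down, assuming the NSGs already exist; names and prefixes are illustrative. The point is the shape: each role owns its subnet and its NSG, so isolating AppA means changing one association.

```bicep
// Hypothetical layout matching the diagram: one subnet and one NSG
// per role, so defender actions stay local.
resource appANsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' existing = {
  name: 'appA-nsg'
}
resource appBNsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' existing = {
  name: 'appB-nsg'
}

resource workloadVnet 'Microsoft.Network/virtualNetworks@2023-11-01' = {
  name: 'workload-vnet'
  location: resourceGroup().location
  properties: {
    addressSpace: { addressPrefixes: ['10.10.0.0/16'] }
    subnets: [
      {
        name: 'edge'
        properties: { addressPrefix: '10.10.0.0/24' }
      }
      {
        name: 'appA'
        properties: {
          addressPrefix: '10.10.1.0/24'
          networkSecurityGroup: { id: appANsg.id } // AppA's own fire door
        }
      }
      {
        name: 'appB'
        properties: {
          addressPrefix: '10.10.2.0/24'
          networkSecurityGroup: { id: appBNsg.id }
        }
      }
    ]
  }
}
```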

NSGs as containment brakes, not hygiene

An NSG excerpt enforcing explicit east‑west denial:

```bicep
resource nsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: 'appA-nsg'
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'Allow-Edge-In'
        properties: {
          priority: 100
          direction: 'Inbound'
          access: 'Allow'
          protocol: 'Tcp'
          sourcePortRange: '*'
          sourceAddressPrefix: '10.10.0.0/24' // Edge subnet
          destinationAddressPrefix: '*'
          destinationPortRange: '443'
        }
      }
      {
        name: 'Deny-VNet-In'
        properties: {
          priority: 4096
          direction: 'Inbound'
          access: 'Deny'
          protocol: '*'
          sourcePortRange: '*'
          sourceAddressPrefix: 'VirtualNetwork'
          destinationAddressPrefix: '*'
          destinationPortRange: '*'
        }
      }
    ]
  }
}
```

This rule doesn’t assume workloads behave.
It assumes one of them eventually won’t.

Shared Services: Where Containment Quietly Dies

Most containment failures aren’t caused by app tiers; they’re caused by shared infrastructure.

DNS is the usual culprit.

A single, shared DNS resolver reachable from every subnet becomes:

  • A universal discovery service
  • An implicit trust bridge
  • Often, a stepping stone toward management planes or legacy systems

During incidents, DNS is rarely locked down first.
Which means once a workload is compromised, name resolution collapses isolation even if IP‑level rules look sound.

If every subnet can talk to the same DNS endpoint:

  • You don’t have independent containment zones
  • You have one network with polite boundaries

Containment requires shared services to be deliberately constrained, not just centrally managed.
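One way to express that constraint is at the workload edge rather than at the resolver. A sketch, assuming a dedicated resolver at 10.10.8.4 (a hypothetical address): workloads may resolve names through that one endpoint, and nothing else in the VNet is reachable outbound by default.

```bicep
// Hypothetical outbound NSG: DNS is allowed only to one resolver
// endpoint, and all other east-west traffic is explicitly denied.
resource appAOutbound 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: 'appA-outbound-nsg'
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'Allow-DNS-To-Resolver'
        properties: {
          priority: 100
          direction: 'Outbound'
          access: 'Allow'
          protocol: '*' // DNS uses both UDP and TCP on 53
          sourcePortRange: '*'
          sourceAddressPrefix: '*'
          destinationAddressPrefix: '10.10.8.4/32' // resolver only
          destinationPortRange: '53'
        }
      }
      {
        name: 'Deny-VNet-Out'
        properties: {
          priority: 4096
          direction: 'Outbound'
          access: 'Deny'
          protocol: '*'
          sourcePortRange: '*'
          sourceAddressPrefix: '*'
          destinationAddressPrefix: 'VirtualNetwork'
          destinationPortRange: '*'
        }
      }
    ]
  }
}
```

The resolver itself still needs its own onward restrictions, but at least a compromised workload can no longer use DNS reachability as a map of the estate.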

Gotchas & Edge Cases

  • Service tags hide scope creep
    VirtualNetwork expands as you add subnets. Containment erodes quietly.

  • Operational convenience pressures boundaries
    If every exception weakens isolation, the design is already too brittle.

  • Management paths matter
    Backup agents, patching, and diagnostics are often the first lateral paths attackers exploit.
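The service-tag gotcha has a direct mitigation: pin rules to the subnets they were written for instead of the elastic VirtualNetwork tag. A rule excerpt as a sketch, with an illustrative prefix and a hypothetical backup subnet:

```bicep
// Instead of sourceAddressPrefix: 'VirtualNetwork', which silently
// grows as subnets are added, name the intended sources explicitly.
{
  name: 'Allow-Backup-In'
  properties: {
    priority: 200
    direction: 'Inbound'
    access: 'Allow'
    protocol: 'Tcp'
    sourcePortRange: '*'
    sourceAddressPrefixes: ['10.10.5.0/26'] // backup subnet only
    destinationAddressPrefix: '*'
    destinationPortRange: '443'
  }
}
```

When the VNet grows, this rule does not grow with it. That is the point.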

Best Practices (Containment‑Biased)

  • Design subnets as discrete isolation units, not convenience buckets
  • Prefer explicit allow paths over clever deny logic
  • Make isolation actions fast, local, and reversible
  • Treat route tables as incident controls, not static plumbing
  • Answer the containment question continuously, not once
🍺 Brewed Insight: If your network only works when everything behaves, it’s not defensive, it’s optimistic.
Containment is what turns inevitable failure into survivable failure.

Learn More