When ‘Secure by Design’ Fails in Production

Why Azure network security erodes not through bad design, but through shared responsibility without shared accountability.

“Secure by design” is one of those phrases that sounds definitive.

It implies an end state: get the architecture right up front, and security becomes a property of the system rather than something you constantly fight.

In Azure production environments, that belief doesn’t just fail.
It quietly dissolves, one well‑intentioned change at a time.

Not because engineers forget how to design securely, but because the assumptions that made the design secure are nobody's job to preserve.

The Mental Model

Common assumption:
Security fails when controls are missing, misconfigured, or ignored.

Why it misleads:
In mature Azure estates, most controls exist.
The failure happens elsewhere: security assumptions decay across organisational and technical boundaries.

“Secure by design” assumes:

  • stable ownership
  • stable dependencies
  • stable trust boundaries

None of those survive scale.

How It Really Works

In enterprise Azure environments, ownership is divided by control plane, not by security outcome.

Typically:

  • Network teams own connectivity
    (VNets, peering, UDRs, Private Endpoints)
  • System administrators own operations
    (VM access, patching, agents)
  • DBAs own data platforms
    (SQL configuration, firewall rules)
  • Application teams own app behaviour and data usage
    (SDKs, outbound calls, feature flags)
  • Security teams own policy
    (standards, approvals, alerts, but not packet flow)
  • Microsoft owns the platform but not how you use it

Everyone owns something.
No one owns the security assumptions that exist between these domains.

Those assumptions include things like:

  • “This app cannot laterally reach other environments”
  • “All outbound traffic is inspected”
  • “Only intended consumers can reach this data store”

They are real.
They are critical.
And they are almost never explicitly owned.
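
None of them has a first‑class representation in Azure, but fragments can be encoded. A minimal Bicep sketch of pinning the first assumption down as an explicit deny rule rather than an unstated belief (the NSG name and address range are hypothetical):

resource appNsg 'Microsoft.Network/networkSecurityGroups@2023-09-01' = {
  name: 'app-subnet-nsg' // hypothetical: attached to the app subnet
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'deny-lateral-to-other-envs'
        properties: {
          priority: 4000 // low priority: exceptions must be added above it, visibly
          direction: 'Outbound'
          access: 'Deny'
          protocol: '*'
          sourceAddressPrefix: '*'
          sourcePortRange: '*'
          destinationAddressPrefix: '10.0.0.0/8' // hypothetical: the wider corporate address space
          destinationPortRange: '*'
        }
      }
    ]
  }
}

The rule is crude, but it moves the assumption somewhere a future change has to delete it deliberately instead of eroding it silently.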

Real‑World Impact

This is where “secure by design” fails in production.

Not through a single bad change, but through coordinated innocence:

  • A network team enables VNet peering “for integration”
  • A DBA adds a firewall exception “to unblock delivery”
  • An app team introduces a new dependency “for a feature”
  • A security team approves because policy conditions are met

Each change is defensible in isolation.
Collectively, they invalidate the original threat model.
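
Each of those changes looks like very little in a pull request. A hypothetical sketch of the DBA's exception (server name, rule name, and address range are illustrative):

resource sqlServer 'Microsoft.Sql/servers@2021-11-01' existing = {
  name: 'delivery-sql' // hypothetical existing server
}

resource unblockDelivery 'Microsoft.Sql/servers/firewallRules@2021-11-01' = {
  parent: sqlServer
  name: 'temp-unblock-delivery' // "temporary", with no expiry and no recorded owner
  properties: {
    startIpAddress: '203.0.113.0' // illustrative: "the integration partner's range"
    endIpAddress: '203.0.113.255'
  }
}

Nothing in that diff mentions the threat model it contradicts, so no reviewer is prompted to check it.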

At small scale, this works because:

  • people remember why things exist
  • blast radius is limited
  • informal review catches mistakes

At enterprise scale:

  • assumptions outlive the people who made them
  • undocumented dependencies become business‑critical
  • trust paths multiply faster than anyone models them

Security doesn’t fail loudly.
It drifts.

Implementation Examples

Consider a common shared‑services pattern.

Original design intent

graph TD
  A[App VNet] -->|Peering| B[Shared Services VNet]
  B --> C[Azure Firewall]
  C --> D[Internet]

Security assumption:
All app egress is inspected and controlled centrally.
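
In practice that assumption is implemented as a default route on every app subnet, something like this sketch (the route table name and firewall IP are hypothetical):

resource appRoutes 'Microsoft.Network/routeTables@2023-09-01' = {
  name: 'app-subnet-routes' // hypothetical
  location: resourceGroup().location
  properties: {
    routes: [
      {
        name: 'all-egress-via-firewall'
        properties: {
          addressPrefix: '0.0.0.0/0' // send everything to the firewall by default
          nextHopType: 'VirtualAppliance'
          nextHopIpAddress: '10.0.1.4' // hypothetical: Azure Firewall private IP
        }
      }
    ]
  }
}

The subtlety is what the route does not cover: routes created by VNet peering are more specific than 0.0.0.0/0, so traffic between peered VNets never reaches the firewall at all.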

Two years later

graph TD
  A[App VNet] -->|Peering| B[Shared Services VNet]
  A --> E[Another App VNet]
  E --> F[Private Endpoint]
  F --> G[PaaS Service]

Nothing here violates Azure’s rules.

But several assumptions are now false:

  • Traffic no longer consistently traverses the firewall
  • DNS resolution now determines reachability
  • App VNets can reach more than originally intended
  • Firewall logs no longer represent the full picture
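
The Private Endpoint is the clearest instance of the DNS point. A minimal sketch (parameter names and the endpoint name are hypothetical; the target is assumed to be Azure SQL):

param subnetId string // hypothetical: subnet that hosts the endpoint's NIC
param sqlServerId string // hypothetical: resource ID of the target PaaS service

resource sqlEndpoint 'Microsoft.Network/privateEndpoints@2023-09-01' = {
  name: 'sql-private-endpoint'
  location: resourceGroup().location
  properties: {
    subnet: { id: subnetId }
    privateLinkServiceConnections: [
      {
        name: 'sql-connection'
        properties: {
          privateLinkServiceId: sqlServerId
          groupIds: [ 'sqlServer' ] // the SQL sub-resource being exposed
        }
      }
    ]
  }
}

Once the endpoint's record lands in a shared privatelink DNS zone, any VNet linked to that zone with a route to the endpoint's subnet can reach the service, and none of that traffic appears in the central firewall's logs.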

A subtle but powerful example is VNet peering itself, completed below with its required remote network reference (resource names and the sharedVnetId parameter are illustrative):

resource appVnet 'Microsoft.Network/virtualNetworks@2023-09-01' existing = {
  name: 'app-vnet' // hypothetical app VNet
}

resource vnetPeering 'Microsoft.Network/virtualNetworks/virtualNetworkPeerings@2023-09-01' = {
  parent: appVnet
  name: 'app-to-shared'
  properties: {
    allowVirtualNetworkAccess: true // workloads in the peered VNets can reach each other directly
    allowForwardedTraffic: true // traffic forwarded by the peer, not just originated there, is accepted
    remoteVirtualNetwork: { id: sharedVnetId } // illustrative parameter: resource ID of the shared services VNet
  }
}

That single flag establishes permanent lateral trust unless explicitly revisited.

It is rarely reviewed again — because no team owns the assumption it creates.

How would this change something I design, deploy, or operate?
You stop treating connectivity as reversible plumbing and start treating it as a long‑lived security commitment.

Gotchas & Edge Cases

  • Private Endpoints don’t “break security” — they break assumptions
    Especially when routing, DNS, and inspection models were designed first
  • Logging creates false confidence
    Firewall and NSG flow logs only tell the truth about traffic they actually see
  • Policy enforces existence, not intent
    A rule can be compliant and still dangerous (sketched after this list)
  • Temporary exceptions become structural dependencies
    And nobody volunteers to remove them
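
"Compliant and still dangerous" is worth making concrete. Assume a hypothetical policy that only blocks inbound rules with an 'Internet' source; the sketch below passes it while permitting unrestricted lateral movement, because the VirtualNetwork service tag covers peered VNets too:

resource compliantNsg 'Microsoft.Network/networkSecurityGroups@2023-09-01' = {
  name: 'compliant-but-open' // hypothetical
  location: resourceGroup().location
  properties: {
    securityRules: [
      {
        name: 'allow-all-internal' // no 'Internet' source, so the policy check passes
        properties: {
          priority: 200
          direction: 'Inbound'
          access: 'Allow'
          protocol: '*'
          sourceAddressPrefix: 'VirtualNetwork' // expands to include every peered VNet
          sourcePortRange: '*'
          destinationAddressPrefix: '*'
          destinationPortRange: '*'
        }
      }
    ]
  }
}

The policy verified that a control exists; it never evaluated what the control was meant to achieve.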

Best Practices

Not prescriptive steps — but mindset shifts that survive scale:

  • Assume security assumptions will decay
    Design review is not a one‑time event
  • Treat network changes as security changes by default
    Even when raised by non‑security teams
  • Minimise implicit trust
    Especially lateral trust created by peering and shared DNS
  • Make connectivity decisions expensive to forget
    If you can’t explain why it exists, it shouldn’t be permanent; one way to enforce that is sketched below
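
One hypothetical way to make forgetting expensive, sketched in Bicep: refuse to deploy a peering without a written justification and review date, recorded as tags on the VNet. (Note that Microsoft.Resources/tags with name 'default' replaces the resource's whole tag set, so treat this as a sketch of the idea, not a drop‑in module.)

@minLength(20)
param justification string // forces a real sentence, not "temp"
param reviewBy string // hypothetical convention: ISO date when this must be revisited

resource appVnet 'Microsoft.Network/virtualNetworks@2023-09-01' existing = {
  name: 'app-vnet' // hypothetical
}

resource peeringIntent 'Microsoft.Resources/tags@2022-09-01' = {
  name: 'default'
  scope: appVnet // stamps the intent onto the VNet that holds the peering
  properties: {
    tags: {
      'peering-app-to-shared-justification': justification
      'peering-app-to-shared-review-by': reviewBy
    }
  }
}

Now the connectivity decision carries its own explanation, and the review date gives someone a standing reason to ask whether the peering should still exist.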
🍺 Brewed Insight: Secure‑by‑design fails not because no one owns security, but because no one is accountable for preserving security assumptions as the organisation fragments. In Azure, drift isn’t an exception — it’s the default behaviour.

Learn More

The failure mode described here doesn’t come from a single misconfiguration; it emerges from how Azure architectures divide responsibility across teams.

These references are useful not because they solve that problem, but because they implicitly assume it: