Strangler Fig Pattern

Discover how the strangler fig pattern enables incremental migration from legacy systems to modern architectures by gradually routing traffic from old components to new ones until the legacy system can be safely decommissioned.

Overview

Named after the strangler fig tree that grows around a host tree and eventually replaces it, this pattern addresses one of the most challenging problems in software engineering: replacing a running legacy system without disruption. The naive approach -- a complete rewrite -- is widely reported to fail more often than it succeeds, with failure rates above 70% commonly cited. The rewrite team must rebuild years of accumulated features, edge cases, and undocumented business rules, all while the legacy system continues to evolve. By the time the rewrite is 'ready,' the target has moved, and the organization has burned months or years of engineering effort.

The strangler fig pattern takes a fundamentally different approach: incremental replacement. A facade layer (often an API gateway or reverse proxy) is placed in front of the legacy system. Initially, 100% of traffic flows through the facade to the legacy system unchanged. Then, one feature at a time, new implementations are built in the target architecture. The facade routes traffic for migrated features to the new system and un-migrated features to the legacy system. Users see a single system; behind the facade, the legacy system gradually shrinks as more features are migrated.
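
The routing decision the facade makes can be sketched in a few lines. This is a minimal illustration, not a production gateway: the backend URLs, the `MIGRATED_PREFIXES` table, and the `choose_backend` function are hypothetical names introduced here, and routing by URL path prefix is only one of the options a real facade might use.

```python
from typing import Dict, Optional

# Hypothetical backend addresses, used only for illustration.
LEGACY = "http://legacy.internal"
NEW = "http://new-services.internal"

# Path prefixes of features that have already been migrated.
# Anything not listed here keeps flowing to the legacy system.
MIGRATED_PREFIXES: Dict[str, str] = {
    "/checkout": NEW,
    "/search": NEW,
}


def choose_backend(path: str,
                   routes: Optional[Dict[str, str]] = None,
                   default: str = LEGACY) -> str:
    """Return the backend for a request path using longest-prefix matching."""
    table = MIGRATED_PREFIXES if routes is None else routes
    best = ""
    for prefix in table:
        # Match the prefix itself or anything below it, but not "/checkouts".
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return table[best] if best else default
```

With an empty table every request goes to the legacy system, which corresponds to the initial state in which 100% of traffic passes through unchanged. Migrating a feature then amounts to adding one entry.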

The migration proceeds through three phases for each feature: Transform (build the new implementation), Coexist (run both old and new in parallel, comparing results), and Eliminate (switch traffic to the new implementation and decommission the old code). During the Coexist phase, the facade can run both implementations simultaneously and compare outputs to verify correctness -- a technique called shadow testing or dark launching. If the new implementation produces different results, it is fixed before receiving real traffic. This dramatically reduces migration risk compared to a big-bang cutover.
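
The comparison step in the Coexist phase might look roughly like the sketch below. It assumes both implementations can be called as plain functions and that the shadowed call has no side effects; the `shadow_call` helper and the shape of the mismatch records are hypothetical, and a real facade would usually run the shadow call asynchronously so it cannot slow down the user-facing response.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], Any]


def shadow_call(request: Dict[str, Any],
                legacy: Handler,
                candidate: Handler,
                mismatches: List[Dict[str, Any]]) -> Any:
    """Serve the legacy response; run the candidate in shadow and record differences."""
    legacy_response = legacy(request)
    try:
        candidate_response = candidate(request)
    except Exception as exc:
        # A crash in the new implementation must never reach the user.
        mismatches.append({"request": request,
                           "legacy": legacy_response,
                           "error": repr(exc)})
        return legacy_response
    if candidate_response != legacy_response:
        mismatches.append({"request": request,
                           "legacy": legacy_response,
                           "candidate": candidate_response})
    # The user always sees the legacy result during coexistence.
    return legacy_response
```

An empty `mismatches` list over a representative sample of traffic is the signal that the feature is ready to move to the Eliminate phase.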

The pattern's power lies in its reversibility and incrementalism. Each migrated feature can be individually rolled back to the legacy implementation if issues arise, by changing the routing rules in the facade. The legacy system continues to serve un-migrated functionality indefinitely -- there is no deadline by which the migration must be 'complete.' This removes the schedule pressure that causes quality compromises in big-bang rewrites. Teams can prioritize migrating the highest-value or most-problematic features first, delivering value from the migration effort from week one rather than waiting months for a complete rewrite.
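
Per-feature reversibility comes down to keeping one routing decision per feature. The sketch below is an assumption about how such a table could be modelled in memory; the `FeatureRouter` class and its method names are hypothetical, and a real facade would persist this state in its configuration or a feature-flag service.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FeatureRouter:
    """Tracks which backend serves each feature; unknown features stay on legacy."""
    targets: Dict[str, str] = field(default_factory=dict)

    def cut_over(self, feature: str) -> None:
        # Send this feature's traffic to the new implementation.
        self.targets[feature] = "new"

    def roll_back(self, feature: str) -> None:
        # Undo the cutover for this feature only; others are unaffected.
        self.targets[feature] = "legacy"

    def backend_for(self, feature: str) -> str:
        return self.targets.get(feature, "legacy")
```

For example, after `cut_over("checkout")` and `cut_over("search")`, calling `roll_back("checkout")` returns checkout to the legacy system while search stays on the new one.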

Key Points
  • 1. A routing facade (API gateway, reverse proxy, or load balancer) sits in front of both old and new systems, directing traffic based on feature flags, URL paths, or header values. The facade is the single architectural prerequisite.
  • 2. Migration proceeds feature-by-feature, not all-at-once. Each feature goes through Transform (build new), Coexist (run both and compare), and Eliminate (switch traffic and decommission old). Features can be migrated in any order based on business priority.
  • 3. Shadow testing during the Coexist phase sends requests to both old and new implementations, comparing responses. Discrepancies are investigated and fixed before the new implementation receives real traffic, dramatically reducing migration risk.
  • 4. Rollback is trivial: change the routing rule in the facade to send traffic back to the legacy implementation. This per-feature rollback capability is impossible in a big-bang rewrite where the legacy system has been decommissioned.
  • 5. The legacy system continues to operate and can even receive new features for un-migrated areas. There is no deadline pressure -- the migration can proceed at whatever pace the team sustains.
  • 6. Data migration is often the hardest aspect. Both systems may need to read and write to the same datastore during coexistence, requiring careful schema management and dual-write strategies.
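
The dual-write strategy mentioned in the last point can be sketched as follows. This is one possible arrangement, not the only one: it assumes the legacy store remains the source of truth, that a failed write to the new store is queued for later reconciliation rather than failing the request, and it uses in-memory mappings as stand-ins for real databases. The `DualWriter` class and its members are hypothetical names.

```python
from typing import Dict, List, MutableMapping, Tuple


class DualWriter:
    """Writes to the legacy store first, then best-effort to the new store."""

    def __init__(self,
                 legacy_store: MutableMapping[str, dict],
                 new_store: MutableMapping[str, dict]) -> None:
        self.legacy_store = legacy_store
        self.new_store = new_store
        # Writes that reached legacy but not the new store, kept for replay.
        self.failed_writes: List[Tuple[str, dict]] = []

    def save(self, key: str, record: Dict) -> None:
        # Assumption: legacy is the source of truth, so an error here propagates.
        self.legacy_store[key] = dict(record)
        try:
            self.new_store[key] = dict(record)
        except Exception:
            self.failed_writes.append((key, dict(record)))

    def drift(self) -> List[str]:
        """Keys whose records differ between stores or are missing from the new one."""
        return sorted(k for k, v in self.legacy_store.items()
                      if self.new_store.get(k) != v)
```

Checking `drift()` periodically gives a rough measure of whether the two stores are still consistent before reads are switched over to the new one.
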
Simple Example

The Highway Bridge Replacement Analogy

When a city needs to replace an old bridge, it does not demolish the bridge and then build a new one (big-bang rewrite) -- that would leave traffic stranded for years. Instead, it builds the new bridge alongside the old one. First, one lane of traffic is moved to the new bridge while the old bridge still carries the remaining lanes. If the new lane has problems, traffic can be routed back to the old bridge. Once all lanes are verified on the new bridge, the old bridge is demolished. At every stage, traffic continues to flow, and each lane transfer can be independently rolled back.

Real-World Examples

Amazon

Amazon used the strangler fig pattern to migrate from its monolithic bookstore application to the microservices architecture that powers Amazon.com today. An API gateway routed requests to either the legacy monolith or new microservices based on the URL path. The product catalog, checkout, and recommendation features were migrated individually over several years. Each migration was validated through shadow testing before receiving production traffic.

Spotify

Spotify used the strangler fig pattern to migrate its backend from a Python monolith to Java microservices. An Nginx-based routing layer directed traffic by feature. The playlist service was migrated first, followed by search, then the social features. The migration took several years, but each migrated service delivered immediate performance improvements. The last Python component was decommissioned in 2015, five years after the migration began.

The Guardian (newspaper)

The Guardian migrated its content management and publishing platform from a legacy Java system to a modern Scala/Play-based architecture using the strangler fig pattern. An Nginx reverse proxy routed requests by URL path: migrated sections (e.g., /sports, /technology) went to the new system while un-migrated sections continued to be served by the legacy CMS. The migration proceeded section-by-section over two years with zero downtime and immediate SEO improvements for each migrated section.

Trade-Offs
| Aspect | Description |
|---|---|
| Risk vs Speed | The strangler fig pattern dramatically reduces migration risk through incremental, reversible steps. However, it takes longer than a big-bang rewrite would if the rewrite succeeded. The total elapsed time for a strangler fig migration is often 2-5x longer, but the probability of success is dramatically higher. |
| Coexistence Cost vs Migration Safety | During migration, both old and new systems must be maintained, monitored, and staffed. This dual-maintenance period increases operational cost. However, the ability to shadow-test and roll back individual features provides safety that a big-bang approach cannot match. |
| Facade Complexity vs Migration Flexibility | The routing facade adds an additional layer of complexity, latency (typically 1-5ms), and a potential single point of failure. However, it provides the fine-grained traffic control needed for per-feature migration, shadow testing, and instant rollback. |

Case Study

Legacy E-Commerce Platform Migration

Scenario

A mid-size retailer operated a 15-year-old PHP monolith serving 2 million daily users. The system was difficult to modify (deploys took 4 hours and 20% of them were rolled back), could not scale for peak events, and had accumulated 500,000 lines of undocumented business logic. Two previous full rewrite attempts had failed after 12 and 18 months respectively, consuming $4 million in engineering costs with nothing to show for it.

Solution

The team adopted the strangler fig pattern with an Envoy-based API gateway as the facade. They prioritized migration of the highest-pain features first: checkout (most bugs, highest business impact) and product search (worst performance). Each feature migration followed the three-phase approach: build the new implementation in Go microservices, shadow-test by running both implementations in parallel and comparing responses, then switch traffic. Data migration used a dual-write strategy where the new service wrote to both old and new databases during the coexistence period.

Outcome

After 18 months, 40% of features were migrated to the new architecture. Checkout latency dropped from 3 seconds to 400ms, and the deploy failure rate for migrated services fell to 2%. Each migrated feature delivered measurable business value immediately. The remaining 60% of features continued to run on the PHP monolith without any negative impact. The migration is projected to complete in another 18 months, but unlike the failed rewrites, every month of the strangler fig migration delivered production-ready improvements.

Common Mistakes
  • Trying to migrate the data layer and application layer simultaneously. Migrate the application logic first while both old and new systems share the same database, then migrate the data layer separately. Attempting both at once doubles the risk and complexity.
  • Not investing in the facade layer upfront. A well-designed routing facade with feature flags, shadow testing capability, and per-feature metrics is the foundation of the entire pattern. Skimping on facade infrastructure leads to risky, all-or-nothing cutovers that defeat the purpose of the pattern.
  • Migrating features in technical order instead of business priority. Start with the highest-value or highest-pain features so the migration delivers measurable business impact from month one. This maintains organizational support for the multi-year effort.
  • Letting the legacy system rot during migration. The legacy system must remain operational for un-migrated features, potentially for years. It still needs security patches, performance monitoring, and bug fixes. Neglecting it creates a ticking time bomb.
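
The per-feature metrics recommended above for the facade can start out very simple. The sketch below assumes the facade records one outcome per proxied request; the `FacadeMetrics` class, its method names, and the `"legacy"` / `"new"` backend labels are hypothetical, and a real deployment would export these counters to its monitoring system instead of holding them in memory.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FacadeMetrics:
    """Counts requests and errors per (feature, backend) pair."""

    def __init__(self) -> None:
        # (feature, backend) -> [request_count, error_count]
        self._counts: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, feature: str, backend: str, ok: bool) -> None:
        entry = self._counts[(feature, backend)]
        entry[0] += 1
        if not ok:
            entry[1] += 1

    def error_rate(self, feature: str, backend: str) -> float:
        """Fraction of failed requests; 0.0 when nothing has been recorded."""
        requests, errors = self._counts.get((feature, backend), (0, 0))
        return errors / requests if requests else 0.0

    def legacy_traffic_pct(self, feature: str) -> float:
        """Share of a feature's traffic still served by legacy; 0.0 with no traffic."""
        legacy = self._counts.get((feature, "legacy"), (0, 0))[0]
        new = self._counts.get((feature, "new"), (0, 0))[0]
        total = legacy + new
        return 100.0 * legacy / total if total else 0.0
```

Comparing the error rate of the new backend against the legacy one for the same feature is one straightforward way to decide whether a cutover should proceed or be rolled back.
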
Test Your Understanding

1. What is the key architectural component that enables the strangler fig pattern?

2. What is shadow testing in the context of the strangler fig pattern?
