Every number on this page comes from a real production MuleSoft deployment. This is what we found when we cracked open an enterprise integration estate — and the playbook we used to move it off MuleSoft for good.
The subject of this case study is a mid-market direct-to-consumer brand in the consumer accessories space, with annual revenue in the $100M–$300M range. Like many D2C companies that scaled past their first hundred million, they run an omnichannel operation: their own e-commerce storefront, major US mass retailers (through a full EDI partner network), Amazon, international distribution through 3PLs in Europe, Japan, and China, and their own direct-sale Chinese marketplace presence.
Integration is the connective tissue of this business. Orders flow from a dozen sources into a single OMS. Product data flows out from a PLM system to every sales channel. Warehouse events flow back from five different 3PLs. EDI documents flow to and from every major US retailer. When any of those pipes breaks, somebody somewhere stops being able to ship product.
For years, MuleSoft was that connective tissue. The company had been on the Anypoint Platform since the mid-2010s, running a mature, three-layer API-led architecture across twelve applications on CloudHub 1.0. By 2026, the bill had quietly climbed past $350,000 a year.
No one wakes up one morning and decides to migrate off a working integration platform for fun. The decision came from three pressures stacking on top of each other.
Their MuleSoft renewal had a 6% annual escalation clause baked into the contract. The original deal had been signed at a number they could swallow. Five renewals later, the annual spend had compounded into territory that made leadership uncomfortable every time they looked at the P&L. A $350,000 annual line item for integration — on a deployment running just 3 production vCores — is the kind of math that gets attention during budget season.
For context on where that price lands: see our full MuleSoft pricing breakdown. $350K is inside the typical range for a mid-size deployment once you layer in vCore licensing, Anypoint Platform tier, Anypoint MQ, premium connectors, and support. In this case, the biggest cost drivers were Anypoint MQ and a Platinum-tier subscription that had been recommended by a Salesforce sales rep three renewals ago and never re-evaluated.
When a senior integration engineer left, the replacement search took months. MuleSoft developers command a $20,000–$30,000 premium over equivalent Java engineers, and the pool is small enough that every posting turned into a bidding war. Every new hire also came with a multi-week ramp on Anypoint Studio, DataWeave, and CloudHub deployment quirks — proprietary knowledge that doesn't transfer to anything else. We wrote about this dynamic in more depth in the hidden cost of MuleSoft developer salaries.
Every new flow written in DataWeave, every queue added to Anypoint MQ, every flow-ref to the shared framework library made the next migration more expensive. The team could feel it: the longer they waited, the harder leaving would be. "We should really look at alternatives" had been on the roadmap for three years. By the time we got involved, the honest question wasn't whether to migrate — it was whether the migration could still be done before the next renewal without another seven-figure commitment.
Before writing a single line of Apache Camel, we spent the first phase of the engagement doing a full audit of the Mule estate. The goal was to produce a complete inventory — every flow, every connector, every external dependency — so we could build a migration plan grounded in the actual codebase instead of wishful thinking.
What follows are the real numbers from that audit. Every figure is pulled directly from the running production system: the Mule XML codebase, CloudHub runtime configuration, Datadog log metrics, and the production audit database.
The deployment consisted of twelve Mule applications organized around MuleSoft's reference three-layer architecture: Experience APIs on top, Process APIs in the middle, System APIs at the bottom. Five business domains (E-commerce, Fulfillment, Orders, Product, ERP) with one app per layer per domain, plus a shared framework library used by all of them.
| Application | Flows | Sub-flows | Total |
|---|---|---|---|
| E-commerce Process | 36 | 61 | 97 |
| E-commerce System | 50 | 30 | 80 |
| Fulfillment Process | 30 | 46 | 76 |
| Product Process | 27 | 41 | 68 |
| Fulfillment Experience | 37 | 25 | 62 |
| Fulfillment System | 16 | 21 | 37 |
| Orders Process | 10 | 26 | 36 |
| ERP System | 17 | 9 | 26 |
| OMS System | 14 | 12 | 26 |
| Product System | 11 | 9 | 20 |
| Marketplace Process (empty) | 0 | 0 | 0 |
| Marketplace System (empty) | 0 | 0 | 0 |
| Shared framework library | 16 | — | 16 |
| Total | 264 | 280 | 544 |
A note on the two empty applications: Marketplace Process and Marketplace System contained zero flows and zero code by the time we got there, but they were still being packaged, deployed, and run with two workers each. They had been scaffolded years earlier for a marketplace initiative that got shelved, and nobody had ever removed them. They were consuming real vCores and generating real CloudHub bills. This turns out to be more common than you'd expect, and we'll come back to it in the gotchas section.
Of the 248 top-level flows in the business applications (the 264 total minus the shared framework library's 16), only 107 had message sources of their own — the rest were reachable only via flow-ref from another flow. That gave us a clean picture of the real entry points into the system:
| Trigger type | Count | Share |
|---|---|---|
| HTTP listener (sync API endpoints) | 39 | 36% |
| Anypoint MQ subscriber (event-driven) | 33 | 31% |
| Scheduled (cron-style, fixed frequency) | 32 | 30% |
| SFTP / file listener (continuous watcher) | 3 | 3% |
The mix is informative: 36% of the workload is synchronous request/response (mostly inbound REST APIs), 64% is asynchronous or scheduled. That ratio matters for migration planning because each class of trigger has different replacement patterns in Apache Camel. Sync HTTP endpoints are the easiest — they become Camel REST DSL routes. Scheduled jobs become Camel timer or quartz components. File watchers become SFTP consumers. Anypoint MQ subscribers are by far the hardest because they require replacing the underlying message bus, which we'll cover below.
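The file-watcher class is the simplest to reason about: underneath, it reduces to a polling consumer. Camel's SFTP component handles the polling, idempotent tracking, and post-processing moves for you, but the underlying pattern is worth seeing in plain Java. This is a sketch, not the production code — the local directory layout and the `done` folder convention are illustrative assumptions (Camel's file consumers implement the same idea via their `move` option):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PollingFileConsumer {
    // One poll cycle: process every regular file in the inbox, then move it
    // to a "done" directory so the next poll doesn't pick it up again.
    public static List<String> poll(Path inbox, Path done, Consumer<String> handler)
            throws IOException {
        List<String> processed = new ArrayList<>();
        Files.createDirectories(done);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inbox)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) continue;      // skip the done/ dir itself
                handler.accept(Files.readString(file));        // business logic goes here
                Files.move(file, done.resolve(file.getFileName()),
                           StandardCopyOption.REPLACE_EXISTING); // mark as consumed
                processed.add(file.getFileName().toString());
            }
        }
        return processed;
    }
}
```

The second poll over the same inbox finds nothing, which is exactly the at-most-once pickup behavior a Mule SFTP listener provides — and exactly the property you have to verify survives the migration.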
Three production vCores, spread across twelve deployed applications — ten active plus the two empty scaffolds:
| Application | Workers | Size | vCores |
|---|---|---|---|
| E-commerce Process | 2 | Small | 0.4 |
| Fulfillment Experience | 2 | Small | 0.4 |
| Orders Process | 2 | Small | 0.4 |
| Product Process | 1 | Small | 0.2 |
| E-commerce System | 2 | Micro | 0.2 |
| ERP System | 2 | Micro | 0.2 |
| Fulfillment Process | 2 | Micro | 0.2 |
| Fulfillment System | 2 | Micro | 0.2 |
| OMS System | 2 | Micro | 0.2 |
| Product System | 2 | Micro | 0.2 |
| Marketplace Process (empty) | 2 | Micro | 0.2 |
| Marketplace System (empty) | 2 | Micro | 0.2 |
| Production total | | | 3.0 |
Every application ran at least two workers for high availability, even the two empty scaffolds. Across all four environments (development, QA, UAT, production), the total vCore subscription was roughly 12.
Sixteen distinct external systems, grouped into ten categories:
| Category | Systems |
|---|---|
| ERP | Microsoft Dynamics NAV, four separate company instances (North America, a promotions subsidiary, Europe BV, Japan), each with its own SOAP and OData endpoints. |
| E-commerce platform | Salesforce Commerce Cloud (Demandware), with two WebDAV environments. |
| 3PLs / WMS | Five distinct third-party logistics providers, covering North America, LATAM (Mexico), Europe, and global ocean freight, plus the brand's own internal warehouse SFTP. |
| EDI partners | SPS Commerce as the EDI broker, with individual trading-partner rules for every major US mass retailer the brand sold into. |
| Chinese e-commerce | Tmall/Taobao (via an intermediate SFTP bridge), Guanyi CERP, and JackYun OMS — three overlapping systems reflecting the operational complexity of running D2C in China. |
| PLM / Product data | Arena PLM (event-driven via Anypoint MQ), and Plytix as an item-master feed via SFTP. |
| Marketplace APIs | A major US retailer's marketplace platform plus a legacy customization platform. |
| Order management | A custom Ruby-based OMS hosted externally by a development partner. |
| Tracking | AfterShip for customer-facing shipment tracking. |
| Database | MySQL, used as the EDI audit and tracking store. |
This inventory matters because it bounds the migration scope. We weren't migrating an abstract "integration platform" — we were migrating sixteen concrete pipes, each with its own protocol, auth model, retry semantics, and partner-specific quirks. Every one of them had to be rebuilt against the new stack.
We counted every connector operation in the Mule XML across all twelve apps. The distribution tells you a lot about the shape of the workload:
| Connector | Operations | Purpose |
|---|---|---|
| Database (MySQL) | 172 | INSERT, SELECT, UPDATE against the EDI audit schema |
| HTTP Requester | 129 | Outbound REST calls to NAV, SFCC, marketplace APIs, OMS, etc. |
| Anypoint MQ | 88 | 55 publishes + 33 subscribes across 63 distinct queues |
| SFTP | 73 | Reads, writes, moves, deletes across partner and internal SFTP |
| Web Service Consumer (SOAP) | 41 | NAV ERP, all 4 company instances, with custom codeunits |
| File | 19 | Local temp working files |
| JMS, FTP, Salesforce, S3, SMTP | 0 | Not used directly by the deployment |
Two findings from this inventory shaped the entire migration plan. First, this is a database-heavy and HTTP/SOAP-heavy workload — there's very little proprietary Mule magic in the connector use. Second, the 41 SOAP operations and the 88 Anypoint MQ operations are the two biggest risk areas: both represent deep vendor coupling that has to be unwound carefully.
The audit counted 780 DataWeave transformation scripts. Zero of them were externalized to .dwl files — every script lived inline in the Mule XML DSL, across 94 files.
This was the single biggest technical risk in the entire migration. DataWeave is MuleSoft's proprietary data transformation language, and every script has to be hand-ported during a migration off the platform. No automated tool exists that produces high-fidelity output — not for 780 scripts, not for ten. We covered the deeper reasons for this in our post on DataWeave vs. Java for transformation logic.
Most of the 780 scripts were straightforward format conversions: XML to JSON, JSON to database parameter arrays, HTTP response payloads to internal canonical models. These are tedious but mechanical — the kind of thing an experienced engineer can translate at a steady rate once they build up a library of conversion patterns.
The complicated ones are where the risk lives. The gnarliest single script was an EDI 856 Advance Ship Notice generator that joined three upstream documents (the original EDI 850 purchase order, the EDI 940 warehouse shipping order, and the EDI 945 shipping advice), plus a live shipping-status JSON from the carrier, to produce an SPS Commerce–formatted record-typed XML document. It was ~180 lines of DataWeave, and every line of it mattered: the EDI 856 spec is strict and an SPS Commerce rejection ripples back through the warehouse workflow in expensive ways.
Another example: an extractEdiMetadata transform that pivoted on the XML root element name of whatever document was passing through, and extracted different metadata paths depending on whether it was a purchase order, a warehouse shipping order, a shipping advice, an advance ship notice, an invoice, or an acknowledgement. That one script was being called from thirty different places in the codebase and had to be perfect.
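A Java port of that pivot pattern looks roughly like the sketch below. The document shapes and field names here are illustrative, not the real SPS Commerce schema — the point is the structure: parse once, switch on the root element name, extract type-specific metadata:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EdiMetadataExtractor {
    // Pivot on the XML root element name, the way the original DataWeave
    // script did. Element and field names are illustrative assumptions.
    public static Map<String, String> extract(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();
        switch (root.getTagName()) {
            case "PurchaseOrder":   // EDI 850
                return Map.of("docType", "850", "key", text(root, "PONumber"));
            case "ShipNotice":      // EDI 856
                return Map.of("docType", "856", "key", text(root, "ShipmentId"));
            default:
                // Fail loudly: an unknown document type must never pass silently.
                throw new IllegalArgumentException("Unknown EDI document: " + root.getTagName());
        }
    }

    private static String text(Element root, String tag) {
        return root.getElementsByTagName(tag).item(0).getTextContent();
    }
}
```

Because the real script had thirty call sites, the Java version was wrapped in a single class exactly like this, so every caller migrated against one implementation with one test suite.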
Plan on DataWeave porting being 40% to 60% of your total migration engineering effort on a deployment this size. Anyone who tells you otherwise has never done it.
We pulled 24-hour log event counts from Datadog across every production application to establish a realistic sense of scale:
| Application | 24h log events |
|---|---|
| E-commerce Process | 21,909 |
| Product Process | 19,022 |
| E-commerce System | 17,819 |
| OMS System | 10,076 |
| Orders Process | 7,341 |
| Fulfillment Experience | 5,820 |
| Fulfillment Process | 3,569 |
| Product System | 313 |
| ERP System | 0 |
| Fulfillment System | 0 |
| Total | 85,869 |
Each transaction typically generated 3 to 10 log lines (entry, transform, database write, exit, and an optional error handler). Dividing by 5 to 8 gives a working estimate of 10,000 to 17,000 integration transactions per day. For corroboration, the EDI audit table in MySQL contained 560,000 historical records accumulated from 2020 through 2026, split across six EDI document types.
Everything above is the kind of inventory you can build with tools and queries. But the reason this section exists — and the reason this case study is worth writing — is that there are a set of things that never show up in a vendor comparison article. These are the things you only find once you've opened the codebase and read it. They are the things that reliably blow up migration timelines when teams skip the audit phase and try to lift and shift.
Every one of the twelve applications depended on a single shared JAR: framework, version 1.2.9. It contained 16 sub-flows covering cross-cutting concerns — audit logging, error handling, Slack alert webhooks, queue-publishing helpers, standard system-connector config. It was the kind of internal library that gets built once, forgotten about, and then quietly becomes load-bearing.
The gotcha: any migration has to port this library first, or provide equivalent cross-cutting behavior before migrating any of the consuming applications. A team that migrates one app without porting the framework will discover that audit logging has disappeared, error handlers are swallowing exceptions silently, and Slack alerts aren't firing. That's the kind of failure mode that isn't obvious until something goes wrong in production — which is the worst possible time to discover it.
Configuration files across the estate used Mule's secure-properties module with Blowfish/CBC encryption. Every password, API key, database credential, and access token was encrypted with one symmetric key — and that key was hardcoded, in plain text, in the XML config. It was in every build artifact, every Maven repo, every developer's laptop, every Git history.
The gotcha: migrating off Mule requires decrypting hundreds of values, auditing whether the key was ever leaked (spoiler: if it's in your Maven history, it's been leaked), and rekeying every secret into the target system's secret manager. This isn't a migration task — it's a security incident that happens to look like a migration task. Most teams find out about this pattern during the migration audit, which is three years too late.
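The mechanical half of that rekeying step — bulk-decrypting values before re-encrypting them into the target secret manager — is straightforward JCA code. The cipher family (Blowfish/CBC) is what the estate's config declared; the padding scheme and explicit-IV handling below are assumptions to verify against the actual Mule secure-properties-tool output before trusting a bulk run:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class SecurePropertyCodec {
    // Blowfish/CBC with an explicit 8-byte IV. PKCS5 padding and the IV
    // convention are assumptions, not confirmed Mule behavior.
    public static String decrypt(String base64, String key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("Blowfish/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE,
                new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "Blowfish"),
                new IvParameterSpec(iv));
        return new String(cipher.doFinal(Base64.getDecoder().decode(base64)),
                StandardCharsets.UTF_8);
    }

    public static String encrypt(String plain, String key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("Blowfish/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "Blowfish"),
                new IvParameterSpec(iv));
        return Base64.getEncoder().encodeToString(
                cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The code is the easy part; the hard part is treating every decrypted value as compromised and rotating it at the source system, not just re-encrypting the old value under a new key.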
Eighty-eight total MQ operations, spread across sixty-three distinct queues and topics, with publish semantics, subscribe patterns, manual acknowledgement, and dead-letter-queue configurations that all had to be preserved behavior-for-behavior. Replacing Anypoint MQ meant picking a target bus (we chose Azure Service Bus for this deployment, for reasons we'll get to) and then redrawing the entire async topology — not just wiring the pipes but getting the retry, ack-timeout, and DLQ behavior to match.
The gotcha: Anypoint MQ is priced high, but it's also good, and its defaults are subtle. The exactly-once semantics, the visibility timeout on manual ack, the way messages are re-queued after a consumer failure — these behaviors feel automatic in Mule, but they are product decisions MuleSoft made for you. Replicating them on a new bus requires actually understanding what the old bus was doing, and most Mule teams have never had to think about it at that level. Budget real investigation time here, not just reimplementation time.
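To make those behaviors concrete, here is a minimal in-memory model of the peek-lock contract that had to be matched on the new bus — not production code, just an executable statement of the semantics: a received message is invisible while locked, reappears if the lock expires without an ack, and moves to a dead-letter store after a maximum delivery count. The model tracks a single in-flight message for brevity, and the max-delivery count of 3 is an illustrative default, not the deployment's actual setting:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class PeekLockQueue {
    public final Deque<String> main = new ArrayDeque<>();
    public final List<String> deadLetter = new ArrayList<>();
    private final int maxDeliveries;
    private String locked;      // message currently leased to a consumer
    private int deliveries;     // delivery attempts for the in-flight message

    public PeekLockQueue(int maxDeliveries) { this.maxDeliveries = maxDeliveries; }

    public void publish(String msg) { main.addLast(msg); }

    // Receive with peek-lock: the message leaves the queue but is not gone yet.
    public String receive() {
        if (locked != null || main.isEmpty()) return null;
        locked = main.pollFirst();
        deliveries++;
        return locked;
    }

    // Manual acknowledgement: only now is the message truly consumed.
    public void ack() { locked = null; deliveries = 0; }

    // Lock expired or consumer failed: requeue, or dead-letter past the limit.
    public void lockExpired() {
        if (locked == null) return;
        if (deliveries >= maxDeliveries) { deadLetter.add(locked); deliveries = 0; }
        else main.addFirst(locked);
        locked = null;
    }
}
```

Writing the contract down like this is what made the Azure Service Bus configuration reviewable: every line of the model maps to a setting (receive mode, lock duration, max delivery count) that had to be chosen deliberately rather than inherited.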
The 41 Web Service Consumer operations in this deployment all pointed at Microsoft Dynamics NAV, and they were spread across four distinct NAV company instances (one per operational region), each with its own WSDL, its own service instance, and its own NTLM-authenticated domain user. The endpoints weren't standard NAV APIs — they were custom codeunits (ExportInventory, MANInterface, AssemblyOrders) that had been built by the brand's NAV partner years earlier.
The gotcha: moving off Mule required a SOAP client on the target platform that could do NTLM over HTTPS with a stored Windows domain user, against four separate WSDL-generated type sets. Apache Camel can do this with the CXF component, but it's fiddly. We ended up building a small per-region NAV client wrapper on top of Apache CXF that handled the NTLM auth, the WSDL typing, and the custom codeunit calling conventions in one place. Plan on a week or two of pure plumbing here before any integration logic moves.
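The real implementation sat on Apache CXF, but the per-region shape of the wrapper can be sketched with pure JDK types. The endpoint host and user names below are illustrative assumptions (only the `ExportInventory` codeunit name comes from the actual deployment); the `DOMAIN\user` credential format is what NTLM expects, and both the JDK's `Authenticator` mechanism and CXF accept credentials in this shape:

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

public class NavRegionClient {
    // One instance per NAV company (NA, promotions subsidiary, EU, JP),
    // each with its own WSDL endpoint and NTLM domain user.
    public final String wsdlUrl;
    private final String domain;
    private final String user;
    private final char[] password;

    public NavRegionClient(String wsdlUrl, String domain, String user, String password) {
        this.wsdlUrl = wsdlUrl;
        this.domain = domain;
        this.user = user;
        this.password = password.toCharArray();
    }

    // NTLM expects DOMAIN\user as the principal.
    public PasswordAuthentication credentials() {
        return new PasswordAuthentication(domain + "\\" + user, password);
    }

    public Authenticator authenticator() {
        return new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return credentials();
            }
        };
    }
}
```

Centralizing the four regions behind one class like this is what kept the NTLM fiddliness out of the integration routes: the routes ask for a region, and the wrapper owns the WSDL, the domain user, and the auth negotiation.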
Inside the Mule apps were significant quantities of deprecated code that was no longer referenced by any live business process but was still being packaged, deployed, and running: sixteen Anypoint MQ queues for decommissioned EMEA and Japan distributor relationships, seventeen sub-flows for a marketplace channel that had been shut down two years earlier, six shipping-label queues tied to a carrier the brand no longer used, and complete flows for a customization platform that had been deprecated.
The gotcha: every one of these dead code paths still consumed memory, threads, and cognitive overhead during the audit. A clean migration is a great opportunity to leave dead code behind, but first someone has to audit every item to confirm no live process depends on it. This is tedious detective work and it goes faster if you pair it with business-side stakeholders who know which markets and partners are still active. Don't try to do it alone in the codebase.
Environment selection was controlled through Maven profiles. Deploying to production required mvn deploy -Pprod; deploying to staging required -Pstaging; and so on. The resulting artifact was environment-specific — the encrypted property values were literally baked into the JAR at build time. Combined with CloudHub's own worker-level property system, this created two completely different places where environment configuration lived, and tracking down "which setting is actually in effect" often required reading both the YAML config and the CloudHub runtime properties panel and then reasoning about which one wins.
The gotcha: this is one of those patterns that feels normal while you're inside it and insane once you leave. The modern expectation is that a single immutable artifact runs in every environment and gets its configuration from the runtime. Migrating away from build-time environment baking is a forcing function for adopting proper runtime configuration — which is good — but it's also a mindset shift, and every surface area that touched a Maven profile has to change.
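The target pattern is the opposite of Maven-profile baking: one immutable artifact, with configuration resolved from the runtime environment at startup. In the real stack this is Spring Boot's property resolution backed by Kubernetes secrets; a minimal pure-Java version of the lookup order (environment variable first, then a default) looks like this, with the map injected so the behavior is testable and the variable names illustrative:

```java
import java.util.Map;

public class RuntimeConfig {
    // Injected for testability; production passes System.getenv().
    private final Map<String, String> env;

    public RuntimeConfig(Map<String, String> env) { this.env = env; }

    // Same artifact in every environment; only the runtime environment differs.
    public String get(String key, String defaultValue) {
        String v = env.get(key);
        return (v == null || v.isBlank()) ? defaultValue : v;
    }
}
```

The discipline this enforces is the point: if a value isn't in the runtime environment, the artifact falls back to a safe local default instead of silently carrying a production secret baked in at build time.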
As mentioned earlier, two of the twelve applications (Marketplace Process and Marketplace System) contained zero flows. Not "some flows" — literally zero. They were empty scaffolds from a multi-year-old initiative that got paused and never resumed. But they were still packaged into the build, still deployed to CloudHub, and still running on two Micro workers each, 24/7, for years.
The gotcha: these two apps alone were consuming 0.4 vCores of paid CloudHub capacity, doing nothing. The yearly cost of running them was into the five figures. Nobody had ever questioned it because "the deployment pipeline includes them" was the default state and removing them would have required a conversation with someone about whether the initiative was coming back. The lesson here isn't technical — it's organizational. Every long-running enterprise integration estate has some version of this.
The primary EDI processing flow lived in one well-known scheduled job in the E-commerce Process app, gated by an initial_state.edi property. Every engineer on the team knew about it. What they didn't know about — and what we only found by reading the Fulfillment Process app carefully — was a sub-flow called writeToPsSFTP that wrote EDI 856 Advance Ship Notice files directly to the internal SFTP server. It was being called from a completely different scheduled job (job_consumer-sync-shipped-b2b-orders-to-erp) gated by a completely different property (initial_state.b2b_orders).
The gotcha: a migration team looking for "the EDI flag" would turn off initial_state.edi, believe EDI was fully off, and then be baffled when 856 files kept getting written by the other job. This is the kind of thing you cannot find with a high-level architecture diagram. You find it by reading the code, and you find it specifically by looking for every caller of an SFTP write, not just the obvious EDI flow. We caught this one during the audit, but only because we knew to look.
The baseline was $350,000 a year, before escalation clauses. At 6% compounding annually, the three-year projected cost through the next renewal was over $1.1M.
Broken down against the 3 production vCores and the rest of the stack, this works out to roughly $117,000 per production vCore per year — which, once you include Anypoint MQ, the Platinum-tier platform subscription, premium connector licenses, and support, lands squarely inside the ranges we documented in our MuleSoft pricing breakdown. It's not an outlier. This is what a real mid-market deployment costs when you add up all the line items.
Across the 528 flow units in the deployed applications (excluding the shared library's 16), that's approximately $663 per flow per year just to run the MuleSoft platform. The company was paying MuleSoft roughly $0.07 per integration transaction processed, before counting any of the downstream database or partner costs.
A five-year projected total, without reducing scope and assuming the 6% escalation clause continued, came out to approximately $1.97 million. That's the counterfactual: what staying on MuleSoft for the next contract cycle would cost. The migration decision came down to whether the one-time engineering cost of moving could come in under the recurring savings within a reasonable payback window. Spoiler: it did, comfortably.
After the audit, we picked a target stack that could replace every capability MuleSoft was providing, using components we could run on infrastructure the team already understood. Here's what the replacement looked like:
| MuleSoft capability | Replacement |
|---|---|
| Mule Runtime (flows, EIP routing) | Apache Camel 4 + Spring Boot |
| CloudHub deployment model | Kubernetes (AKS) with GitOps via Argo CD |
| Anypoint Studio IDE | IntelliJ IDEA with the Camel plugin |
| DataWeave transformations | Java transformer classes (with Groovy for the trickier cases) |
| Anypoint MQ | Azure Service Bus for the 63 queues and topics |
| Anypoint Exchange connectors | Camel's 350+ built-in components, plus custom wrappers for NAV |
| Anypoint API Manager | Kong for gateway, rate limiting, and policy |
| Anypoint Monitoring | Grafana + Prometheus, with existing Datadog retained for logs |
| Object Store v2 | Redis for shared state |
We chose Azure Service Bus for the messaging replacement specifically because the brand's infrastructure was already on Azure for other workloads. If the starting point had been AWS, we'd have picked SQS and SNS. If it had been a team that genuinely wanted to own their own bus, we'd have looked at Kafka or RabbitMQ. The principle here matters more than the specific pick: replace Anypoint MQ with whatever managed message bus your existing cloud already offers, because the cost delta between a managed bus and a self-run one usually isn't worth the operational overhead for a team that's also rewriting flows at the same time.
The migration was phased by business domain, not by application. That decision mattered more than any technical choice we made. A domain-first migration lets you prove the entire vertical slice end-to-end before moving on to the next one, which gives you real confidence that the target stack can handle your specific patterns. An application-first migration (all Experience APIs first, then all Process APIs, then all System APIs) looks neater on a slide but leaves you stranded for months with half-finished verticals that can't be cut over cleanly.
Our phase order, from lowest risk to highest:
Each phase followed the same internal rhythm: build the new Camel routes in a feature branch, run them in parallel with the Mule version for a tunable observation window (typically two weeks), reconcile outputs, flip the traffic, and then decommission the Mule side once the new side had been stable for a full business cycle. Nothing was ever deleted on the Mule side until the new side had survived month-end close and a full weekly EDI cycle.
From $350K/year on MuleSoft to approximately $50K/year in Azure infrastructure, ops tooling, and managed services on the Camel stack.
The recurring cost of running the replacement stack came out to roughly $50,000 per year: AKS cluster capacity, Azure Service Bus throughput, Kong instances, monitoring tooling, and the small Azure PaaS footprint for supporting services. Against a $350,000 MuleSoft baseline, that's a ~$300,000 annual saving, before accounting for the escalation clause that would have made year two and year three worse.
The one-time engineering cost of the migration paid for itself inside the first year. Year two onward is pure savings. Multiplied across the five-year horizon we'd modeled before the project started, the total avoided cost is north of $1.4 million — and that's with conservative assumptions on the baseline.
Secondary outcomes worth calling out:
A few lessons from this specific migration that we'd hand to anyone about to start a similar project:
Our free assessment produces the same audit we did here: a full inventory of your Mule estate, a real cost baseline, and a migration plan grounded in your actual codebase. No sales pitch — just data.
Estimate Your Savings

Book a Free Assessment