Why we stopped using n8n for complex automation
Category: AI Automation / Developer / Business owner
This isn't a hit piece on n8n. We used it happily for a long time, and on simpler projects we still do. But somewhere in the middle of a client engagement last year, we hit a wall that no amount of nodes, sub-workflows, or creative workarounds could get us past. That moment clarified something we'd been vaguely sensing for months: there's a ceiling on visual automation tools, and vendors don't advertise where it is.
Here's our version of what happened, and the framework we now use to decide when a workflow tool is the right call versus when you actually need an agent. An informed reader might object that digging even further into documentation and workarounds has its merits, but sometimes enough is enough.
n8n is genuinely good
Worth saying clearly before anything else. n8n is open-source, self-hostable, has a generous free tier, and the node library covers most of the integrations you'd actually want. For connecting two or three services with predictable inputs and outputs, it's hard to beat the speed at which you can stand something up. Zapier and Make operate in the same space with different trade-offs around pricing and flexibility, but the underlying model is the same: you draw a flowchart, you fill in credentials, it runs.
That model works brilliantly. Right up until it doesn't.
Where the flowchart starts pulling your leg
The first project where we noticed the cracks was a document intake pipeline for a professional services client. Incoming documents (contracts, invoices, correspondence) needed to be classified, routed to the right team, and summarised in a specific format depending on type. Sounds like a workflow problem. It wasn't.
The issue is that real documents are ambiguous. A document that looks like a contract might also contain invoice line items. A letter might need to go to legal, or to accounts, or to both, depending on what it says, not just what it's called or what email address it arrived from. In n8n, you can add IF nodes. You can add Switch nodes. You can build a deeply nested tree of conditional branches. We did, for a while. The workflow hit sixty-something nodes and became genuinely unreadable. More importantly, it was brittle. Every new document type meant another branch. Every exception to the rules meant another condition stacked on top of the existing ones.
IF node = "if this is true, go left; otherwise go right."
Switch node = "check the value, then pick one of several paths."
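In code terms, the branch tree we ended up with amounted to something like this. A simplified sketch with hypothetical document types and routing targets; the real workflow had sixty-odd nodes doing the same thing visually:

```python
def route_document(doc_type: str, sender: str, has_line_items: bool) -> str:
    """Flowchart-style routing: every rule is a hard-coded branch.
    Each new document type or exception means another branch here."""
    if doc_type == "contract":
        if has_line_items:  # contracts sometimes embed invoice data
            return "legal+accounts"
        return "legal"
    if doc_type == "invoice":
        return "accounts"
    if doc_type == "letter":
        if sender.endswith("@lawfirm.example"):
            return "legal"
        return "general"
    return "manual-review"  # everything else falls through
```

Every ambiguous document lands in the fall-through bucket or gets a new branch bolted on, which is exactly how the node count crept past sixty.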
What we actually needed wasn't more branches. We needed something that could read the document, reason about what it was, and make a judgment call. That's not a flowchart problem. That's an LLM problem.
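The shape we wanted instead, as a minimal sketch: one model call that makes the judgment, with code only validating the response. Here `call_llm` is a stand-in for whatever model client you use (OpenAI, Anthropic, a local model); it's stubbed with a canned reply so the sketch runs offline, and all names are illustrative:

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; stubbed so the sketch runs offline."""
    return json.dumps({"type": "contract",
                       "teams": ["legal", "accounts"],
                       "reason": "contract terms plus invoice line items"})

def classify(document_text: str) -> dict:
    prompt = (
        "Classify this document and decide which teams should see it.\n"
        "Reply as JSON with keys: type, teams, reason.\n\n" + document_text
    )
    result = json.loads(call_llm(prompt))
    # The model makes the judgment call; code only checks the shape.
    assert {"type", "teams", "reason"} <= result.keys()
    return result
```

The ambiguous contract-with-line-items case that broke the branch tree becomes just another answer the model can give, not a new branch.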
The second wall was memory and context
The same client wanted a follow-up system: if a document was flagged for review and nobody acted on it within 48 hours, escalate. Fine, that part is easy. But they also wanted the escalation message to reference the original document, summarise what was in it, and flag if it was related to any other open items. That last piece is where n8n stops helping.
Visual workflow tools are stateless by design. Each execution is its own thing. You can work around this by storing state in a database and reading it back in, and we did, but you end up writing the actual logic in code nodes anyway, at which point you're just using n8n as an awkward wrapper around JavaScript you'd rather write properly elsewhere, with version history and tests. The compounding problem is that once you start building context-awareness into a workflow, something like "this document relates to that contract, which is from this client, who last contacted us about X", you're essentially building a retrieval and reasoning layer by hand. That's not glue code anymore. That's an agent architecture written in the wrong tool.
Point of no return
The moment we knew we'd outgrown the tool was specific. We were trying to build a routing step that would read an incoming request, check it against a set of criteria that changed depending on client configuration, and decide which team to notify, preferably with a short explanation of why. The "why" was what caused the mayhem. There's no node for "generate a one-sentence rationale for a routing decision based on contextual factors." You can fake it with templates, but faking it means you've already decided what the output looks like before the model has a chance to tell you what the input actually is.
We were spending more time fighting the tool than building the flow. In hindsight, that alone was a clear sign.
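For contrast, the shape that routing step took once the model drove the decision: per-client criteria go into the prompt rather than into branch logic, and the rationale comes back alongside the answer. `call_llm` is again a stubbed stand-in for a real model client, and the config keys are illustrative:

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; stubbed so the sketch runs offline."""
    return json.dumps({
        "team": "legal",
        "rationale": "The request concerns contract termination, "
                     "which this client's criteria route to legal."})

def route(request_text: str, client_config: dict) -> dict:
    # Criteria vary per client, so they live in configuration and flow
    # into the prompt. New clients mean new config, not new branches.
    criteria = "\n".join(f"- {c}" for c in client_config["routing_criteria"])
    prompt = (
        f"Routing criteria for this client:\n{criteria}\n\n"
        f"Request:\n{request_text}\n\n"
        'Reply as JSON: {"team": "...", "rationale": "<one sentence>"}'
    )
    return json.loads(call_llm(prompt))
```

The rationale isn't a template filled in after the fact; it's part of the decision itself.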
What we moved to
We rebuilt the pipeline as a FastAPI service with an AI agent at the centre. The agent handles classification, routing logic, and summary generation. FastAPI handles the API layer, webhook ingestion, and integration with the client's existing tooling. The whole thing is a few hundred lines of Python, it's testable, it's version-controlled, and when the logic needs to change we change the code rather than screenshotting a node graph and trying to remember what that particular IF branch was for.
[We wrote about the FastAPI setup in detail here: link to FastAPI post.]
The honest tradeoff is that the initial build took longer than spinning up an n8n workflow would have. No-code tools are fast to prototype with! However, the maintenance burden on the n8n version would have been substantially higher, and the thing it couldn't do (adaptive, context-aware reasoning) was central to what the client actually needed.
A framework for the decision
Here's how we think about it now.
Stay with n8n (or Zapier, or Make) when:
Inputs and outputs are predictable and well structured
Routing logic can be expressed as a finite set of rules you'd be comfortable maintaining as a flowchart
The automation runs in the background and nobody needs to explain its decisions
You need something working by Friday and it doesn't need to be clever
Move to an agent-based architecture when:
You're handling natural language or unstructured documents as primary inputs
Routing or classification depends on understanding content, not just metadata
You need the system to explain its reasoning
You're building more than three or four levels of conditional logic and it's not getting cleaner
State and memory across executions matter to the output
The no-code tools aren't broken. They're just solving a different problem than the one you have at that point. Recognising that distinction earlier would have saved us weeks on that engagement.
One more thing
n8n 1.x introduced AI nodes, and the tool has been adding model integrations actively. This is worth watching. But in our experience, bolting an LLM onto a flowchart doesn't change the underlying model. You still end up wrestling with the tool when the logic gets genuinely complex. The AI nodes are useful for simple enrichment steps: classify this text, translate this field, rewrite this string. For anything that requires the model to reason across context and drive the control flow, you want the model to be the thing running the show, not a node inside a diagram someone else is running.
Written from experience on a client engagement in early 2025. We're a small team working on AI automation projects. If you're hitting similar walls, feel free to reach out!