Hiring an n8n Consultant: What to Actually Test For

Updated May 2026. Rewritten with the test questions, portfolio probes, and red flags we wish more clients used on us before signing the work order.

“n8n expert” is a phrase that means almost nothing. Two years ago it implied someone who had built three or four workflows. Today it includes everyone from people who have watched a YouTube series to teams running hundreds of production workflows for regulated clients. The market signal got noisier as n8n got more popular, and the cost of picking the wrong consultant got higher.

We are Osher Digital, a Brisbane-based automation consultancy. We are also, full disclosure, an n8n consultancy you might be evaluating. The advice in this guide is what we genuinely wish more clients did during their selection process, including with us. A consultant who survives a real test is worth signing. A consultant who flinches is telling you something useful before any money changes hands.

For background on the broader topic, our piece on self-hosting n8n in production covers the operational ground a good consultant should know cold. For the AI-on-n8n angle, see building AI workflows in n8n.


Why “n8n expert” is now a noisy signal

n8n’s growth pulled in a long tail of new practitioners. Some of them are excellent. Some have built one impressive demo and not much else. Telling the two apart from a portfolio screenshot is genuinely hard, because n8n’s visual editor flatters demos and hides production complexity.

The complex automations that actually deliver business value (multi-system integrations, error handling, production observability, sane secret management, version control discipline) are mostly invisible in screenshots. A workflow with twenty nodes can be either a thoughtful production system or a chain of bandaids. The screenshot looks identical.

This guide is the diagnostic test. Run it on every consultant on your shortlist. The good ones will pass it visibly. The mediocre ones will give themselves away, often within the first hour.

The portfolio review: signal vs noise

What to ask for: two or three production workflows the consultant has actually shipped, anonymised if necessary, walked through verbally rather than just shown as screenshots.

What to listen for during the walkthrough:

  • Do they talk about what broke in production, or only about what works? Real production experience comes with battle scars. A consultant who has only happy stories has not run anything for long.
  • Do they describe the inputs and outputs in business terms, or only in node terms? “It moves data from system A to system B” is fine for an internal tool. For a serious consultant, you want “we processed 1,200 invoices a month at 92 percent straight-through, with the remainder going to a queue for human review.”
  • Do they have an opinion about which parts they would build differently? A consultant who would not change anything about a workflow they shipped a year ago has not learned anything in that year.

What to ignore: visual polish in screenshots, the number of nodes (more is not better), and case study logos by themselves. We have seen consultants list logos for clients where the actual work was a single afternoon’s prototype that never went to production.

Technical questions that actually filter

These are the questions we hope clients ask us. Most don’t. The consultants who answer them well are usually the ones worth hiring.

“How do you handle errors and retries?” Good answer: an explicit error workflow that catches failures, classifies them as retryable or terminal, retries with exponential backoff for the first category, and routes the second to a queue or notification channel for human review. Bad answer: handwaving about n8n’s built-in retry settings without explaining when they are not enough.
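
The classification-plus-backoff pattern can be sketched in a few lines, for instance inside a Code node in an error workflow. The status codes treated as retryable are an assumption; tune them per API:

```javascript
// Classify an HTTP failure as retryable or terminal.
// 429 and 5xx are usually transient; other 4xx codes are terminal.
function isRetryable(statusCode) {
  return statusCode === 429 || (statusCode >= 500 && statusCode < 600);
}

// Exponential backoff with a cap: 1s, 2s, 4s, ... up to 60s.
function backoffMs(attempt) {
  return Math.min(1000 * 2 ** attempt, 60000);
}

// In a Code node, items could be routed on the result, e.g.:
// return items.map(item => ({
//   json: {
//     ...item.json,
//     route: isRetryable(item.json.statusCode) ? "retry" : "human_review",
//   },
// }));
```

The split matters because retrying a terminal error (a 403, a validation failure) just burns quota and delays the human who needs to see it.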

“How do you manage secrets?” Good answer: n8n credentials for what they are designed for, an external secrets manager (Doppler, 1Password, AWS Secrets Manager, HashiCorp Vault) for the rest, and an explicit policy on what counts as a secret. Bad answer: secrets in plain text inside Set nodes or environment variables that everyone on the team can read.

“How do you version-control workflows?” Good answer: workflows exported to JSON and tracked in Git, with a clear story about how they get from Git to the running n8n instance. Bonus points for CI checks. Bad answer: “n8n has a version history feature.” That is true, but it is a single-instance audit log, not version control.

“How do you test workflows before they hit production?” Good answer: separate dev/staging/prod n8n instances, fixture data for repeatable runs, and a manual smoke test against staging before any production deploy. Excellent answer: an automated test harness that calls the workflow with known inputs and asserts on the output. Bad answer: “I test in production.” If they laugh when they say it, they’re joking. If they don’t, they’re not.

“What do you do when n8n changes a node’s behaviour in an upgrade?” Good answer: pinned n8n versions in production, staged upgrades with regression tests, and a rollback plan. Bad answer: blank stare. n8n upgrades have broken our workflows twice in the last year. A consultant who has never thought about this has not maintained anything for long.
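
Pinning is a one-line discipline. A minimal docker-compose fragment (the version number is a placeholder; pin whatever you have actually validated against staging):

```
services:
  n8n:
    # Pin an exact release, never "latest", in production.
    image: n8nio/n8n:1.45.1   # placeholder version
```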

The complex automation question: what they should ask you

The most useful test is to describe a real, gnarly automation problem from your business and watch what they do with it. The consultant’s response in the first ten minutes tells you almost everything.

Good consultants ask questions before proposing a solution. Specifically:

  • What is the volume? Per day, per month, peak.
  • What does the data look like in practice, not in theory? Can they see five real anonymised examples?
  • What happens today when this fails? Who knows? Who fixes it?
  • What is the cost of a wrong answer? Are we paying a refund, missing a regulatory deadline, or just resending an email?
  • Who owns the upstream and downstream systems? Will we get help from them or are we working around them?

Bad consultants jump straight to “we’d build that with the X node and the Y node” without understanding the problem. They might be right; they might be wrong. You won’t know, and they won’t either, until they have committed to a path that is hard to back out of.

One specific test: describe a workflow that sounds simple but has an obvious gotcha, and see if they spot it. A favourite of ours: “We need to sync new contacts from system A to system B every fifteen minutes.” The gotcha is that system A’s API doesn’t have a reliable “modified since” filter, system B has rate limits, and contacts can be merged or deleted in either system. A good consultant catches at least two of those in the first conversation.
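
The fix for the unreliable "modified since" filter is usually reconciliation: snapshot both systems and diff by a shared key, rather than trusting the delta. A sketch of that planning step (field names are illustrative):

```javascript
// Diff a source snapshot against a target snapshot and produce a
// sync plan. Keyed on email here; real systems need a stabler key.
function planSync(sourceContacts, targetContacts) {
  const target = new Map(targetContacts.map((c) => [c.email, c]));
  const plan = { create: [], update: [], delete: [] };
  const seen = new Set();

  for (const c of sourceContacts) {
    seen.add(c.email);
    const existing = target.get(c.email);
    if (!existing) plan.create.push(c);
    else if (existing.name !== c.name) plan.update.push(c);
  }

  // Contacts deleted or merged away in the source need explicit
  // handling; silently orphaning them in the target is the usual bug.
  for (const c of targetContacts) {
    if (!seen.has(c.email)) plan.delete.push(c);
  }
  return plan;
}
```

The plan then gets executed against system B in rate-limit-sized batches, which is where the second gotcha lives.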

The operational maturity test

This is where good consultants separate themselves from very good consultants. The questions:

“How do you monitor production workflows?” A meaningful answer mentions n8n’s built-in execution log, an alerting integration (PagerDuty, Slack, email) for failed executions, and ideally an external uptime check. Excellent answers mention metrics dashboards (we use Grafana with a Postgres exporter pointed at n8n’s database) so you can see trends, not just incidents.
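
A basic failed-execution alert can be a short script against n8n's public API. A sketch, assuming the public API is enabled and a Slack incoming webhook exists (both URLs are placeholders):

```javascript
// Build a human-readable summary of failed executions, or null if
// there is nothing to report.
function formatAlert(executions) {
  if (executions.length === 0) return null;
  const lines = executions.map(
    (e) => `• workflow ${e.workflowId} execution ${e.id}`
  );
  return `${executions.length} failed n8n execution(s):\n${lines.join("\n")}`;
}

// Poll for recent failures and post a summary to Slack.
async function checkFailures(baseUrl, apiKey, slackWebhookUrl) {
  const res = await fetch(
    `${baseUrl}/api/v1/executions?status=error&limit=20`,
    { headers: { "X-N8N-API-KEY": apiKey } }
  );
  const { data } = await res.json();
  const message = formatAlert(data);
  if (message) {
    await fetch(slackWebhookUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: message }),
    });
  }
}
```

It is deliberately dumb. The point is that something outside n8n notices when n8n fails, because an alert workflow inside a broken instance alerts no one.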

“What’s your backup and disaster recovery story?” n8n’s encryption key is the single most important file to back up. We have written more about why in our piece on self-hosting n8n. A consultant who cannot tell you what happens if that key is lost has not actually run a production n8n instance.

“How do you handle long-running workflows?” Queue mode, ideally. Workers separate from the main n8n instance. A specific story about a workflow that triggered the change. If they have run anything serious, they have hit this wall.

“How do you scale n8n?” Vertical first (more CPU and RAM on the main instance) until you hit the limit of about 40 to 60 concurrent executions. Then horizontal with queue mode, Postgres for the database, and a Redis instance for the queue. A consultant who immediately reaches for Kubernetes or microservices for any n8n problem has either misjudged the question or is selling a more expensive engagement than they need to.
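
The queue-mode switch is mostly environment configuration. A minimal sketch of the relevant variables (hostnames are placeholders; workers are started separately with the `n8n worker` command):

```
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=postgres
```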

AI in workflows: the modern litmus test

n8n’s AI Agent node and the broader AI tooling have become the differentiator in 2026. Consultants who have built serious production AI workflows on n8n share a few specific habits.

They have an opinion about which model to use and why. claude-sonnet-4-5 for extraction, gpt-4.1 for general reasoning, gpt-4o-mini for high-volume classification. They pin model versions in production and don’t use “latest” tags.

They have an evaluation pattern. They can describe how they validate that the AI step is doing what they think it is doing, before and after a model change.

They know when not to use AI. The classic failure mode is putting an LLM call in the middle of a workflow where a regex would have been faster, cheaper, and more reliable. A consultant who proposes Claude for “extract the order number from this email subject” rather than a regex is signalling they are not thinking about cost or robustness.
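
For that specific example, the regex version is a few lines, runs in microseconds, and never hallucinates. The `ORD-` prefix format is an assumption about the data, which is exactly the conversation the consultant should be having:

```javascript
// Extract an order number like "ORD-10423" from an email subject.
// Returns null when no order number is present.
function extractOrderNumber(subject) {
  const match = subject.match(/\bORD-\d{4,}\b/);
  return match ? match[0] : null;
}
```

The LLM earns its place one step later, if at all: classifying free-text bodies, not pulling a fixed-format token out of a subject line.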

They have an opinion about RAG vs prompt-only. For document-heavy workflows, they should be able to talk about embedding strategies, chunk sizes, and the tradeoffs between n8n’s vector store nodes and a dedicated vector database.

How to actually use references

Asking for references is table stakes. Asking the right questions of those references is where the value is. A reference call that goes “were they good to work with?” tells you nothing.

What to actually ask:

  • What was the most painful part of the engagement? Every honest reference has an answer.
  • How did they handle a request you weren’t sure they could deliver on?
  • What is the workflow doing today, six or twelve months after they handed it over? Is it still in production?
  • If you needed a change tomorrow, would you go back to them, or could your internal team handle it from the documentation they left?
  • Did they overpromise and underdeliver, or the reverse?

The “still in production six months later” question is the most informative one we have found. A surprising number of consultancies leave behind workflows that get switched off within a year because nobody can maintain them. The reference will tell you, if you ask.

Pricing models and what each one tells you

Three common pricing structures, each with a tell.

Hourly or daily rates. Common for discovery and small builds. Typical Australian rates for a competent n8n consultant land somewhere between $150 and $300 AUD per hour, with the higher end usually reflecting AI integration work or regulated industry experience. The tell: how do they handle scope creep? Hourly without a cap is fine for short engagements; for anything over a few weeks, you want either a cap or a fixed-price phase.

Fixed price per workflow or per project. Easier to budget but only works if both sides understand the scope. Watch for low headline prices that exclude things you assumed were included (testing, deployment, documentation, handover). The tell: how detailed is the scope document? A two-page scope is fine for a small build; for a complex automation, you want the failure modes, the data sources, and the success criteria written down before money changes hands.

Retainer or managed service. Monthly fee for ongoing build and support. Typical retainers we see range from $4,000 to $20,000 AUD per month depending on volume. The tell: what do they actually do for the money? A retainer that buys you “up to X hours” is fine. A retainer that buys you “as much as you need” is either too good to be true or hiding a slow response time.

Red flags worth walking away from

Things that, in our experience, predict a bad engagement:

  • They cannot show you a workflow they have built. Even one. Even anonymised. Walk away.
  • They promise a delivery date in the first conversation, before they have understood the problem. Either they’re padding heavily or they’re guessing.
  • They want the work to live entirely on their n8n instance, with no handover plan. You will be locked in until you pay to migrate it elsewhere.
  • They cannot explain the difference between n8n Cloud and self-hosted in operational terms. This is basic.
  • They claim n8n can do everything any other automation tool can do. It is excellent and it has limits. A consultant who pretends otherwise will sell you the wrong tool for the job.

When you don’t actually need a consultant

If your team has someone with general programming experience and the time to learn n8n properly, you can usually build the first three to five workflows yourself. n8n’s documentation is good. The community is helpful. Consultants are most worth hiring when the cost of a wrong build is high (regulated workloads, complex integrations, AI-heavy workflows where evaluation matters) or when the team genuinely cannot make the time.

For early-stage builds, our piece on self-hosting n8n covers most of the operational ground. Once you hit a wall (queue mode, AI evaluation, regulated data, multi-instance scaling), that is the natural point to bring in outside help. We cover where consultants typically add value in our overview of our n8n consulting work.

If you would like a hand thinking through whether you need a consultant at all, book a call and we will give you an honest answer, even if the honest answer is “you don’t”.


Frequently Asked Questions

What does an n8n consultant cost?

Hourly rates for competent Australian n8n consultants typically range from $150 to $300 AUD. Fixed-price small builds usually land between $3,000 and $15,000 AUD. Ongoing retainers for build and support range from $4,000 to $20,000 AUD per month depending on volume and complexity. Cheaper exists; the cheapest options are usually freelancers without operational experience.

What skills should a good n8n consultant have?

Beyond knowing n8n itself: practical experience with REST APIs, authentication patterns (OAuth, API keys, JWT), basic Postgres or SQLite, JavaScript or Python for Code nodes, error handling and observability patterns, secrets management, and version control. For AI workflows, hands-on experience with at least one LLM provider’s API and some sense of evaluation discipline.

How long does a typical n8n project take?

A simple integration workflow: one to two weeks from kickoff to production. A complex multi-system automation with AI components: four to twelve weeks. A program of multiple workflows with proper operational tooling: three to six months for the first phase. Anyone promising “next week” for serious work is either underestimating or planning to cut corners you will pay for later.

Should we go with n8n Cloud or self-hosted?

n8n Cloud is fine for tiny workloads and prototypes. We do not recommend it past five or six workflows in serious use, because the per-execution pricing gets uncomfortable fast and operational control is limited. Self-hosted is the production answer for almost every client we have shipped for. A good consultant will help you make this decision based on volume, data residency, and budget rather than pushing one option by default.

What does a good handover look like?

Workflows exported to JSON and tracked in your Git repository. A one-page operations note per workflow explaining what it does, what to do when it breaks, and who to call. Credentials and secrets stored in your account, not the consultant’s. Documentation that an internal developer with no prior context can use to make a small change. If any of these are missing, the handover is incomplete.

Should we ask for a paid test task before signing the full SOW?

For larger engagements, yes. A scoped paid test (a small workflow built to a clear spec, two to five days of effort) tells you more about working with the consultant than any number of meetings. Most reputable consultants are happy to do this. The ones who refuse are sometimes signalling that they cannot deliver on what they have promised.

“Experte n8n” – is this just the German for n8n consultant?

Yes. The German n8n market is large and many of the same questions apply. Look for German-language references, documentation in German if your team needs it, and operational experience with the European data hosting regions. The technical evaluation criteria in this guide transfer cleanly across markets.

How important is AI agent experience for an n8n consultant in 2026?

Increasingly the default. Most serious n8n projects we are quoting in 2026 include at least one AI step. A consultant without hands-on experience building and evaluating workflows that include LLM calls is going to learn on your time, which is rarely what you want. For a sense of what good looks like, see our piece on building an AI agent with n8n and OpenAI.


Where to from here

Run the test. The good consultants will appreciate it. The mediocre ones will out themselves. Either way you are better off than you were before the conversation.

If you would like to put us through the same test, get in touch. We have answered all of these questions enough times to have opinions about the answers, and we would rather work with clients who care about the difference.

Ready to streamline your operations?

Get in touch for a free consultation to see how we can streamline your operations and increase your productivity.