08 Aug 2025

AI Is Good at Scutwork


I came across a tweet from Paul Graham in my feed earlier this week, and it certainly reflects what our clients come to us for.

In just four words, Graham captured something profound about AI’s current sweet spot. His August 2025 tweet sparked hundreds of replies, ranging from enthusiastic agreement to thoughtful pushback about the implications for career development, quality control, and business models.

TL;DR: Graham’s insight is spot-on: AI excels at repetitive, low-judgment, structure-heavy tasks. But the replies revealed crucial nuances—especially around preserving learning opportunities for juniors, maintaining quality in regulated work, and defining where “scutwork” ends and essential craft begins. Here’s a practical guide to leveraging AI for grunt work without hollowing out capability or standards.

Why this tweet hit a nerve

Graham’s observation resonates because it reframes the AI-and-jobs conversation away from the apocalyptic “which professions will disappear?” toward the more practical “which tasks are ripe for automation?”

This shift in perspective is liberating. Instead of fearing that AI will replace lawyers, journalists, or engineers wholesale, we can focus on identifying the specific activities within these roles that are structured, repetitive, and time-consuming—the stuff that professionals do because it’s necessary, not because it’s intellectually stimulating.

This isn’t a fringe view. Legal and media observers have made similar arguments for years: AI thrives on repetitive research, summarisation, drafting boilerplate, first‑pass analysis, and data wrangling — the parts of work that feel like chores. In fact, versions of this idea have circulated since long before generative AI’s breakout.

What is “scutwork,” really?

“Scutwork” is a colloquial label for low‑autonomy, repetitive, time‑consuming tasks that are necessary but rarely celebrated. Think:

  • Law: first‑pass contract comparisons, precedent lookups, chronology building, cite checking.
  • Newsrooms: background packets, transcript tidying, spreadsheet merges, FOI indexing, and even brainstorming prompts.
  • Software: scaffolding, boilerplate, test stubs, lint fixes, documentation and error‑message improvements — a lot of programming has always been this kind of work, as Graham himself noted in a follow‑up.

These are exactly the tasks where today’s models shine: they’re structured, auditable, and (with the right guardrails) easy to review.

What the replies got right

Reading through the reactions, several themes kept cropping up. Here are the strongest — with how the evidence stacks up.

1. “If AI eats the grunt work, where do juniors learn?”

This is the training‑pipeline problem. Multiple observers worry that removing entry‑level tasks will starve early‑career people of practice, making it harder to grow senior talent later. Recent talent reports and labour‑market coverage echo the risk: entry‑level postings are down, early‑career unemployment is up, and employers are prioritising mid‑senior hires who can “ship” with minimal oversight. In short: the ladder’s lower rungs are wobbling.

Takeaway: Don’t skip apprentices; instrument their scutwork with AI instead (see playbook below).

2. “Not all scutwork is low‑risk.”

Exactly. In law, audit, finance, health, and safety‑critical software, the “boring bits” carry compliance and liability; hallucinations here are costly. Analysts expect AI to compress time spent on these tasks, but also stress the need to redesign pricing, staffing, and review models (e.g., less pure billable‑hour leverage on juniors; more outcome‑based pricing and QA). Your AI can draft the NDA in minutes — you still own the signature.

Takeaway: Treat AI outputs as drafts, not decisions — and make the review step visible and priced.

3. “Counting lines of code isn’t progress.”

Some replies cautioned against productivity theatre: “10k LOC/day” is meaningless if quality, security, and maintainability slip. Community discussions the same week underscored the point: AI can flood repos with code, but it also demands tests, static analysis, and human judgement to keep the blast radius small.

Takeaway: Measure defect rates, review latency, cycle time, and customer outcomes — not sheer volume.

4. “AI isn’t just scutwork — it can help creativity too.”

Fair pushback. Editors and product leaders argue that AI is useful for brainstorming, idea surfacing, and data‑driven story hunting — still in the “assistive” bucket, but not merely janitorial. The line between scutwork and creative prompt‑engineering is blurrier than it sounds.

Takeaway: Use AI to amplify creative throughput, not to replace creative ownership.

A practical playbook: Hand the right “CARTS” to AI

When you hand tasks to AI, bias toward CARTS (there’s a quick scoring sketch after the list) — work that is:

  • Constrained by explicit instructions and definitions
  • Auditable with checklists, tests, or second‑source verification
  • Repeatable across cases with stable patterns
  • Time‑consuming relative to its marginal value
  • Structured in inputs/outputs (tables, templates, schemas)
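
To make the triage concrete, here’s a minimal scoring sketch. The `Task` fields and the threshold of 4 are illustrative assumptions, not a standard:

```python
from dataclasses import dataclass

@dataclass
class Task:
    """One unit of weekly work to triage (hypothetical fields)."""
    name: str
    constrained: bool     # explicit instructions and definitions exist
    auditable: bool       # a checklist, test, or second source can verify it
    repeatable: bool      # the pattern is stable across cases
    time_consuming: bool  # hours spent outweigh the marginal value
    structured: bool      # inputs/outputs fit a table, template, or schema

def carts_score(task: Task) -> int:
    """Count how many of the five CARTS criteria a task meets."""
    return sum([task.constrained, task.auditable, task.repeatable,
                task.time_consuming, task.structured])

def is_ai_candidate(task: Task, threshold: int = 4) -> bool:
    """Flag tasks meeting most criteria as candidates for AI delegation."""
    return carts_score(task) >= threshold

# First-pass clause extraction ticks every box; a bespoke negotiation doesn't.
print(is_ai_candidate(Task("clause extraction", True, True, True, True, True)))   # True
print(is_ai_candidate(Task("novel negotiation", False, False, False, True, False)))  # False
```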

Examples by function:

  • Legal: first‑pass clause extraction into a template table → human counsel reconciles differences and signs off.
  • Journalism: AI drafts a backgrounder from public docs with links and quotes flagged → editor verifies lines and tone before publication.
  • Engineering: generate tests from requirements, tighten docstrings, suggest lintable refactors → senior engineer approves changes behind a CI gate (sketched below).
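
The CI gate in the engineering example can be a single script in the pipeline. A minimal sketch, assuming your CI exposes the PR’s labels and its count of human approvals as environment variables (the variable names here are hypothetical, so map them to whatever your CI actually provides):

```python
import json
import os
import sys

# Block merges of AI-assisted changes that lack a human sign-off.
# PR_LABELS and PR_HUMAN_APPROVALS are hypothetical names; most CI
# systems can inject equivalents from their PR metadata.
labels = json.loads(os.environ.get("PR_LABELS", "[]"))
approvals = int(os.environ.get("PR_HUMAN_APPROVALS", "0"))

if "ai-assisted" in labels and approvals < 1:
    sys.exit("AI-assisted change needs at least one human review before merge.")
print("Gate passed.")
```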

Guardrails you’ll actually use

  1. Two‑source rule: Anything public‑facing or regulated gets a second source (corpus search, database, or human SME) before it’s treated as fact.
  2. Red‑flag taxonomy: Pre‑agree what can’t be auto‑accepted (novel legal interpretations, financial adjustments, PII transformations, safety‑critical code).
  3. Traceability: Keep prompts, versions, and review notes with the artefact (PR description, matter file, or story slug); a record sketch follows this list.
  4. Pricing & timeboxing: If you’re in services, sell the review and judgement, not the raw hours your AI just saved. (Your clients already know the boilerplate went faster.)
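
Guardrail 3 is the easiest to automate. Here’s a minimal sketch of a traceability record to attach to the artefact itself; the field names are assumptions, and you might store the raw prompt instead of a hash if your retention rules allow:

```python
import hashlib
import json
from datetime import datetime, timezone

def _sha(s: str) -> str:
    """Hash long text so the record stays compact and diff-friendly."""
    return hashlib.sha256(s.encode()).hexdigest()

def trace_record(prompt: str, model: str, output: str,
                 reviewer: str, notes: str) -> dict:
    """Bundle the provenance of an AI draft so it can live beside the
    artefact it produced (PR description, matter file, or story slug)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,                # the model name/version you used
        "prompt_sha256": _sha(prompt),
        "output_sha256": _sha(output),
        "reviewer": reviewer,
        "review_notes": notes,
    }

record = trace_record("Summarise clause deltas in the attached NDA...",
                      "example-model-v1", "Draft summary...",
                      "j.smith", "Checked every cite against the source.")
print(json.dumps(record, indent=2))
```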

How to preserve the training ladder (without pretending AI doesn’t exist)

The most compelling worry in the replies was about skills formation. If juniors don’t do the grunt work, how do they become seniors? Sensible options:

  • Reverse the ratios, not the exposure. Let AI perform the first pass; have juniors own the second pass: verification, test design, failure‑mode analysis, and post‑mortem reflection. This builds judgement faster than rote drafting ever did — and it’s reviewable.
  • Design “apprenticeship loops.” Pair every junior with a named reviewer; require brief written rationales (“what I changed and why”). These artefacts compound into internal playbooks.
  • Publish promotion rubrics that reward error‑finding and risk reduction, not just output volume.
  • Rotate through “critical craft.” Even in an AI‑rich workflow, schedule hands‑on weeks in low‑automation areas (incident response; complex negotiations; unfamiliar codepaths).

This is consistent with Graham’s broader point: operate “above scutwork,” but don’t delete the learning moments. Make them cheaper, faster, and safer.

Where AI scutwork already changes business models

  • Big Law & advisory: As AI compresses research hours, firms are experimenting with value‑based pricing and thinner pyramids. The billable‑hour leverage on juniors — traditionally built on scutwork — becomes harder to defend without visible quality uplift.
  • Tech hiring: Reports show a tilt toward experienced operators; without intentional apprenticeships, we’ll feel a skills gap in 3–5 years.
  • Newsrooms: Editors are openly arguing for AI to do the “boring bits” so journalists spend more time reporting and shaping narratives.

A 7‑day plan to rebalance your scutwork

Day 1–2 — Inventory: List your weekly tasks. Tag each as CARTS/Not‑CARTS.

Day 3 — Pilot: Pick three CARTS tasks; draft prompts/templates; set pass/fail checks.

Day 4 — Instrument: Add tests/checklists and a second‑source rule for anything public or regulated.

Day 5 — Train the humans: Short session on review heuristics (what to distrust first).

Day 6 — Measure: Capture cycle time, review latency, defects, and rework (see the sketch after the plan).

Day 7 — Decide: Keep, tweak, or kill each pilot based on the above.

Rinse monthly. Your aim is to shrink the scutwork timebox while increasing the surface area of judgement.
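
To keep Days 6 and 7 honest, capture the same few numbers for every pilot and apply one decision rule. A minimal sketch, where the fields, units, and keep/tweak/kill rule are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class PilotMetrics:
    """Day 6 measurements for one piloted task."""
    task: str
    cycle_time_hrs: float      # request to done
    review_latency_hrs: float  # draft ready to human sign-off
    defects: int               # errors caught after sign-off
    rework_hrs: float          # time spent fixing those defects

def verdict(before: PilotMetrics, after: PilotMetrics) -> str:
    """Day 7 rule of thumb: keep only if faster AND no worse on quality."""
    faster = after.cycle_time_hrs < before.cycle_time_hrs
    no_worse = (after.defects <= before.defects
                and after.rework_hrs <= before.rework_hrs)
    if faster and no_worse:
        return "keep"
    return "tweak" if faster or no_worse else "kill"

baseline = PilotMetrics("backgrounder draft", 6.0, 0.0, 1, 0.5)
piloted = PilotMetrics("backgrounder draft", 2.5, 1.0, 1, 0.5)
print(verdict(baseline, piloted))  # keep
```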

So… is the tweet right?

Yep, as a rule of thumb, today’s AI is extremely good at scutwork. Graham’s follow‑ups make the more important point: your safest career and business strategy is to operate above that level of work, where human taste, responsibility, and synthesis dominate. But the strongest replies add the crucial how: protect the training ladder, police the boundary between “boring” and “high‑consequence,” and measure outcomes, not output.

If you adopt AI this way, you get the hours back and you end up with better practitioners, not fewer.
