AI Agents Can't Actually Do Your Job (Yet) | eWeek

AI Agents Can’t Actually Do Your Job (Yet) — New Benchmark Reveals The Gap

Written By
Grant Harvey
Nov 3, 2025

The hype: AI agents will automate entire workflows! Replace freelancers! Handle complex tasks end-to-end!

The reality: a measly 2-3% completion rate.

See, Scale AI and the Center for AI Safety (CAIS) just released the Remote Labor Index, a benchmark in which AI agents attempted real freelance tasks. The best-performing model earned just $1,810 out of $143,991 in available work and, yes, finished only 2-3% of jobs.
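For a sense of scale, here’s a quick back-of-the-envelope sketch in Python that turns those two dollar figures into a share of the available payout. The function and variable names are ours for illustration, not anything from the benchmark itself; the only inputs are the amounts quoted above.

```python
def value_share(earned_usd: float, available_usd: float) -> float:
    """Fraction of the available payout that was actually earned."""
    if available_usd <= 0:
        raise ValueError("available_usd must be positive")
    return earned_usd / available_usd


# Dollar amounts as quoted in this article for the best-performing model.
share = value_share(1_810, 143_991)
print(f"{share:.2%} of available dollars earned")
```

That works out to roughly 1.3% of the money on the table. Note that this is a share of dollar value, not of task count, so it doesn’t have to line up exactly with the 2-3% completion rate: jobs pay different amounts.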

This benchmark is a much-needed reality check for an industry spending untold trillions, Bond-villain style, on the hypothesis that AI will automate all work. And honestly? It’s useful data.

Here’s what they tested

Real tasks from freelance platforms. Not toy problems or academic benchmarks, but actual gigs that humans get paid to complete: writing, research, data entry, and design tasks.

Why agents struggled:

  • Multi-step workflows with unclear handoffs.
  • Ambiguous requirements that we humans clarify through conversation.
  • Tasks requiring judgment calls and context.
  • Work that needs iteration and client feedback.

What agents CAN do: In production environments, small fine-tuned models handle day-to-day repetitive tasks well, while bigger models orchestrate workflows or handle edge cases. This setup works, but it’s narrow and human-supervised.

Agents come with hidden costs, too. Even when they work, Rate Limited’s recent breakdown shows that “free” coding agents carry real costs: rate limits, latency, security reviews, and rework. You need guardrails and budgets, not blind automation.

The counterpoint: a new study shows that 74% of companies that actually measure GenAI ROI report positive returns.

Why this matters

We’re in a weird middle ground. AI can augment work impressively, but can’t yet replace skilled humans on complex tasks (the middle-to-middle problem). Understanding this gap helps set realistic expectations.

What’s coming: Better agent architectures, tighter human-in-the-loop workflows, and specialized agents for narrow domains. Progress is happening; it’s just not happening (successfully) as quickly as the AI companies want you to think.

The takeaway: If someone’s selling you on fully autonomous AI workers, ask to see completion rates on real tasks you do every day… or don’t buy it.

Editor’s note: This content originally ran in today’s newsletter send from our sister publication, The Neuron. To read more from The Neuron, sign up for its newsletter here.

Grant Harvey

Grant Harvey is the Lead Writer of The Neuron, where he continues to lead the publication's daily coverage of AI news, tools, and trends.
