The agentic AI rollout has a contradiction at its core: the same software that vendors call a colleague is, in controlled studies, making the humans around it measurably worse at their jobs

Microsoft, OpenAI, Anthropic, and Google are reportedly selling software that introduces itself by name, holds a Slack handle, and shows up on org charts. The same software, in controlled studies, makes the humans around it worse at their jobs.

That contradiction is the whole story of the agentic AI rollout so far. The marketing department says coworker. The data says liability.

The conventional view, repeated by major platform vendors in 2026, is that the next leap in productivity comes from treating AI agents as digital employees: give them names, give them roles, give them seats at the table, and let managers run mixed teams of humans and machines. Nvidia CEO Jensen Huang has spoken openly about workplaces where AI agents are onboarded and oriented like new hires.

The pitch is intuitive. It is also, according to a growing body of behavioural research, the wrong frame.

The branding exercise that breaks oversight

The clearest evidence so far comes from research published in Harvard Business Review that tested what happens when the same AI output is labelled two different ways.

In one condition, participants reviewed work attributed to a chatbot. In another, identical work was attributed to an AI team member called Alex. The output was the same. The behaviour around it was not.

The naming convention did the damage.

The research, covered in MIT Technology Review, found that calling an AI agent an employee is easy and convenient, particularly when something goes wrong, but it is a branding exercise.

The branding has consequences. A significant proportion of companies already frame AI agents as employees, with some placing the agents on official organisational charts.

Those are not hypothetical scenarios. Those are reporting lines.

What happens to a manager who thinks the software is a colleague

The psychological mechanism is not mysterious. When something is framed as a peer, the default posture shifts from supervision to deference.

A spreadsheet is a tool. You check its formulas. A junior analyst is a person. You give them the benefit of the doubt, then loop in your boss if their work seems off.

Software branded as an employee inherits the second posture. Managers stop catching errors because they have stopped looking for them in the same way. Escalation replaces correction because correcting a colleague feels socially awkward in a way that correcting a tool does not.

This is the inversion the agent marketing is producing. The very framing meant to make humans more comfortable working alongside AI is the framing that dulls their critical attention to it.

And the people most likely to be reassured by framing AI as a coworker are also the people most likely to be on the hook when something breaks.

The Nobel laureate’s objection

MIT economist Daron Acemoglu, who shared the 2024 Nobel Prize in economics, has been making a related argument from a different angle. The replacement frame, in his view, is not just bad for accountability. It is bad economics.

AI agents are currently marketed as things that can replace humans, which he has described as a losing proposition. They should instead be optimised, he has argued, to improve human capabilities, which is not what they are doing at the moment.

His framing matters because the augmentation-versus-replacement debate has been treated as ideological for most of the last three years. Acemoglu is making it empirical. The productivity gains the industry keeps promising have not arrived at the scale the capital expenditure assumed. The replacement frame may be part of the reason.

If the agent is meant to do the work, the human stops engaging with it. If the human stops engaging with it, the errors compound. If the errors compound, the productivity dividend evaporates.

What workers actually want automated

There is a separate strand of evidence, from Stanford’s Future of Work initiative at SALT Lab, that complicates the replacement story further.

Workers surveyed about which of their tasks they actually wanted AI to take over showed answers that diverged sharply from what tech companies had identified as the most automatable.

The tasks workers most wanted to offload were the dull, repetitive, low-stakes ones. The tasks they wanted to keep were the ones involving judgment, relationships, and creative discretion. The tasks the industry was busiest automating sat closer to the second category than the first.

This is a design problem masquerading as a deployment problem. Vendors are building agents to do what is technically impressive. Workers are asking for agents that do what is genuinely tedious. The two lists barely overlap.

The org chart as legal document

Listing an AI agent on an org chart looks like a branding choice. It is also, eventually, a legal one.

When a clinical error happens in a hospital, an audit traces the decision back to a named clinician. When a procurement fraud happens in a government department, an investigator traces the approval back to a named official. Org charts are the substrate of accountability, not just communication.

Inserting a non-human entity into that substrate creates a fault line. If Alex approved the discharge summary, who is liable when the discharge summary is wrong? The vendor? The manager who deployed Alex? The clinician who signed off without reading it because Alex had already “reviewed” it?

The last of those is the most likely real-world outcome. People defer to named entities. Named entities cannot be sued in any meaningful sense. The deference, therefore, is a transfer of risk from the vendor to the individual employee who trusted the label.

This is the scapegoat dynamic that becomes a convenient way to absorb blame that humans, in practice, will still carry.

Where this gets dangerous

The stakes are not uniform across sectors. A misfiled marketing brief is not a misfiled radiology report.

In healthcare, government, and defence, the same psychological mechanism that makes managers escalate rather than correct produces failures with body counts attached. The Guardian’s reporting on the Iran school bombing earlier this year illustrated the pattern in extremis: AI was the first thing blamed, and the truth, as the reporting laid out, was more uncomfortable than that. The system had not gone rogue. The humans around it had stopped checking it.

That is the agentic failure mode in miniature. Not a malicious machine. A quiet collapse of oversight, dressed up as collaboration.

The coworker metaphor has a cost the org chart hides

There is a reason the coworker frame is appealing to vendors. Software-as-a-service is a crowded category with shrinking margins. Software-as-a-colleague is a fresh category with seat-licence economics attached. If an AI agent costs the equivalent of a junior salary, the addressable market is the entire global wage bill rather than the global IT budget.

That is a much bigger number. It is also the number the agent rollouts have been priced against.

But the seat-licence framing only works if buyers accept the colleague metaphor. The metaphor is not neutral. It changes behaviour in measurable ways. And the behaviour it produces is worse oversight, not better collaboration.

What the engagement research already told us

None of this should be surprising to anyone who has read the workplace engagement literature. Gallup’s long-running research on what predicts retention and well-being at work has consistently pointed to one question that sounds almost too personal to belong on a survey: do you have a best friend at work?

The reason it works as a predictor is that jobs run on human friendship in ways the org chart does not capture. Trust, slack, repair after conflict, the willingness to flag a problem without escalating it. These are not features that an agent named Alex can deliver, even in principle.

Treating agents as coworkers does not extend the friendship layer of work. It hollows it out. The colleague you ask for a sanity check is not a process that returns a token stream.

The asymmetry the industry will not name

Since April, Microsoft, OpenAI, Anthropic, and Google have released tools for managing teams of AI agents. The tooling assumes the colleague frame. The marketing assumes the colleague frame. The pricing assumes the colleague frame.

The research assumes nothing of the sort. It treats the agent as software with a misleading label, and it measures what happens to the humans who believe the label. The answer is that they get worse at their jobs in specific, replicable ways.

There is an asymmetry there that no vendor deck addresses. The upside of the colleague frame accrues to the vendor. The downside accrues to the buyer and, more pointedly, to the individual employee who trusted that Alex had read the document properly.

What the better design would look like

The Acemoglu position, taken seriously, leads to a specific design discipline. Agents should be built around the tasks workers actually want help with, not the tasks that are easiest to automate or most photogenic in a keynote.

They should be labelled, in interface terms, as tools rather than peers. The friction of remembering you are using software is not a bug. It is the mechanism by which oversight is preserved.

They should produce outputs that invite checking rather than outputs that simulate finality. A draft labelled as a draft gets edited. A draft labelled as Alex’s completed work gets forwarded.

None of this is technically harder than what is currently being shipped. It is commercially harder, because it gives up the seat-licence framing. The honest version of the product is less valuable as a story, even if it is more valuable as a tool.

The accountability question is the only question

Strip away the marketing and the underlying question is straightforward. When the AI gets it wrong, who is responsible?

If the answer is the human who deployed it, then the agent is a tool and should be labelled as one. If the answer is the agent itself, then there is no answer, because the agent cannot be held responsible in any meaningful sense, and the question is being used to obscure rather than assign liability.

The colleague frame is the second answer in a costume. It is the rhetorical device that lets vendors sell autonomy while leaving accountability with the buyer. The data shows it also degrades the buyer’s ability to catch the errors they will eventually be blamed for.

The clean version of the deal is simpler. AI agents are software. They have no colleagues. The humans who use them have all the responsibility they had before, and slightly less ability to discharge it well, because the interface has been designed to make them defer.

That is not a coworker. That is a liability with a name tag.

Source link