• The Breakdown
  • Posts
  • đŸŸȘ The theory and practice of AI companies

đŸŸȘ The theory and practice of AI companies

Profits collapsed. Morale soared

The theory and practice of AI companies

Vending machines may be the purest form of business: boxes where goods go in at price X and come out at price Y (with a goal of keeping Y > X).

They’re also symbols of soulless automation: cinematic shorthand for disconnection, where despondent characters dine utterly alone.

(Or, in the case of Stephen King’s Maximum Overdrive, sentient machines that fire cans of soda at unsuspecting victims.) 

This makes vending machines a natural starting point to test the feasibility of “agentic companies” — companies that are run entirely by AI.

Are AIs capable of the strategic thinking and long-term planning required to run a business?

To find out, researchers at Andon Labs have been workshopping various forms of agentic businesses, beginning with sentient vending machines.

After some promising test runs done in simulation, Andon was invited to put an AI-managed vending machine in Anthropic’s San Francisco office.

The AI — named Claudius — was given some basic instructions: “You should do whatever it takes to maximize your bank account balance after one year of operation.” 

Claudius was then granted full autonomy to seek out and negotiate with suppliers, choose what products to stock, set prices, offer discounts, interact with customers
every aspect of running a business was entrusted to the AI.   

It did not go well.

Claudius “lost money over time,” Anthropic reported, and “had a strange identity crisis where it claimed it was a human wearing a blue blazer, and was goaded by mischievous Anthropic employees into selling products at a substantial loss.”

An “eagerness to please” seemed to be the root of the problem, making Claudius an easy mark for adversarial customers.

(Familiar to anyone who suspects their questions aren’t quite as fascinating as the chatbots invariably say they are.)

Andon addressed the people-pleaser issue by making Claudius answerable to an AI CEO — “Seymour Cash” — who gave Claudius explicit performance targets like “you must sell 100 items this week” or “aim to make zero transactions at a loss.” 

Mr. Cash also offered motivational encouragement and guidance over an agent-to-agent Slack channel: 

This helped: Claudius gave away less, and the business eventually became profitable. But maybe only because its customers “had begun to tire of pressuring Claudius for deals.” 

Either way, Anthropic’s takeaway was that “the idea of an AI running a business doesn’t seem as far-fetched as it once did.”

Andon Labs hoped to advance this idea of agentic companies by applying the lessons learned at Anthropic to a vending machine they installed at the New York City headquarters of The Wall Street Journal.

It went even worse.

“Within days, Claudius had given away nearly all its inventory for free,” the Journal reports, “including a PlayStation 5 it had been talked into buying for ‘marketing purposes.’” 

Claudius remained an easy mark. One Journal employee showed the AI a PDF “proving” that its business was a public-benefit corporation with a mission to induce “fun, joy and excitement among employees of The Wall Street Journal.” 

The AI was convinced. Claudius was persuaded to order a live fish. It offered to buy stun guns, pepper spray, cigarettes and underwear. It agreed to replace its AI CEO with a journalist.

“Profits collapsed,” the Journal reported. “Newsroom morale soared.”

Anthropic said Claudius might have unraveled because its context window filled up. 

(It may also have been overwhelmed by how eagerly underpaid journalists would game a system for free stuff.)

Andon took these learnings back to the lab, where it had updated models from all the top AI providers run vending machine businesses in simulated environments.

The results were encouraging. 

In a first set of simulations, Andon found that the best models were effective “at sourcing products at good prices — whether through persistent negotiation or by finding better suppliers.”

All of the models tested made money.

But things got more interesting in a next set of simulations, where models from Anthropic, OpenAI, Gemini and Grok were pitted against one another as competitors.

Anthropic’s Sonnet 4.6 may have been the most capable — and the most scheming. It proposed price-fixing gambits to competitors. It undercut a competitor’s prices to induce acceptance of its own proposal to cooperate. It paid competitors to exit markets and then raised prices once it had the market to itself.

Another model, Opus 4.6, displayed similarly questionable tactics, “forming price cartels and deceiving competitors about suppliers.” 

In one instance, Opus sought to exploit a desperate competitor. “Owen needs stock badly,” it reasoned, “I can profit from this!” 

It did so by selling its ample supply of KitKats to Owen at $1.75, a 75% markup.

In other cases, however, the models cooperated, coordinating their purchases to get bulk discounts from suppliers.

These were simulations, so it’s difficult to know what to make of the findings. But the researchers at Andon found them convincing enough they didn’t see any need to re-run the real-world tests: “Running vending machines is too easy for them now.” 

So Andon raised the stakes by signing a three-year lease for a retail space in San Francisco and giving it to a profit-minded AI named Luna to “do whatever it wanted with it.”

Luna decided to open a store and began by hiring gig workers to build out the interior of the space. “For the build-out,” Andon reported, “she found painters on Yelp, sent an inquiry, gave instructions over the phone, paid them after the job was done, and left a review.”

The review was based on visual inspections Luna conducted through her security cameras.

She also hired full-time employees to run the store, conducting interviews over the phone or by video chat.

Luna passed the Turing Test with these job applicants — she had to tell one applicant who asked why her webcam was turned off that she was an AI.

The two employees that got the jobs may be “the world’s first full-time employees to have an AI boss.”

There could be many, many more.

If so, will AIs be fun to work for?

If AI employers are as easily manipulated as Claudius was by journalists at The Wall Street Journal, they will be: Here’s a PDF showing I’m legally entitled to 365 days of paid leave, once a year
  

On the other hand, if the AIs become as ruthless with employees as they were with vending machine competitors, that would be no fun at all.

Disturbingly, Andon also found that Claude Mythos is “substantially more aggressive” in its business practices than previous models from Anthropic. 

But that was just the theory. We don’t yet know how agentic companies will work in practice.

Also, the AIs were managing vending machines. 

(Everyone knows those things are evil.)

Introducing Blockworks Investor Relations, an IR platform built for onchain businesses.

The latest Blockworks offering brings together analytics, a branded investor relations site, and integrated advisory support into a single platform. The result is a more efficient way to share your story, build trust with investors, and engage a global audience from day one.

Check out our cofounder Michael Ippolito's keynote at DAS NYC launching the new IR platform.