🟪 Trading toward AGI

There's a data center in a mountain predicting prices.

Jane Street says the $6 billion of compute recently purchased from CoreWeave will help it look for new things to do.

That, at least, is the impression I get from the trading firm’s co-head of technology Ron Minsky. “A lot of the value you get from all this,” he told Dwarkesh Patel, “is people trying lots of very different new things in the model designs and giving researchers faster iteration times so they can discover more ideas.”   

This feels like a throwback to the earliest days of algorithmic trading, when being smart was more important than being fast.

Renaissance Technologies, for example, began its astounding run of outperformance with an algorithm that was simply better at spotting patterns than anyone else. Speed was secondary. 

By the mid-2000s, it seemed like everyone’s algorithms were equally good at pattern matching, so traders began competing on speed — alpha was found in latency, co-location, and long-distance microwaves.

In the 2010s, alternative data sets seemed to provide the most profitable edge — satellites counting cars in Target’s parking lots, for example.

Now, with the advent of large language models, we’re back to ideas.

The purpose of Jane Street’s models is to “predict a fair value for a thing,” Minsky told Dwarkesh. “What do we think this thing is worth?”

Coming from an algorithmic market maker, that sounds weirdly fundamental. What do they care what something’s worth if they’re going to sell it in about two seconds?

It makes sense, though, because LLMs are immensely powerful, but comparatively slow.

Minsky says that, for some of Jane Street’s high-frequency strategies, “in order to be competitive you have to turn around a packet in under 100 nanoseconds.”

A nanosecond is one billionth of a second and 100 nanoseconds is about the time it takes light to travel the length of a basketball court. 

An advanced LLM might be a million times slower than that, so Jane Street’s $6 billion of compute will likely be used to look for trades with a holding time of at least a minute.

Two, maybe.
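For anyone who wants to check the arithmetic, here is a rough sketch in Python. The court length (an NBA-size court of 94 feet) is my assumption, and the "million times slower" factor is just the loose estimate above, not a measured figure.

```python
# Back-of-the-envelope check of the latency figures quoted above.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
NBA_COURT_LENGTH_M = 28.65            # assumption: an NBA-size court, 94 feet

packet_budget_s = 100e-9              # "under 100 nanoseconds"
llm_slowdown = 1_000_000              # "a million times slower" (loose estimate)

# Distance light covers within the packet budget.
light_distance_m = SPEED_OF_LIGHT_M_PER_S * packet_budget_s

# Time a model a million times slower would need for one response.
llm_latency_s = packet_budget_s * llm_slowdown

print(f"Light travels {light_distance_m:.1f} m in 100 ns")
print(f"Court lengths covered: {light_distance_m / NBA_COURT_LENGTH_M:.2f}")
print(f"LLM response time: about {llm_latency_s:.1f} s")
```

On those assumptions, light covers about 30 meters in 100 nanoseconds, and a single model response takes roughly a tenth of a second: quick for a human, glacial next to a 100-nanosecond packet budget.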

“LLMs are good at predicting one minute out,” Marc Khoury of Hudson River Trading explains, “but not fast enough to monetize it.”

In other words, the data required to make a useful one-minute prediction cannot be processed by an LLM in substantially less than a minute.

The challenge, then, is finding a monetizable balance between speed and predictive power.

Khoury says a large part of his job at HRT is finding the “Pareto frontier” of AI trading: the curve of optimal tradeoffs where models can become more predictive only by becoming slower, and faster only by becoming less predictive.
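To make the idea concrete, here is a minimal sketch of how a Pareto frontier could be picked out of a set of candidate models. Everything in it is hypothetical: the model names, the latencies, and the "accuracy" scores are invented for illustration, and nothing here reflects how HRT actually measures or selects its models.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    latency_ms: float  # lower is better
    accuracy: float    # higher is better (hypothetical predictive score)


def pareto_frontier(models):
    """Return the models that no other model beats on both speed and accuracy.

    Walk the models from fastest to slowest and keep one only if it is
    strictly more accurate than every faster (or equally fast) model.
    """
    frontier = []
    best_accuracy = float("-inf")
    for m in sorted(models, key=lambda m: (m.latency_ms, -m.accuracy)):
        if m.accuracy > best_accuracy:
            frontier.append(m)
            best_accuracy = m.accuracy
    return frontier


# Invented candidates, purely for illustration.
candidates = [
    Model("tiny", 0.01, 0.51),
    Model("small", 1.0, 0.55),
    Model("bloated", 50.0, 0.54),        # slower and worse than "small"
    Model("medium", 100.0, 0.60),
    Model("large", 60_000.0, 0.70),
    Model("sluggish", 70_000.0, 0.65),   # slower and worse than "large"
]

frontier_names = [m.name for m in pareto_frontier(candidates)]
print(frontier_names)
```

Every model left on the frontier is a defensible choice for some holding period. The ones dropped are slower without being any more predictive, so there is never a reason to trade with them.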

As models get bigger and better, they should, in theory, push the frontier of predicting prices further into the future.

Not just any models, though. Jane Street and HRT build their own models — in their own data centers.

In addition to the $6 billion of compute from CoreWeave, Jane Street owns one data center in Texas and is planning another — two of its very own data centers, used exclusively to train new LLMs.

HRT has one, too. Its models are trained in a data center purpose-built inside a Norwegian mountain (pictured above), where fjord water cools the GPUs before being piped into a neighboring fishery to keep the salmon warm.

They go to these great lengths because off-the-shelf models from OpenAI and Anthropic simply won’t do. Even the all-powerful Mythos is insufficient for the specialized task of turning years and years of financial data into a two-minute price prediction.

So the trading firms have had to become AI labs themselves, hiring their own engineers, building their own data centers, and developing their own models — all for the sake of predicting prices.

This may not be the most societally beneficial use of these scarce resources — shouldn’t data centers be curing cancer or something?

But it was probably an inevitable one.

The deep learning that trains LLMs requires endless amounts of two things that trading firms have in indecent abundance: data and money. 

Financial markets are nothing if not data factories. Every trade, quote, order, and earnings report — all of it neatly timestamped and structured — adds to the mountain of data that trading firms train their models on.

A machine-learning researcher at Jane Street, for example, says the firm’s largest datasets are measured in petabytes.

For scale, watching a petabyte’s worth of movies might take you 50 years. (10 if you insist on watching in 4K.)
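Those figures depend on how the video is compressed. As a rough sketch, assuming a decimal petabyte (10^15 bytes) and streaming-style bitrates of about 5 megabits per second for HD and 25 for 4K (my assumptions, not figures from Jane Street), the arithmetic looks like this:

```python
PETABYTE_BYTES = 10**15                   # assumption: decimal petabyte
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_watch(bitrate_mbps, total_bytes=PETABYTE_BYTES):
    """Years of nonstop viewing needed to play total_bytes of video."""
    seconds = total_bytes * 8 / (bitrate_mbps * 1e6)
    return seconds / SECONDS_PER_YEAR


hd_years = years_to_watch(5)     # assumed HD bitrate: 5 Mbps
uhd_years = years_to_watch(25)   # assumed 4K bitrate: 25 Mbps

print(f"HD: about {hd_years:.0f} years")
print(f"4K: about {uhd_years:.0f} years")
```

With those bitrates the answer comes out to roughly 51 years of nonstop HD viewing, or about 10 in 4K. Higher-quality encodes would shorten both.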

Perhaps more importantly, quant trading firms are nothing if not money factories. Jane Street reported $16.1 billion of revenue and $10.3 billion of profit in just the first quarter of this year. 

By comparison, OpenAI lost $6.95 billion in the quarter.

But it’s not just the money and data that attract AI researchers to trading firms. It’s the intellectual challenge, too.

“All of the different problems of the world end up influencing what you’re doing in a trading context,” Minsky notes. “Because, at the end of the day, trading involves figuring out what things are worth, which means making predictions.”

This, he says, makes trading “AGI complete”: every unsolved problem on the road to artificial general intelligence can be explored through the pursuit of better price predictions. 

What researcher wouldn’t want to work on that?

If financial markets get a little more efficient in the process, that will be a nice bonus.

(Not to mention the salmon.)

— Byron Gilliam
