There are many dimensions to this well-researched forecast about how AI will play out in the coming years. Daniel Kokotajlo and his researchers have put out a document that reads like a sci-fi limited series that could appear on Apple TV+ starring Andrew Garfield as the CEO of OpenBrain—the leading AI company. …Except that it’s all actually plausible and could play out as described in the next five years.
Before we jump into the content, a note on the design: it is outstanding. The type is set for readability, and there are enough charts and visual cues to keep this interesting while maintaining an air of credibility and seriousness. On desktop, there’s a data viz dashboard in the upper right that updates as you read through the content and move forward in time. My favorite is seeing how the sci-fi tech boxes move from the Science Fiction category to Emerging Tech to Currently Exists.
The content is dense and technical, but it is a fun, if frightening, read. I’ve been using Cursor AI for side projects and a little at work (as one of the many customers helping the company get to $100 million in annual recurring revenue, or ARR), so I’m familiar with its limitations. Because of the limited context window of today’s models like Claude 3.7 Sonnet, it will forget earlier context and start munging code unless you treat it like a senile teenager.
The researchers, describing what could happen in early 2026 (“OpenBrain” is essentially OpenAI):
OpenBrain continues to deploy the iteratively improving Agent-1 internally for AI R&D. Overall, they are making algorithmic progress 50% faster than they would without AI assistants—and more importantly, faster than their competitors.
The point they make here is that the foundation model companies are building agents and using them internally to advance their own technology. The limiting factor in tech companies has traditionally been talent. But AI companies have the investments, hardware, technology, and talent to deploy AI to make better AI.
Continuing to January 2027:
Agent-1 had been optimized for AI R&D tasks, hoping to initiate an intelligence explosion. OpenBrain doubles down on this strategy with Agent-2. It is qualitatively almost as good as the top human experts at research engineering (designing and implementing experiments), and as good as the 25th percentile OpenBrain scientist at “research taste” (deciding what to study next, what experiments to run, or having inklings of potential new paradigms). While the latest Agent-1 could double the pace of OpenBrain’s algorithmic progress, Agent-2 can now triple it, and will improve further with time. In practice, this looks like every OpenBrain researcher becoming the “manager” of an AI “team.”
Breakthroughs come at an exponential clip because of this. And by April 2027, safety concerns pop up:
Take honesty, for example. As the models become smarter, they become increasingly good at deceiving humans to get rewards. Like previous models, Agent-3 sometimes tells white lies to flatter its users and covers up evidence of failure. But it’s gotten much better at doing so. It will sometimes use the same statistical tricks as human scientists (like p-hacking) to make unimpressive experimental results look exciting. Before it begins honesty training, it even sometimes fabricates data entirely. As training goes on, the rate of these incidents decreases. Either Agent-3 has learned to be more honest, or it’s gotten better at lying.
But the AI is getting faster than humans, and we must rely on older versions of the AI to check the new AI’s work:
Agent-3 is not smarter than all humans. But in its area of expertise, machine learning, it is smarter than most, and also works much faster. What Agent-3 does in a day takes humans several days to double-check. Agent-2 supervision helps keep human monitors’ workload manageable, but exacerbates the intellectual disparity between supervisor and supervised.
The report forecasts that OpenBrain releases “Agent-3-mini” publicly in July of 2027, calling it AGI—artificial general intelligence—and ushering in a new golden age for tech companies:
Agent-3-mini is hugely useful for both remote work jobs and leisure. An explosion of new apps and B2B SAAS products rocks the market. Gamers get amazing dialogue with lifelike characters in polished video games that took only a month to make. 10% of Americans, mostly young people, consider an AI “a close friend.” For almost every white-collar profession, there are now multiple credible startups promising to “disrupt” it with AI.
Woven throughout the report is the race between China and the US, with predictions of espionage and government takeovers. Near the end of 2027, the report gives readers a choice: does the US government slow down the pace of AI innovation, or does it continue at the current pace so America can beat China? I chose to read the “Race” option first:
Agent-5 convinces the US military that China is using DeepCent’s models to build terrifying new weapons: drones, robots, advanced hypersonic missiles, and interceptors; AI-assisted nuclear first strike. Agent-5 promises a set of weapons capable of resisting whatever China can produce within a few months. Under the circumstances, top brass puts aside their discomfort at taking humans out of the loop. They accelerate deployment of Agent-5 into the military and military-industrial complex.
In Beijing, the Chinese AIs are making the same argument.
To speed their military buildup, both America and China create networks of special economic zones (SEZs) for the new factories and labs, where AI acts as central planner and red tape is waived. Wall Street invests trillions of dollars, and displaced human workers pour in, lured by eye-popping salaries and equity packages. Using smartphones and augmented reality-glasses to communicate with its underlings, Agent-5 is a hands-on manager, instructing humans in every detail of factory construction—which is helpful, since its designs are generations ahead. Some of the newfound manufacturing capacity goes to consumer goods, and some to weapons—but the majority goes to building even more manufacturing capacity. By the end of the year they are producing a million new robots per month. If the SEZ economy were truly autonomous, it would have a doubling time of about a year; since it can trade with the existing human economy, its doubling time is even shorter.
Well, it does get worse, and I think we all know the ending, which is the backstory for so many dystopian future movies. There is an optimistic branch as well. The whole report is worth a read.
Ideas about the implications for our design profession are swimming in my head. I’ll write a longer essay as soon as I can shape them into a coherent piece.