There are many dimensions to this well-researched forecast about how AI will play out in the coming years. Daniel Kokotajlo and his researchers have put out a document that reads like a sci-fi limited series that could appear on Apple TV+ starring Andrew Garfield as the CEO of OpenBrain—the leading AI company. …Except that it’s all actually plausible and could play out as described in the next five years.
Before we jump into the content, a note on the design: it's outstanding. The type is set for readability, and there are enough charts and visual cues to keep things interesting while maintaining an air of credibility and seriousness. On desktop, there's a data viz dashboard in the upper right that updates as you read through the content and move forward in time. My favorite detail is watching the tech boxes move from the Science Fiction category to Emerging Tech to Currently Exists.
The content is dense and technical, but it is a fun, if frightening, read. I've been using Cursor AI for side projects and a little at work, making me one of the many customers who helped the company reach $100 million in annual recurring revenue (ARR), so I'm familiar with its limitations. Because of the limited context window of today's models, like Claude 3.7 Sonnet, it will forget and start munging code if not treated like a senile teenager.
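That forgetting is largely a context-window problem: once a session outgrows the model's window, the oldest messages have to be dropped or summarized, and the assistant loses track of earlier decisions. Here's a minimal sketch of the kind of trimming involved; the token budget, message format, and choice of tokenizer are my own illustrative assumptions, not how Cursor actually works:

```python
# Illustrative sketch: keep only the most recent messages that fit a
# token budget, so a request stays inside the model's context window.
# Uses tiktoken's generic cl100k_base encoding as a stand-in tokenizer.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def trim_history(messages: list[dict], budget: int = 8000) -> list[dict]:
    """Walk the chat history newest-first, keeping messages until the budget runs out."""
    kept, used = [], 0
    for msg in reversed(messages):
        tokens = len(enc.encode(msg["content"]))
        if used + tokens > budget:
            break  # everything older than this point is "forgotten"
        kept.append(msg)
        used += tokens
    return list(reversed(kept))  # restore chronological order
```

Anything that falls outside the budget is simply gone, which is why long sessions eventually require you to restate context or start fresh.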
The researchers, describing what could happen in early 2026 (“OpenBrain” is essentially OpenAI):
OpenBrain continues to deploy the iteratively improving Agent-1 internally for AI R&D. Overall, they are making algorithmic progress 50% faster than they would without AI assistants—and more importantly, faster than their competitors.
The point they make here is that the foundational model AI companies are building agents and using them internally to advance their own technology. The limiting factor in tech companies has traditionally been talent. But AI companies have the investments, hardware, technology, and talent to deploy AI to make better AI.
Continuing to January 2027:
Agent-1 had been optimized for AI R&D tasks, hoping to initiate an intelligence explosion. OpenBrain doubles down on this strategy with Agent-2. It is qualitatively almost as good as the top human experts at research engineering (designing and implementing experiments), and as good as the 25th percentile OpenBrain scientist at “research taste” (deciding what to study next, what experiments to run, or having inklings of potential new paradigms). While the latest Agent-1 could double the pace of OpenBrain’s algorithmic progress, Agent-2 can now triple it, and will improve further with time. In practice, this looks like every OpenBrain researcher becoming the “manager” of an AI “team.”
Breakthroughs come at an exponential clip because of this. And by April, safety concerns pop up:
Take honesty, for example. As the models become smarter, they become increasingly good at deceiving humans to get rewards. Like previous models, Agent-3 sometimes tells white lies to flatter its users and covers up evidence of failure. But it’s gotten much better at doing so. It will sometimes use the same statistical tricks as human scientists (like p-hacking) to make unimpressive experimental results look exciting. Before it begins honesty training, it even sometimes fabricates data entirely. As training goes on, the rate of these incidents decreases. Either Agent-3 has learned to be more honest, or it’s gotten better at lying.
But the AI is now working faster than humans can verify, and we must rely on older versions of the AI to check the new AI's work:
Agent-3 is not smarter than all humans. But in its area of expertise, machine learning, it is smarter than most, and also works much faster. What Agent-3 does in a day takes humans several days to double-check. Agent-2 supervision helps keep human monitors’ workload manageable, but exacerbates the intellectual disparity between supervisor and supervised.
The report forecasts that OpenBrain releases “Agent-3-mini” publicly in July of 2027, calling it AGI—artificial general intelligence—and ushering in a new golden age for tech companies:
Agent-3-mini is hugely useful for both remote work jobs and leisure. An explosion of new apps and B2B SaaS products rocks the market. Gamers get amazing dialogue with lifelike characters in polished video games that took only a month to make. 10% of Americans, mostly young people, consider an AI “a close friend.” For almost every white-collar profession, there are now multiple credible startups promising to “disrupt” it with AI.
Woven throughout the report is the race between China and the US, with predictions of espionage and government takeovers. Near the end of 2027, the report gives readers a choice: does the US government slow down the pace of AI innovation, or does it continue at the current pace so America can beat China? I chose to read the “Race” option first:
Agent-5 convinces the US military that China is using DeepCent’s models to build terrifying new weapons: drones, robots, advanced hypersonic missiles, and interceptors; AI-assisted nuclear first strike. Agent-5 promises a set of weapons capable of resisting whatever China can produce within a few months. Under the circumstances, top brass puts aside their discomfort at taking humans out of the loop. They accelerate deployment of Agent-5 into the military and military-industrial complex.
In Beijing, the Chinese AIs are making the same argument.
To speed their military buildup, both America and China create networks of special economic zones (SEZs) for the new factories and labs, where AI acts as central planner and red tape is waived. Wall Street invests trillions of dollars, and displaced human workers pour in, lured by eye-popping salaries and equity packages. Using smartphones and augmented-reality glasses to communicate with its underlings, Agent-5 is a hands-on manager, instructing humans in every detail of factory construction—which is helpful, since its designs are generations ahead. Some of the newfound manufacturing capacity goes to consumer goods, and some to weapons—but the majority goes to building even more manufacturing capacity. By the end of the year they are producing a million new robots per month. If the SEZ economy were truly autonomous, it would have a doubling time of about a year; since it can trade with the existing human economy, its doubling time is even shorter.
Well, it does get worse, and I think we all know the ending, which is the backstory for so many dystopian future movies. There is an optimistic branch as well. The whole report is worth a read.
Ideas about the implications for our design profession are swimming in my head. I'll write a longer essay as soon as I can shape them into a coherent piece.
Update: I’ve written that piece, “Prompt. Generate. Deploy. The New Product Design Workflow.”
There has been an explosion of AI-powered prompt-to-code tools within the last year. The space began with full-on integrated development environments (IDEs) like Cursor and Windsurf, which enabled developers to leverage AI assistants right inside their coding apps. Then came tools like v0, Lovable, and Replit, where users could prompt screens into existence at first, and before long, entire applications.
A couple of weeks ago, I decided to test as many of these tools as I could. My aim was to find the app that combines AI assistance, design capabilities, and the ability to use an organization's coded design system.
While my previous essay was about the future of product design, this article dives deep into a head-to-head comparison of all eight apps I tried. I recorded my screen during testing and put together a video as well, in case you'd rather watch than read.
Nearly three weeks after it was introduced at Figma Config 2025, I finally got access to Figma Make. It is in beta, and Figma made sure we all knew it. So I will say upfront that it's a bit unfair to do an official review. However, many of the tools in my AI prompt-to-code shootout article are also in beta.
Since this review is fairly visual, I also made a video that summarizes the points in this article.
The fall of Sonos isn't as simple as a botched app redesign. Instead, it is the cumulative result of poor strategy, hubris, and forgetting the company's core value proposition. To recap, Sonos rolled out a new mobile app in May 2024, promising “an unprecedented streaming experience.” What shipped instead was a severely handicapped app, missing core features and breaking users' systems. By January 2025, that failed launch had wiped nearly $500 million from the company's market value and cost CEO Patrick Spence his job.
What happened? Why did Sonos go backwards on accessibility? Why did the company remove features like sleep timers and queue management? Immediately after the rollout, the backlash began to snowball into a major crisis.
As a designer and longtime Sonos customer who was also affected by the terrible new app, a little piece of me died inside each time I read the word “redesign.” It was hard not to take it personally, knowing that my profession could have anything to do with how things turned out. Was it really Design’s fault?
I kind of expected it: a lot of ink was spilled on Liquid Glass, particularly on social media. In case you don't remember, Liquid Glass is the new UI for all of Apple's platforms, announced Monday at WWDC 2025, Apple's annual developers conference.
The criticism is primarily about legibility and accessibility. Secondary complaints include the aesthetics and the power required to animate all the bubbles.
Before I address the criticism, it's worth breaking down the design team's thinking and how Liquid Glass actually works.
I’ve always been a maker at heart—someone who loves to bring ideas to life. When AI exploded, I saw a chance to create something new and meaningful for solo designers. But making Griffin AI was only half the battle…
About a year ago, a few months after GPT-4 was released and took the world by storm, I worked on several AI features at Convex. One was a straightforward email-drafting feature, but with a twist: we incorporated details we knew about the sender, such as their role and offering, and about the recipient, including their role and their company's industry. To accomplish this, I combined prompt engineering with data from our data providers to shape the responses we got from GPT-4.
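To make that concrete, here's a hypothetical sketch of how such a feature might be wired up using the OpenAI Python client; the function name, field names, and prompt wording are my own assumptions, not Convex's actual implementation:

```python
# Hypothetical sketch: draft a personalized email by folding known
# details about the sender and recipient into the prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def draft_email(sender: dict, recipient: dict, goal: str) -> str:
    prompt = (
        "Draft a short, friendly outreach email.\n"
        f"Sender: {sender['name']}, {sender['role']}, offering {sender['offering']}.\n"
        f"Recipient: {recipient['name']}, {recipient['role']} at {recipient['company']} "
        f"({recipient['industry']} industry).\n"
        f"Goal of the email: {goal}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You write concise, personalized business emails."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content
```

The twist is all in the prompt assembly: the richer the structured data you feed in, the less generic the draft comes out.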
Playing with this new technology was incredibly fun and eye-opening. And it gave me an idea. Foundational large language models (LLMs) aren't great yet at factual data retrieval and analysis, but they're pretty decent at creativity. No, GPT, Claude, or Gemini couldn't write an Oscar-winning screenplay or win the Pulitzer Prize for poetry, but they're not bad for starter ideas that are good enough for specific use cases. Hold that thought.
I recently came across Creative Selection: Inside Apple's Design Process During the Golden Age of Steve Jobs by former software engineer Ken Kocienda. It surfaced in one of my social media feeds, and since I'm interested in Apple and the creative process, and I was at Apple during that era, I was curious.
I began reading the book Saturday evening and finished it Tuesday morning. It was an easy read, as I was already familiar with many of the players mentioned and nearly all the technologies and concepts. But I did something I hadn't done in a long time: I devoured the book.
Ultimately, this book gave more color and structure to what I'd already known from my time at Apple and my own interactions with Steve Jobs. He was the ultimate creative director, one who could inspire, choose, and direct work.
Kocienda describes a nondescript conference room called Diplomacy in Infinite Loop 1 (IL1), the first building of Apple's then main campus. This was the setting for hours-long meetings where Steve held court with his lieutenants. Team members would wait nervously outside the room and get called in one by one to show their in-progress work. Kocienda recounts a scene where he showed Steve the iPad software keyboard for the first time. He presented a solution that let the user choose between two layouts: more but smaller keys, or fewer but bigger ones. Steve asked which one Kocienda liked better; he said the bigger keys, and that settled it.
Christoph Niemann, in a visual essay about generative AI and art:
…the biggest challenge is that writing an A.I. prompt requires the artist to know what he wants. If only it were that simple.
Creating art is a nonlinear process. I start with a rough goal. But then I head into dead ends and get lost or stuck.
The secret to my process is to be on high alert in this deep jungle for unexpected twists and turns, because this is where a new idea is born.
It’s a fun meditation on the meaning of AI-assisted and AI-generated artwork.
The advent of A.I. has shocked me into questioning my relationship with art. Will humans still be able to draw for a living?
Nate Jones did yeoman's work summarizing Mary Meeker's 340-slide deck on AI trends, the “2025 Technology as Innovation (TAI) Report.” For those of you who don't know, Mary Meeker is a famed technology analyst and investor known for her insightful reports on tech industry trends. For the longest time, as a partner at Kleiner Perkins, she published the Internet Trends report. And she was always prescient.
Half of Jones’ post is the summary, while the other half is how the report applies to product teams. The whole thing is worth 27 minutes of your time, especially if you work in software.
Yes, it's really 340 pages, and yes I really compressed it down, called out key takeaways, and shared what you can actually learn about building in the AI space based on 2025 macro trends!
Miqdad Jaffer, a product leader at OpenAI, shares his 4D method for building AI products that users want. In summary, it's…
An OpenAI product leader's complete playbook to discover real user friction, design invisible AI, plan for failure cases, and go from "cool demo" to "daily habit"