
379 posts tagged with “ai”

Arpan Patel wrote a nice consolidated Claude Code reference: the directory layout, CLAUDE.md the way Anthropic’s Boris Cherny writes it, skills, subagents, MCPs, the underused commands. The whole guide turns on one shift:

Claude Code clicked for me once I quit treating it like ChatGPT in a terminal. The mental model flipped from “I need to write this code” to “I need to set Claude up to write this code well.” Setup is the work. Execution is verification.

If you use Claude Code daily, bookmark it.

Screenshot of the article page at arps18.github.io.

Beyond the Prompt: Claude Code

A field guide to using Claude Code as an agent, not a chatbot: the .claude directory, CLAUDE.md, skills, subagents, and the verification loops that make delegation work.

arps18.github.io

Emma Webster, writing for Figma, argues that AI tools are pulling prototyping earlier in the product process: teams can validate with more fidelity, then carry design context forward instead of recreating it at handoff.

Product teams are rapidly adapting to the new way of working in the AI era. They’re prototyping before writing specs, testing in code before designing, exploring at unprecedented scale, and shipping with design system context that used to get lost in the handoff. We talked to product builders at FloQast, Merkle, Affirm, and Accor about how that’s playing out in practice.

The shift is about when the hard questions show up. Specs used to be the artifact that let teams pretend they had alignment. The team behind Claude Design skipped the PRD entirely and prototyped its way to the answer instead. With AI tools, a prototype can become the first serious question: does this flow hold up when the data, logic, motion, and system constraints are present?

Webster describes the code-to-canvas loop this way:

Testing an idea against intricate constraints—things like multi-step flows where one action triggers the next, or interfaces that behave differently depending on the data behind them—used to require significant developer investment. Today, AI coding tools have made it possible for more people on a product team to quickly build and test these kinds of interactions before committing to a direction. That’s opened up a new workflow. A product builder can create a working prototype in code, then move it onto the Figma canvas using Codex to Figma to see the full picture and refine it together. From there, if more work needs to happen in code, they can move back via MCP with the design context intact.

This is the version of AI-assisted design I care about. Not “prompt a shiny UI from a blank page.” A working model becomes the place where designers, PMs, and engineers can see the same problem at once. Figma is still the canvas for style exploration and visualizing complex flows, but the decision surface is becoming more product-shaped.

That matters because prototype fidelity changes what a team is allowed to learn. A flat mockup can test preference and comprehension. A working prototype can test sequencing, edge cases, permissions, motion, and whether the idea survives contact with real constraints. Bringing that earlier into the process should make design less speculative, not less thoughtful.

The Accor example shows why this matters before anyone commits to a build:

Justine opened Figma Make and prototyped something she wouldn’t have had time to build by hand—a webpage that reorganizes itself based on what the user types. Search for “golf” and the page reshapes around properties with golf courses, curated outings, and relevant experiences. Make handled the micro-interactions and transitions, and the Figma MCP server kept everything connected to the brand’s design system. Within days, she had a working prototype ambitious enough to show leadership what was possible—and concrete enough to start a real conversation about what to build next.

Webster’s Affirm example carries the same logic all the way into production:

A PM prototyped the badge variations in Figma Make—going from idea to working prototype in two days instead of the usual six weeks. Designers refined the winning direction on the canvas, and when the team was ready to move that design into production, they loaded the design artifacts into the Figma MCP server and connected it to Cursor. MCP passed the components, tokens, and layout structure directly into the coding environment, where an AI agent generated the front-end implementation. Developers used that as their starting point, building production code that already reflected the designs instead of reinterpreting them from scratch.

Preserving components, tokens, and layout structure turns the prototype into a rehearsal for the real build. It has enough fidelity to expose bad directions early and enough context to keep the winning direction from being rebuilt from memory.

Header image for the Figma blog post on AI tools for going from idea to product.

4 New Ways to Go From Idea to Product With AI Tools

AI tools are changing how teams build products—from where they start to what carries through to production.

figma.com

Simon Willison thinks the AI labs have found product-market fit. Here’s his own monthly usage priced at API rates against the $200 he actually pays:

  • $1,199.79 for Anthropic Claude Code
  • $980.37 for OpenAI Codex

That’s $2,180.16 worth of tokens for $200—not bad at all! I’m a moderately heavy user of these tools, but I’m certainly not running agents every hour of the day and night.
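To make the arithmetic in that quote concrete, here is a minimal sketch that redoes the sum and works out the multiple. The dollar figures are the ones quoted above; the variable names are mine.

```python
# Figures quoted from Willison's post; names here are illustrative only.
api_priced_usage = {
    "Anthropic Claude Code": 1199.79,
    "OpenAI Codex": 980.37,
}
subscription_paid = 200.00  # the $200 he actually pays

# Round to cents to avoid floating-point noise in the sum.
total = round(sum(api_priced_usage.values()), 2)
multiple = total / subscription_paid

print(f"API-priced usage: ${total:,.2f}")
print(f"Roughly {multiple:.1f}x the subscription price")
```

On those numbers the subscription covers a little under eleven times its price in API-rate tokens.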

That discount is gone: since April 2026 enterprises pay full API rates. Willison’s read:

Coding agents really did change everything. These are tools which burn vastly more tokens, but are also quickly becoming daily drivers for the work carried out by extremely well-compensated professionals. Right now that’s still mostly software engineers, but a coding agent is a tool that can automate anything you can do by typing commands into a computer… so they are clearly applicable to a much wider set of skilled knowledge workers.

Right now the bill falls on engineers. Designers may be next. Anthropic has already rolled out a separate usage meter for Claude Design. And Figma is charging for AI usage overages.

Screenshot of the article page at simonwillison.net.

I think Anthropic and OpenAI have found product-market fit

Simon Willison reads the coding-agent boom through pricing: enterprises shifting from discounted seats to usage-based bills as Claude Code and Codex become daily tools.

simonwillison.net

Mozilla.ai’s Alejandro Gonzalez asks a useful question for designers working through agent-native software: what is the agent actually editing in the first place?

He starts with Claude Design but the piece is really about the old software contract underneath most productivity tools:

Most human-computer interaction has been built around two patterns: issuing commands (typing, clicking, speaking) and manipulating representations (dragging, resizing, arranging, formatting). Every productivity tool ever built is designed around one or both of those. The keyboard, the mouse, the touchscreen. That is the full vocabulary. The interface and the product were, for practical purposes, the same thing.

This is a useful distinction for designers because it doesn’t treat UI as decoration. It treats UI as a historical compromise: the thing humans needed in order to reach the state underneath. Agents put pressure on that compromise because they don’t need the same surface.

Gonzalez is careful about the transition, though:

This is useful. More than useful, it is probably necessary. The world already runs on existing software. Companies have years of organizational knowledge embedded in Gmail, Slack, Jira, Salesforce, Notion. If agents are going to be helpful today, they need to work inside that world.

That is the bridge.

But the bridge is not the destination. Agents using existing apps help bring AI into the current software stack. Apps built for agents may change the shape of the stack itself.

And there is something more valuable in that process than just short-term utility. Watching where agents struggle with existing interfaces, where the translation between intent and UI operation is most painful, is probably the most honest way to find where the structural opportunity is. The friction is the signal.

The agent failing to use a legacy screen isn’t only a product bug; it may be a map of where the product’s abstraction is wrong. For design teams, that shifts the work from polishing the path through a tool to naming the real object of work.

Gonzalez’s product-strategy example makes that abstraction concrete:

The source of truth for a product strategy is not the slide deck, the roadmap doc, the ticket board, or the dashboard. It is the strategy itself: the goals, the bets, the risks, the owners, the metrics, the decisions. Everything else is a view. The memo, the board deck, the launch checklist, the customer brief are renderings of the same underlying object, shaped for different audiences.

A product launch is not a Notion doc, a Linear project, a slide deck, and a dashboard. It is a product launch.

In Gonzalez’s framing, the deliverable is no longer the deck, the board, or the dashboard. The durable thing is the structured model that can be rendered, checked, diffed, approved, and exported.

That is the part that changes the designer’s job. If the source of truth is a structured object instead of a visible artifact, then design has to specify the object: its fields, constraints, permissions, states, failure modes, and views. The screen becomes one projection among many. A human may need a deck, an agent may need a schema, a manager may need a dashboard, and the system needs a versioned record of what changed. Those are not separate products if they are all renderings of the same underlying thing.
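One way to picture "one object, many renderings" is a small sketch like the following. It is not from Gonzalez's essay: the `Strategy` and `Bet` types, their fields, and the three render functions are hypothetical names I made up to show a memo, an agent-readable schema, and a dashboard all reading from the same record.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class Bet:
    # Hypothetical fields, chosen to echo "the bets, the risks, the owners, the metrics".
    name: str
    owner: str
    metric: str
    risk: str


@dataclass
class Strategy:
    goal: str
    bets: list[Bet] = field(default_factory=list)


def render_memo(strategy: Strategy) -> str:
    """A human-facing view: plain prose lines."""
    lines = [f"Goal: {strategy.goal}"]
    for bet in strategy.bets:
        lines.append(f"- {bet.name} (owner: {bet.owner}, risk: {bet.risk})")
    return "\n".join(lines)


def render_for_agent(strategy: Strategy) -> str:
    """An agent-facing view: the same object serialized as JSON."""
    return json.dumps(asdict(strategy), indent=2, sort_keys=True)


def render_dashboard(strategy: Strategy) -> dict[str, str]:
    """A manager-facing view: which owner answers for which metric."""
    return {bet.metric: bet.owner for bet in strategy.bets}


# Made-up example data.
strategy = Strategy(
    goal="Grow self-serve signups",
    bets=[Bet("Simplify onboarding", "Priya", "activation rate", "scope creep")],
)

memo = render_memo(strategy)
agent_view = render_for_agent(strategy)
dashboard = render_dashboard(strategy)
print(memo)
```

The design choice worth noticing is that none of the views stores anything of its own. Change the owner on the bet and the memo, the JSON, and the dashboard all change with it.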

Gonzalez closes by keeping the old tools in the frame:

The old tools will not vanish quickly. They have distribution, habits, enterprise contracts, file compatibility, and decades of user training on their side. But the center of gravity moves. The work happens in the agent-native system. The legacy app receives the export.

I do not think this transition will be clean. The old world will remain around us for a long time. People will still export PowerPoint files, update spreadsheets, paste things into email, and manage work through tools that were designed before any of this existed.

But that feels increasingly like a transitional phase.

The more interesting future is not only agents operating apps. It is applications designed so agents, humans, and existing tools can all work with the same underlying objects.

Not because every app disappears but because the source of truth may move.

Illustration of transforming Platonic solids, the header image for the Mozilla.ai essay.

The Interface Is No Longer the Product

The future of AI may not be agents using today’s apps but apps rebuilt around structured objects agents can inspect and edit directly. The deck or dashboard becomes one view.

blog.mozilla.ai

Jakob Nielsen starts from OpenAI’s new $4 billion consulting arm and its acquihire of 150 Forward Deployed Engineers, the kind who embed with a client instead of building from headquarters. His argument is that they solve the wrong level of the problem: you can speed up the work without changing it. The missing counterpart he proposes is the Forward Deployed Designer.

Nielsen draws the line between faster individual work and a faster business:

With AI, the old workflows must no longer be treated as the design brief; they must be questioned at the root. AI is, in fact, a great productivity enhancer even when used in the traditional way to increase the efficiency of individual employees performing the same tasks as always, just faster and better. A paralegal can summarize a legal brief in seconds; a junior developer can write boilerplate code instantly; a digital marketer can generate campaign copy with a single prompt. We can typically improve that employee’s performance on those specific rote tasks by roughly 40%.

But at the company-wide level, such local productivity gains rarely translate into substantial profit gains and shareholder value. When you have a long chain of steps and optimize only a few, the delay simply shifts to the remaining steps, which will dominate the overall time to solve the underlying problem.
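The chain-of-steps point is easy to check with back-of-the-envelope numbers. The sketch below is illustrative only: the step names and durations are invented, and it assumes "roughly 40%" means the AI-assisted steps run 1.4 times faster.

```python
# Hypothetical workflow: step name -> (days, whether AI speeds it up).
steps = {
    "draft": (2.0, True),
    "review": (5.0, False),
    "approval": (8.0, False),
    "build": (3.0, True),
    "release": (2.0, False),
}
AI_SPEEDUP = 1.4  # assumption: "roughly 40%" read as 1.4x faster

before = sum(days for days, _ in steps.values())
after = sum(
    days / AI_SPEEDUP if sped_up else days
    for days, sped_up in steps.values()
)
overall = before / after

print(f"Before: {before:.1f} days, after: {after:.1f} days")
print(f"End-to-end speedup: {overall:.2f}x")
```

With these made-up numbers, a 40% gain on two of the five steps shortens the whole chain by only about 7%, because review and approval still account for 13 of the original 20 days.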

Nielsen again:

Once AI removes the cognitive bottleneck, a different bottleneck appears: authority. The limiting question becomes not “Can the system know what to do?” but “Is the system allowed to do it?” AI-native workflow design must therefore redesign decision rights, escalation rules, audit trails, and accountability boundaries. Otherwise, the organization merely replaces slow human cognition with fast machine recommendations waiting for the same old human permission structure.

Title graphic for Jakob Nielsen's UX Tigers essay on Forward Deployed Designers.

Forward Deployed Designers: From FDE to FDD

Jakob Nielsen argues enterprise AI needs Forward Deployed Designers who redesign whole workflows, decision rights, and handoffs—not just engineers who make individual tasks faster.

uxtigers.com

Design leaders spend a lot of time telling teams to experiment with AI. Nathan Lavertue, a Design Program Director for IBM Z and LinuxONE, turns that advice back on leadership itself:

We spend a lot of time helping designers understand how to work with AI. The question I keep coming back to is simpler. How many of us are doing the same for ourselves in ways that meaningfully support the business?

So instead of just encouraging my teams to experiment with AI tools, I put myself in the work. I built a design program signals website using IBM Bob. What started as a wireframe to sketch out an idea became something I realized I could actually build myself. That surprise is the whole story.

I appreciate the reciprocity here. If designers are being asked to work through this shift in public, leadership cannot treat AI as a strategy deck it reviews from a comfortable distance. You cannot build useful judgment about these tools by asking other people to absorb the uncertainty for you. That is why I’ve been reading about them, writing about them, and experimenting with them on my own. Whether it’s OpenClaw, Hermes, running a local LLM, ComfyUI, or Claude Design, curiosity is key here.

The interesting part of Lavertue’s example is not that he made a dashboard. Dashboards are cheap. The useful part is that he used AI to make a leadership problem legible enough to discuss. His signals site pulled together team health and business impact, then sorted indicators into required, expected, and optional categories so the absence of a signal became something to interpret, not just a blank cell to punish.

Lavertue is clear that the tool was never the point:

I had to remind myself of that more than once while building it. The signals site was useful. Bob was a capable collaborator. But the risk with any tool that comes together quickly is mistaking the build for the point. The site was never the outcome. It was infrastructure for conversations. Design’s impact on the business was the outcome. Keeping that distinction clear required the same discipline I would ask of any designer getting excited about a new tool.

The site did not replace leadership judgment. It grounded it. Instead of reacting to delayed updates or anecdotal signals, I could engage teams with shared context and a clearer ability to look forward rather than back. This was another form of walking the walk. Not just encouraging teams to work differently but building the system that made that work visible and meaningful.

That feels like the better bar for AI-native leadership. Not “leaders should code now.” Not “every management problem needs a custom tool.” The bar is whether leaders are willing to put their own work through the same change they are asking from their teams.

Title card for an IBM Design essay on design leadership in an AI-native world.

Walking the Walk: Design Leadership in an AI-Native World

Design leaders keep telling teams to experiment with AI. Nathan Lavertue turns the advice on himself, building a signals site with AI to make leadership decisions legible.

medium.com

Felipe A. Carriço, a UX designer and AI product builder, uses A11Y.md to turn accessibility guidance into context that AI coding agents have to follow:

A11Y.md is not a guideline. It is an accessibility validation protocol and a persistent context architecture for developing accessible software with AI. It is designed to integrate with AI agent systems and human review workflows to ensure certifiable compliance.

By adopting the mental model of Anthropic’s CLAUDE.md—which acts as a system prompt memory for code generation—A11Y.md translates this architecture into a universal, portable governance layer. Instead of generic coding rules, it forces any coding agent (Claude, Cursor, Copilot) to strictly adhere to WCAG 2.2 AA and ADA standards from the very first line of generated UI code.

I appreciate how operational this is. It pairs well with Joost de Valk’s Website Specification, which treats machine-readable standards as part of what a good site does. A11Y.md brings the same idea into the build process: the generator has to carry the accessibility context while it makes the UI. That matters because accessibility failures in generated code are rarely abstract. They show up as broken keyboard paths, silent error states, and interface logic that only works for the person who can see and click everything.

Carriço is blunt about the difference between reading and changing the workflow:

Reading about accessibility is the first step, injecting it into your code is the real goal. Do this right now in your project:

  1. Download the Rules: Copy the A11Y.md file from docs/en/ to the root of your application’s repository.
  2. Inject into the Prompt: If you use Cursor, GitHub Copilot, or Claude, add this to your global rules file (.cursorrules or Context system):

“Strictly follow the development rules defined in the A11Y.md file.”

  3. Use as a Quality Gate: Before merging important PRs, use the checklist in docs/en/templates/REPORT.md.

If you do not perform the steps above, you are not changing your workflow — you are just reading about the subject.

That is the product here: wiring accessibility into the build process so it changes what gets generated.
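If you want to confirm the first two steps actually happened in a repository, a check along these lines could run locally or in CI. This is a sketch of my own, not part of the A11Y.md project: the function name is hypothetical, it assumes the rules file is `.cursorrules`, and it looks for the exact sentence quoted above.

```python
from pathlib import Path
import tempfile

# The instruction line quoted from the A11Y.md README.
RULE_LINE = "Strictly follow the development rules defined in the A11Y.md file."


def check_a11y_wiring(repo_root: Path, rules_file: str = ".cursorrules") -> list[str]:
    """Return a list of problems; an empty list means both steps are in place."""
    problems = []
    if not (repo_root / "A11Y.md").is_file():
        problems.append("A11Y.md is missing from the repository root")
    rules_path = repo_root / rules_file
    if not rules_path.is_file():
        problems.append(f"{rules_file} not found")
    elif RULE_LINE not in rules_path.read_text(encoding="utf-8"):
        problems.append(f"{rules_file} does not reference A11Y.md")
    return problems


# Demo against a throwaway directory standing in for a real repository.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    before = check_a11y_wiring(repo)
    (repo / "A11Y.md").write_text("# A11Y.md placeholder\n", encoding="utf-8")
    (repo / ".cursorrules").write_text(RULE_LINE + "\n", encoding="utf-8")
    after = check_a11y_wiring(repo)

print(before)
print(after)
```

A check like this only proves the files are wired up. Whether the agent's output meets WCAG still needs the review checklist from step 3.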

A11Y.md project banner showing the project name and accessibility badges for WCAG 2.2 AA and ADA compliance.

A context system for building accessible software by default — for developers and AI, with enforceable rules aligned to WCAG.

A persistent context architecture that enforces WCAG 2.2 AA and ADA standards from the first line of UI code—a governance layer for AI coding agents built on the CLAUDE.md mental model.

github.com

Joost de Valk, creator of the Yoast SEO plugin for WordPress, has turned the “what should a good website do?” question into The Website Specification: a platform-agnostic checklist that puts HTML basics, SEO, accessibility, security, performance, privacy, internationalization, and agent readiness in one place.

The useful shift is that the AI-facing work is treated as normal website hygiene. Not a separate “AI strategy” project. Not a prompt-engineering side quest. Just another part of making the site understandable to the systems that now read, rank, quote, and retrieve it.

A platform-agnostic specification of the technical features every decent website should have — from <title> to /.well-known/security.txt, from WCAG contrast to llms.txt. Written for humans and agents.

Ten areas, mapped to widely-accepted standards.

Each topic links back to the source standard — WHATWG, W3C, IETF RFCs, WCAG, MDN, and the organisations defining the modern web.

Whether you ship WordPress, Drupal, TYPO3, Next.js, Astro, Hugo, a Django app, or plain HTML, the spec is the spec. Implementation hints follow it, not the other way round.

I like that standards-first posture. A lot of AI advice still treats the web like a pile of pages to be scraped, summarized, and maybe attributed later. De Valk pulls it back toward contracts: stable URLs, explicit policies, structured data, clean source material, and machine-readable ways to discover what matters.

From the Agent Readiness section:

Agent readiness is a loose umbrella term for the choices that make a website legible to AI agents — chat assistants, autonomous browsers, retrieval pipelines, and any other non-human client that reads the web at scale. None of it is a single formal standard. It is a collection of existing web fundamentals plus a few emerging conventions.

Agents read the same HTML as browsers, but they read it differently. They:

  • Fetch a page, often without executing JavaScript.
  • Strip away navigation, ads, and chrome to extract the main content.
  • Follow links, structured data, and well-known endpoints to discover more.
  • Cache and quote your content in answers, with or without a link back.

If your content is locked behind client-side rendering, your URLs change every release, or your robots.txt blocks the assistants your customers use, you are invisible in that surface. The pages that win in agent answers are the ones that are easy to fetch, easy to parse, and easy to trust.

That’s the part designers should pay attention to. We tend to think of the interface as the thing on the screen. But if agents are part of the audience now, the interface also includes off-screen surfaces: metadata that explains the page, feeds and sitemaps that expose what exists, crawler policies that say what can be read, and curated indexes like llms.txt that tell software what matters.

De Valk again:

There is no single switch. The items in this category each cover one part:

  • Stable URLs so cached answers stay valid.
  • Structured data (JSON-LD) so agents can extract entities without guessing.
  • Clean semantic HTML so content extraction does not pull in navigation.
  • A robots.txt that names AI crawlers explicitly so your policy is unambiguous.
  • /llms.txt as a curated index of your most important content (emerging).
  • Machine-readable endpoints — sitemaps, RSS, JSON feeds — where they fit.
  • MCP server endpoints for sites that expose tools or actions (emerging).

Most of these also benefit traditional search engines and accessibility. Agent readiness rarely conflicts with the rest of the spec; it just raises the priority of things that have always been good practice.

De Valk’s point is simple: agent readiness mostly means doing the old web discipline well enough that agents can actually read and trust the site.
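The robots.txt item is the easiest one to test for yourself. Here is a minimal sketch using Python's standard-library parser. The policy text is invented for illustration, `GPTBot` stands in as an example of a named AI crawler, and `example.com` is a placeholder. Real crawlers may resolve overlapping rules differently from this parser, so treat it as a sanity check on your own policy, not a guarantee of crawler behavior.

```python
from urllib.robotparser import RobotFileParser

# Invented policy: one named AI crawler gets /docs/ only; everyone else
# is kept out of /private/ and allowed elsewhere.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /docs/
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

named_docs = parser.can_fetch("GPTBot", "https://example.com/docs/intro")
named_pricing = parser.can_fetch("GPTBot", "https://example.com/pricing")
other_pricing = parser.can_fetch("SomeOtherBot", "https://example.com/pricing")

print(named_docs, named_pricing, other_pricing)
```

Naming the crawler explicitly is what makes the answer unambiguous: the named agent gets its own rules, and everything else falls through to the wildcard group.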

The Website Specification homepage, a platform-agnostic reference for what every good website should do.

The Website Specification

A platform-agnostic, full specification of the technical features a good website should have. Built in the open under an MIT licence.

specification.website

Dan Carey leads product at Anthropic Labs, the team behind Claude Code and Claude Design. In a talk on how a three-person team shipped Claude Design in ten weeks, he describes what happened to everyone else after their engineers got fast:

And so once Claude Code took off, the bottleneck moved. The bottleneck moved from building the feature to figuring out the right things to be building for your users, in a lot of cases. So the option was either skip those early steps, just try and decide on the fly, and potentially build the wrong thing really fast, or try to find ways for the rest of us to speed up. So our designers, our PMs, were having trouble keeping up. We needed our own accelerator tool.

Carey just relocated the bottleneck onto the exact work designers and PMs own: figuring out what’s worth building. That’s product discovery becoming the real constraint. When building gets cheap, what’s left to get right is the decision about what to build at all.

How does the team make that call? Not by writing it down:

So we like to use prototypes because documents are imprecise. It’s so easy for two people to look at the same doc and have two different products in mind about what the experience should be. […] Prototypes are more concrete, more visceral. They let you get hands on with the thing and really feel the experience yourself.

They skipped the PRD and the vision docs entirely. A working prototype immediately aligns people, and it doubles as the discovery tool: you build the rough thing to find out what the right thing is.

And it helped that the team was small enough to skip coordination entirely. Here’s Carey:

Everyone on the team does everything. The engineers talk to users, PMs write code, designers do data analysis. All of these things are enabled in part with Claude. And the lines between the roles on this team, they have essentially dissolved at this point. You do have your specialization, you do have the unique perspective and diversity that you bring to a team, but at any moment, any one of these people on this team can talk to 10 users, you can realize what the underlying problem is, you can design a solution to it, you can ship it to users, you can listen for feedback, you can keep iterating solo if you need to.

On Carey’s team, the designer who spots the problem also builds the solution and ships it. That’s the kind of role a lot of designers are now being asked to grow into, and it looks less like a handoff between specialists than one person carrying an idea from problem to finished screen.

Speed doesn’t guarantee you build the right thing, though, and Carey is candid about the team’s misses. They built a set of advanced, fine-grained controls for power users. A few vocal testers loved them—I know I would have. But the usage showed everyone else hated them, and the team pulled the controls in a week. Two lessons came out of it:

So this taught us a couple of things. One, this taught us that we should be a tool that lifts the level of craft for everybody, not just the ceiling on power users. It also taught us that we want to be as open as possible, because there will be users that we never meet the full needs of. There’s going to be some power user out there who wants to do something very specific that we’re not going to support. And that’s what convinced us that we wanted this to be a very open tool. That’s why if you export from it, you get HTML, CSS, JavaScript.

Designing with Claude: From prompt to production

Claude Design lets you describe what you want in plain language and get production-quality outputs. Learn how a small team built a design tool that ships in your brand, from prompt to production.

youtube.com

Deva Corriveau, Creative Director at Brandpie, writing for The Subtext, asks what happens when the customer doesn’t choose at all.

Take something mundane, like ordering a takeaway. Consumers don’t feel a deep emotional connection to whether dinner arrives via Just Eat, Deliveroo or Uber Eats. They may have habits or interface preferences, but they don’t meaningfully care about the logo on the rider’s jacket or the tone of voice in their advertising. Yet these businesses still spend tens of millions each year trying to build precisely that sense of distinction.

Now imagine a step beyond this. Instead of opening an app at all, the consumer simply instructs an AI assistant to order a pizza. The system scans available providers, evaluates delivery times, compares pricing, reviews reliability data, and executes the transaction. The entire process takes place behind a seamless, invisible layer of automation. The user does not browse. They do not compare. They are not exposed to campaigns or nudged by distinctive brand assets. The decision is simply optimized.

From a consumer perspective, this is seamless and efficient. From a brand perspective, it’s an unsettling shift in how choice is made and where influence sits.

That distinction matters. In a browser or an app, brand can still interrupt the customer at the point of comparison. Inside an agent, the brand has to show up as criteria the agent can evaluate.

Corriveau on the shift:

For decades, we’ve treated awareness as the foundation of growth. Be famous. Be distinctive. Be top of mind. When the moment of choice arrives, ensure your brand is mentally available. That logic remains sound – but only if a human is making the decision.

An AI agent does not remember your jingle or favour your colour palette. It does not feel reassured by your heritage or inspired by your purpose. It simply calculates against a defined set of criteria.

This does not mean brand disappears, but its role shifts. Marketeers must move upstream from the moment of choice to defining the parameters of the choice itself.

If the agent is comparing delivery time, service ratings, return policies, privacy history, and price, then the promises a brand makes need to map to service behaviors, policies, and performance an agent can actually evaluate. A promise the company can’t prove becomes decoration.

I don’t read this as “branding is dead.” Corriveau is saying something narrower: people still define preferences; automation changes when and how those preferences get expressed.

Discussions about automation often miss a critical point: humans still define the criteria. A user may delegate comparison and selection to an AI, but they still decide what it optimises for. They might instruct it to prioritize companies with high customer service ratings, favour businesses with strong sustainability credentials, or exclude brands that have suffered data breaches. Human values, identity, and worldview remain central – they are simply expressed differently.

Trust, identity signalling, ethical alignment: these human drivers do not disappear just because a machine intermediates the transaction. In fact, they may become more explicit. Rather than being subconsciously influenced by advertising, consumers will consciously encode their preferences into the system.

In that world, the role of brand becomes less about capturing attention in the moment and more about establishing a presence so clear and widely understood that people choose to embed it into their decision rules. The brands that endure will be those that stand for something concrete enough to be deliberately included in the instructions given to machines.

That keeps Corriveau’s argument from becoming a pure optimization story. He isn’t replacing human values with machine logic; he is moving the expression of those values upstream. The shelf moment gets quieter because the proxy has already filtered the options. Brand becomes less about a burst of attention and more about operational consistency the customer can delegate with confidence.

For designers and brand teams, the practical consequence is simple: brand claims need proof a system can read and compare. The work doesn’t stop at making a company memorable; it has to make the company’s promises observable, consistent, and legible before a human sees the options. The artifact isn’t only the campaign anymore; it’s the evidence trail behind the campaign.
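A toy version of that delegated decision might look like the sketch below. Everything in it is hypothetical: the provider names, the fields, the thresholds, and the ranking rule are mine, chosen to mirror the criteria Corriveau lists (service ratings, data breaches, price, delivery time). It is not a description of how any real assistant chooses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # Hypothetical machine-readable signals about a delivery service.
    name: str
    delivery_minutes: int
    service_rating: float  # 0 to 5
    price: float
    had_data_breach: bool


def choose(providers, min_rating, exclude_breached):
    """Apply the user's rules first, then rank what is left by price and speed."""
    eligible = [
        p for p in providers
        if p.service_rating >= min_rating
        and not (exclude_breached and p.had_data_breach)
    ]
    return sorted(eligible, key=lambda p: (p.price, p.delivery_minutes))


# Made-up data.
providers = [
    Provider("Alpha Pizza", 30, 4.6, 14.0, False),
    Provider("Bravo Pizza", 25, 4.8, 12.5, True),
    Provider("Charlie Pizza", 40, 3.9, 11.0, False),
]

ranked = choose(providers, min_rating=4.5, exclude_breached=True)
print([p.name for p in ranked])
```

Notice what never appears in the function: logo, tone of voice, campaign. The cheapest and fastest option drops out entirely once the user's breach rule applies, which is the "evidence trail" doing the work that awareness used to do.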

The biggest risk sits in the middle. The brands whose differentiation relies primarily on communication rather than capability. Agentic AI will expose decorative branding with uncomfortable clarity. If your distinctiveness lives in marketing but not in service, performance or trust, optimisation will reduce your value to price alone.

For branding professionals, this is not a minor adjustment; it is a structural reframing. The future will rely less on megaphones and more on architecture. We must move away from simply generating awareness toward establishing qualities so credible and so consistent that they influence how customers configure their digital proxies. The question is no longer just how to be noticed, but how to be retrievable, recommendable, and selectable inside AI-driven systems.

This is where Generative Engine Optimization (GEO) begins to matter. Brands will need to think less in terms of impressions and more in terms of machine-readable signals of trust, performance, and relevance – the inputs that shape whether an AI system even considers them in a ranked set of options. Practically, this means building brand equity in ways that can be consistently interpreted by both humans and machines: structured proof of service quality, transparent value signals, strong third-party validation, and behavioural consistency over time.

Abstract digital visualization representing AI-driven consumer decision processes and brand filtering.

AI Doesn’t Care About Your Latest Campaign

AI agents are reshaping consumer decisions. What it takes for brands to stay relevant as algorithms drive choice.

thesubtext.online

Thirty-year veteran software engineer Christoph Mütze shipped a 25,000-parameter transformer that runs on a stock Commodore 64, complete with an exhaustive test harness and a stack of reference implementations that all have to agree before anything ships. He called it SoulPlayer. In return he got called a vibecoder. Same reflex as the Monet pile-on: label first, verdict next, evidence optional. His response is the takedown of the “vibecoded slop” accusation I’d been waiting for somebody to write, and it lands on a single question that nobody on the accusing side wants to answer:

If vibecoding is what you say it is, if AI does the hard part, if the human just prompts and ships, if expertise is no longer a moat, then the world should be drowning in proper software right now. Not slop. Real tools. The kind people pay for, depend on, use every day. Two years of access. Millions of people with the models. The barrier supposedly fell. …where is everything?

David Pierce catalogued the bespoke micro-apps people are building for themselves: family budget trackers, fantasy baseball rank engines, migration logs with a total addressable market of one. That’s real, and it’s the right scale to celebrate. But Mütze is asking a different question: where is the vibecoded Photoshop? Where is the vibecoded Maya, the vibecoded Blender, the vibecoded compiler that compiles itself? If the prompt-and-ship cartoon were true, two years in we’d have an avalanche of sophisticated tools built by people who don’t know how to code. We don’t. The category is empty. Mütze’s diagnosis of why is the part I want every designer reading this to take in:

Level 1 is what the industry usually calls coding. The syntax, the loops, the years memorizing pointer arithmetic and which header file the function lives in. LeetCode-measurable. The job interview essence. The mechanical part. The typing.

Level 2 is flow. What you do with Level 1. Knowing the right data structure. Knowing which ugly pragmatic solution to ship instead of the beautiful academic one. Reading other people’s code. Taste and judgment. The reflex of rejecting solutions that almost work and shipping the ones that do. Debugging, unit testing, the quality-control part.

Level 3 is architecture. The macro decisions, made with full awareness of their consequences. What to build at all. Why this data structure and not that one. Why this trade-off and not the obvious one. Which design survives contact with the real world, and which one silently falls apart two years later. The deciding part.

The three have never been the same thing. The gate was never at Level 1. The gate was at Levels 2 and 3, where the work that holds together actually happens. AI lowered the cost of Level 1. It didn’t really touch Levels 2 or 3. The gate is exactly where it always was.

You can easily translate this framework from engineering to design. Level 1 in design is pushing pixels: the auto-layout setup, the icon nudging, the variant-matrix work in Figma that fills our days. Level 2 is the taste that picks which of the fifteen generated directions is actually worth shipping. Level 3 is deciding what to build at all, and for whom. AI is eating Level 1 in design the same way it has eaten Level 1 in code. The designers who panic about “vibecoded design” are panicking because Level 1 was the layer they could see, measure, and defend. The gate is somewhere else, and it always was.

The reason this gets so emotional is the part Christopher Butler has been pointing at for a while: AI doesn’t just replace tools, it renegotiates what made you worth hiring. Mütze says the same thing:

The accusers cannot see this. They are not at the gate. They were at Level 1. Level 1 was their identity, their hours, their proof of belonging, their reason to feel at home in this profession. When AI made Level 1 cheap, it did not threaten the gate. It threatened them. Because they bet their self-worth on the layer that just got rented out. So they call the work vibecoded. They have to.

Mütze could weaponize the accusation back. He has the receipts: the test harness, the reference implementations, thirty years on the demoscene. He refuses, and ends with a call to action:

If you’ve been sitting on something you made with AI, ship it. Name your tools. Don’t apologize. The accusation is cheaper than the work. Yours is worth more.

Hero image from Indiepixel's essay asking where the vibecoded Photoshops are.

Where are the vibecoded Photoshops?

If vibecoding is what people say it is, the world should be drowning in vibecoded artifacts right now. Two years of access. Millions of people with the tools. The barrier supposedly fell. So where is everything?

indiepixel.de

David Pierce, writing for The Verge, dates the inflection precisely: late 2025, when an update to Claude Code crossed the line from “surprising when it worked” to “surprising when it didn’t.” That’s the moment vibe coding stopped being a demo and started being a tool ordinary people could actually use.

In late 2025, an update to Anthropic’s Claude model turned its Claude Code tool from a code generator that was surprising if it worked to one that was surprising when it didn’t. Suddenly, all you needed was $20 a month and a half-formed idea, and an AI model could build you functional software. If you could explain what wasn’t working, Claude Code could probably fix it. Andrej Karpathy, an educator and researcher who was on OpenAI’s founding team, had called this new behavior “vibe coding.” Suddenly the vibes were off the charts.

The reliability threshold matters more than the headline number. Twenty dollars a month was already true. What changed is that the output stopped breaking when you asked it to do something real. That’s what made the personal software lineage—from HyperCard in 1987 through Lee Robinson’s essay, the home-cooked-app idea, micro-apps, and fleeting apps—turn from a niche aesthetic into something a normal person could actually do over a weekend.

Pierce documents his own version of that weekend: building Timetable, abandoning it, building Spring and forgetting what it did, getting stuck on Twilio bills. The realization that pulls him out of the loop is the one that’s worth dwelling on:

What saved my efforts was the realization that personal software doesn’t have to be built from scratch. Knowledgeable developers might be newly capable home cooks, but the rest of us are more like customers at Chipotle. We don’t make the food, we don’t even really assemble it, but we get to decide what goes where and how it’s served to us. For most of us, the future of software is not building our own Excel from scratch, it’s using the models to build spreadsheets wildly more capable than we could create ourselves. It’s building the Chrome extension for your favorite app that is really only missing a Chrome extension. It’s tweaking the way things look to suit your exact taste and needs.

Most coverage of vibe coding implies the future is everyone becoming a one-person engineering team. Pierce’s actual claim is narrower and more useful: like ordering a Chipotle burrito, you’re picking ingredients and toppings, not running the kitchen. The point is not to replace Notion or Obsidian or Todoist. It’s to bend them an inch closer to how you actually work.

My own publishing workflow for this blog has moved from Payload CMS to a custom admin UI and now to a custom Obsidian plugin.

Which brings the conversation to where Pierce lands it: taste.

In this new world, the most important thing you’ll need is taste. Not objectively good taste, necessarily, so much as a keen sense of your own. You need to be like Rick Rubin, the famous music producer, who once told 60 Minutes that what made him successful was not any particular technical ability, but “the confidence I have in my taste, and my ability to express what I feel.” Rubin practices that art with A-list celebrities; you need to be able to do it with AI. Otherwise, you’ll land in what Lovin calls “doom loops,” telling your chatbot only what you don’t like and counting on the model to be the creative one. That way lies madness — and bad software.

Yan Liu’s working definition of taste cites the same Rubin formula—sensitivity times standards—and that’s the part of Pierce’s argument that designers should sit with. The $20 vibe-coder has the tool. What they often don’t have is the trained eye to know when Claude’s purple gradient is wrong, or why the icon looks like a butthole instead of a planner. Pierce learned this the hard way and concluded, sensibly, that he didn’t have opinions about databases but did have opinions about typefaces. That’s the right diagnosis. It also undersells what designers actually do—Raj Nandan Sharma’s warning about taste-as-end-of-pipeline selection is the other half of this. If designers don’t show up as authors here—shaping what gets generated, not just thumbs-upping it after the fact—the personal software era will produce a lot of bespoke purple gradients and not much else.

Illustrated hero image for The Verge's feature on the personal software revolution and vibe coding.

Welcome to the personal software revolution

AI is empowering a generation of vibe coders to build exactly what they want. The personal software revolution is here.

theverge.com

An X user posted a painting from Claude Monet’s Water Lilies series, labeled it as AI-generated, and asked the timeline to explain what made it inferior to the real thing. Michael Zhang, writing for PetaPixel, collected what came back. Critics produced confident, formally-worded takedowns of an actual masterpiece:

“I’m disappointed I have to even point it out. There is no cohesion to the depth and color choices. The reflection of the tree bleeds into the lilypads with no regard for spatial depth or contrast. The background lilypad-algae amalgam is egregiously vague, like most AI art.”

The reflections are noise. The composition has no focal point. The lily pads look drawn on. Reply after reply, in vocabulary borrowed from art-school crit, explaining why a Monet is not a Monet.

The article ties the prank to research published in Scientific Reports, a Nature Portfolio journal, in 2024 by Simone Grassini and Mika Koivisto:

“Participants were unable to consistently distinguish between human and AI-created images. Furthermore, despite generally preferring the AI-generated artworks over human-made ones, the participants displayed a negative bias against AI-generated artworks when subjective perception of source attribution was considered, thus rating as less preferable the artworks perceived more as AI-generated, independently on their true source.”

The finding lands the experiment: source attribution does the work, not vision. Tell people the image is AI and the same image becomes worse. The technical vocabulary arrives to justify a judgment that was already made.

Viewers don’t even need the prompt. They’ll supply the label themselves: parts of Lady Gaga’s Tim Burton-directed Dead Dance video struck people as AI because the imagery looked odd, and the slop critiques followed.

This is what Christopher Butler called the reactionary red-lining of AI—drawing hard lines against a category of work and then reverse-engineering the reasons. The Monet experiment is the same bias caught in the act, just running aesthetically instead of ethically.

Social card overlaying a Monet water lily painting beside the X post asking critics to explain why the AI image is inferior.

Someone Shared a Real Monet Painting as AI and Asked for Critiques

Someone shared a real Monet painting as an AI image and asked for critiques as to why AI art is inferior to the real thing. Hilarity ensued.

petapixel.com

Gess Puglielli, writing on LinkedIn, argues that the speed of AI interface generation has revealed something other than a new tool. It has revealed that a lot of companies were never working from a real definition of design:

But interfaces were never the real value of design. They were just the artefacts left behind. The output. The visible layer of a much deeper process involving human behaviour, systems thinking, psychology, usability, strategy, communication, emotion, culture and invention. Design was never about moving pixels around a canvas. Design is how humans shape the world around them.

Jakob Nielsen made an adjacent argument about the shift from artifact production to intent shaping. Puglielli is pointing at something sharper. Nielsen describes a shift in what designers do; Puglielli says the shift has exposed a category of companies that mistook the artifact for the work in the first place.

The diagnostic part is what stayed with me:

In many organisations, designers were already being treated like production software long before generative AI arrived. The process often looked something like this: Product defines requirements. Engineering defines constraints. Leadership defines strategy. Then design is invited in to “make it look good.” At that point, the designer has already been removed from the act of designing. They’ve become decorators of predetermined decisions.

This is what makes “AI replaced our designers” make sense inside certain rooms and sound absurd inside others. If your design function had already been narrowed to ticket-taking execution, AI can replicate execution. Karri Saarinen pointed at the same misunderstanding when he wrote that the hard part of design is understanding the problem well enough to know what should exist at all. Puglielli’s contribution is the corollary: the companies that don’t know that won’t notice it’s missing when it gets cut.

Puglielli then lists what AI isn’t good at:

AI can generate screens. It cannot independently define meaningful problems worth solving. It cannot deeply understand cultural nuance, emotional context or human contradiction in the way experienced designers can. It cannot navigate organisational politics, align competing stakeholder priorities, recognise ethical implications or identify latent human needs before users themselves can articulate them.

Most importantly, it cannot care. And care matters more than the industry likes to admit.

Care is the right word for designers and a weak word for industry, because businesses don’t pay for care. They pay for the outputs care produces—taste, the ability to see a problem before it’s named, and the thing we call judgment.

LinkedIn article cover image for Gess Puglielli's essay on AI exposing companies that never understood design.

If AI Can Replace Your Designers, You Never Understood Design

We’ve reached a strange moment in tech where generating an interface in 12 seconds has convinced an entire industry that design was never more than arranging rectangles on a screen.

linkedin.com

The headline says it all: “Uber president says AI spending is getting ‘harder to justify.’”

Jess Weatherbed, writing in The Verge:

After reportedly exhausting its annual AI budget just four months into 2026, Uber is now questioning whether it’s actually seeing meaningful returns on its investments. In an interview with Rapid Response, Uber president and chief operating officer Andrew Macdonald said the company isn’t seeing a connection between rising token consumption for Claude Code and more useful features being delivered to consumers.

“That link is not there yet, right? I think maybe implicitly there is more that is getting shipped, but it’s very hard to draw a line between one of those stats and, ‘Okay, now we’re actually producing 25 percent more useful consumer features,’” said Macdonald. “I think over the coming quarters and years, maybe that will become clearer, but I think today it’s hard, even if some of the underlying metrics are trending in a really astronomical direction.”

Two quick thoughts. First, engineering—and by extension, product and design—velocity gains like 2x, 3x, or 10x show up in the output. They aren’t showing up directly in the outcomes. Getting to a design faster doesn’t mean you designed the right thing.

Second, we haven’t redesigned the factory floor yet. It’s a metaphor I’m borrowing from Tommy Geoco. When factories converted from steam power to electricity in the 1880s, they swapped out the engines and did nothing else. The floor plan and workflow didn’t change. For three decades, output barely moved. Only when companies redesigned their factories and processes around the new technology did they see an increase in output.

We haven’t quite figured this out as an industry or discipline yet. As I’ve written previously, it’s foggy but the shape is unmistakable. The answer is out there.

A man wearing a lapel microphone speaks animatedly on a conference stage, gesturing with both hands against a blue and green lit backdrop.

Uber president says AI spending is getting ‘harder to justify’

There’s no clear connection between AI usage and productivity.

theverge.com

Addy Osmani makes a clean separation that most of the “is AI making us dumber” discourse keeps glossing over. He reports on Anthropic’s randomized trial of engineers learning a new Python library:

Engineers who used AI to ask conceptual questions scored above 65%. Engineers who copy-pasted the generated code scored under 40%. The tool didn’t determine the outcome. The posture did.

Osmani is writing for engineers, but most of that translates to designers picking up Figma Make, Lovable, or v0. Ship-without-comprehension scales beautifully right up until the moment you have to debug, redesign, or defend a choice you didn’t really make.

He ends on a ritual any designer can adopt verbatim:

I’ve started ending coding sessions with a simple question: did I learn anything today, or did I just close tickets? Sometimes the honest answer is “I just closed issues” and that’s fine. If it becomes the answer for months in a row, cognitive debt is accumulating in the background. Ship and learn are two separate metrics.

Workslop is the companion failure mode: its cost lands on your coworkers, whereas skipped learning costs your future self.

Hero image from Addy Osmani's post about not outsourcing the learning when coding with AI.

Don’t Outsource the Learning

Right now, it’s too easy to let AI write the code while you skip the learning. The bug gets fixed. Your mental model doesn’t move. We are silently trading future capability for present-day speed.

addyosmani.com

The artifact-to-intent argument has been working its way through design writing for a while now. What Jakob Nielsen adds to it, writing in UX Tigers, is a name for the failure mode that comes with the territory:

We used to accumulate design debt when teams shipped inconsistent components or patched over poor flows. Now we will accumulate intent debt: undocumented assumptions, vague brand guidance, missing escalation rules, untested agent permissions, and research insights that never become usable by the systems doing the work. Intent debt will be harder to see than visual inconsistency, but it will be more damaging because it compounds invisibly through every generated output.

Nielsen’s prior writing on intent-based UX argued that evaluation has become the new bottleneck for the user. A chat completes the task in seconds, and you spend the next half hour checking whether it actually did what you meant. Intent debt extends that bottleneck to the organization. The team ships ten variants in an afternoon, and nobody can tell which ones violated a brand rule that was never written down, or bypassed an escalation path that only lived in a senior designer’s head.

Nielsen puts the failure plainly:

The new danger is that AI will produce many adequate screens that all seem defensible in isolation and incoherent in aggregate. Mediocrity will arrive well-dressed. The designer’s role is to prevent the organization from drowning in plausible options.

Which is why the design system has to grow up:

The design system thus stops being a component library and becomes an operating system for taste. Tokens, components, and usage rules are only the visible layer. Underneath must be a deeper set of instructions about brand behavior, interaction philosophy, accessibility standards, motion logic, content tone, escalation patterns, and product judgment. The system must know not only which button to use, but when not to add a button at all.

Developer Mark Anthony Cianfrani has argued that LLMs finally let us ship the reasoning behind a token alongside the token. Nielsen draws the consequence of skipping that work: a weak design system in the AI era becomes an active liability. Agents will faithfully build with whatever’s encoded, and faithfully invent the rest.

AI-generated hero image for Nielsen's UX Tigers post on design shifting from artifact production to intent shaping.

Design Changing from Artifact-Production to Intent-Shaping

AI is changing the object of design itself. The UX profession’s most valuable contribution stops being UI production and becomes the design of intent: defining what good means, encoding judgment into live systems.

uxtigers.com

Dan Shipper, CEO of Every, has spent the last few years running his company as an early-adopter lab for AI tooling. The report from inside is that aggressive automation has not shrunk the team. The work has changed shape, but the volume of expert human work has gone up, not down. The reason, Shipper argues, is structural.

Slop is not any one particular mistake. It is not the use of em dashes, or a certain sentence rhythm, or purple accents on a landing page. Slop is visible sameness, repeated ad nauseam. It is what gets produced by default when humans in many different circumstances use the same tool, trained on the same corpus, without thinking too hard. It is what happens when everyone has access to an expert who has the same default tendencies. When someone in operations can issue a pull request, marketers can create YouTube thumbnails in seconds, or engineers are writing product guides, it’s easy to end up in a world where your output has gone up—but the quality, coherence, and differentiation of what you’re producing has dropped.

Sameness as the failure mode lines up with what BetterUp Labs researcher Kate Niederhoffer and her co-authors named workslop: AI-polished output that shifts the burden of judgment downstream onto whoever has to interpret, correct, or redo it. Shipper’s contribution is to follow the mechanism one more turn: once everyone is producing the same default output, the work that doesn’t look like the default becomes the scarce thing. Difference becomes the new status game, and difference has to come from a human who is alive to this moment, this customer, this codebase, this conversation.

The second half pushes the same logic up to AGI. Shipper on why even AGI doesn’t escape the loop:

In any hypothetical AGI built by any of the major labs, there is still going to be a framer—a human—directing the model to achieve a goal. And because the frame is not the framer, we’ll see the same pattern repeat: AI turns yesterday’s framed competence into something cheap; people use that cheap competence in more places; the results become abundant; experts move to the edge to decide what matters now; their judgment creates the next frame; and then the model climbs that, too.

At the end, Shipper drops the analytical voice and writes:

The race is over. You can almost feel your muscles beginning to atrophy, useless in the face of this mechanical copy of you and everyone you’ve ever met, of the whole of humanity. A ghost chasing a ghost, and winning. But then something strange happens. The model turns to you. Your cursor blinks, off and on, in the blank text box, expectantly. Waiting.

Social banner from Every magazine for Dan Shipper's article After Automation.

After Automation

Dan Shipper ran Every as an early-adopter lab for AI tooling. His report from inside: aggressive automation didn’t shrink the team. The work changed shape, and the volume of expert human work went up.

every.to

Michael Riddering brings Tommy Geoco on Dive Club fresh off field visits to Vercel, Perplexity, Metalab, Ramp, and Snowflake. Geoco and his team are making a documentary after roughly 200 conversations with designers and design leaders this year. The survey finding he leads with is the one I would have least expected: designers who have moved more of their work into AI-assisted prototyping are also more satisfied with their workflows. The hierarchy of who is actually doing that work is the part worth sitting with:

The number one thing that stood out to me was that designers who are currently vibe coding are more satisfied with their workflows. […] And I did not expect that. […] People seem to dig it in this survey. […] It’s the people who are currently doing the majority of their workflow on vibe coding activities. It’s design engineers. That makes sense. Lead principals. [After that] it’s non-designer roles, which might be students and researchers. Then it’s managers. And then it’s your general junior mid-level IC. And that part was fascinating that managers are doing more than junior and mid-level ICs. Either things are trickling down and people are experimenting and then they’re going to pass learnings down, which is kind of what we’ve seen on location. But it also might mean that like some managers or teams haven’t yet made room for the rest of the team.

Design engineers and leads at the top is unsurprising. Managers above juniors and mid-levels is the inversion, and it is basically unchanged from two years ago, when Geoco’s 2024 survey found the same thing.

Leadership-IC Divide. Leaders adopt AI at a higher rate (29.0%) than ICs (19.9%)

So what’s the read? Geoco gives it the generous read first—learnings cascading down—and then concedes the other possibility: some teams haven’t made room for the rest of the team. Riddering puts it more bluntly: “I’m looking at a bunch of junior and mid designers that are getting cut out of the process.”

The other finding is that 59% of designers have built their own tool for their workflow. The example Geoco brings back from Vercel makes the builder-mode shift concrete:

When I went over to Vercel, they had this brand designer, who had never coded before. And now was vibe coding a tool. Their marketing team would put out blog posts. And they were like, “Why does the design team need to create the OG blog post cards for every page? That’s not a good use of [their time].” So he built a tool that just allowed them to insert any sort of images. And it just already had all of the branding and the sizing baked in. And they just roll these [tools] out quickly. And I’m like, that just became a tool, an internal tool. That’s cool. And so because it was really interesting that they started referring to him as a brand engineer… And I’m like, okay, that kind of qualifies it actually.

A designer who had never coded solves an actual marketing-team problem, ships the tool, and the role title arrives after the work. That is how the next batch of “blank engineer” titles is going to land. Riddering then describes how the orchestrator pattern plays out in his own day-to-day. It is a concrete account, from a working designer, of the workflow I have been writing about as orchestration:

Part of me is almost slightly self-conscious about it. But I do the vast, vast majority of my messy explorations with AI now. I feel like I have made the jump to the quote unquote creative director where I’m just working with AI to show me a certain thing 50 different ways. And then I’m pulling the pieces that I like and then combining them again. And finally I get to somewhere where I’m like, yep, that’s good. And then I take that from Paper, run it through Claude Code, and now it exists on localhost. And then I will sweat the details and actually do the precision designing in code, which is, that’s crazy, man. That’s a very, very different workflow than I’ve done at any point in my career.

The orchestrator gap is opening where I thought it would. What I did not account for is who is getting invited into that work first. The data Geoco surfaces points to leads, managers, and design engineers getting more chances to build with AI than junior and mid-level ICs.

Here’s a hypothesis I’ll put out there: leads are more used to directing. I’m personally comfortable orchestrating and being the editor because I’ve been a creative director and leader for so long. The loop is right there: frame, review, direct.

Tommy Geoco - The state of the design industry right now

Tommy Geoco has been visiting today’s top design teams—Vercel, Perplexity, Metalab, Ramp—to study how their workflows are changing with AI. He joins Dive Club to share what he’s learned.

youtube.com

After watching six agents design an app together in Pencil and spending a little time in Paper, I’ve been waiting for Figma to answer. Rodrigo Davies and Tammy Taabassum, writing on the Figma blog, finally announce it: a native design agent on the canvas, not bolted on through a separate app or a third-party MCP client.

Davies and Taabassum open with the pitch:

Designers need purpose-built tools that serve the essentials: exploration, experimentation, collaboration, and precision. Figma was built as a multiplayer canvas to make all of that possible. As teams adopt agentic tools to build products more quickly, false choices are emerging: Speed or precision? AI generation or direct manipulation? You shouldn’t have to choose.

Earlier this year Figma opened the canvas to third-party agents through its MCP server, letting Claude Code, Codex, and other agents push designs into a Figma file. That move covered the integration story. This one covers the in-app story:

That’s why we built the Figma agent. Our goal was to create an agent fluent in Figma and native to the way teams work. That meant making Figma itself legible to a model in ways that aren’t possible with third-party tools—with deep context on your components, tokens, standards, and best practices.

A third-party agent reaching in through MCP has to translate every request through a protocol; a native agent already speaks the file format. It knows your components, your tokens, your variables. That’s the gap between an agent that can edit a Figma file and an agent that lives in one.

The use cases Davies and Taabassum walk through—going wide on style explorations, bulk-updating variables across a design system, distilling comment threads into actionable plans—are the work designers were already paying the tax on. On exploration specifically:

The best designs rarely come from the first idea—or the first prompt. Exploring directions, comparing approaches, and iterating is already core to how designers work. Our agent will help you cover more ground in less time.

Renaming variables across a file, repeating padding changes through an entire flow, swapping one component for another across a dozen screens: that’s the busywork the agent is perfect for. The taste call on which direction to ship stays with the designer.

Davies and Taabassum close with:

Figma’s agent is embedded where the work already happens. There’s no toggle tax, no context switching, no learning curve. You stay in Figma and your team stays in the loop. We built this with one goal: to help you work faster without compromising on quality and craft.

That’s the competitive answer to Paper and Pencil. Agent-native canvases get a head start by not carrying any legacy assumptions; Figma carries millions of files and the design systems inside them. The bet is that the install base plus a fluent-in-Figma agent beats a greenfield canvas plus a generic one. We’ll see who’s right once the beta opens up.

Hero image from Figma's blog announcing the new on-canvas design agent.

The Figma Design Agent is Here

Starting today, work with an agent that is built for Figma—directly on the canvas.

figma.com

Last week I linked Ravi Mehta on the three layers of context engineering for AI prototyping: functional spec, visual wireframe, structured data. Karo Zieminski, an AI PM writing Product with Attitude, makes the same case at product scale and cites Mehta directly. Mehta wrote about prototyping one screen; Zieminski writes about designing the whole product around an agent.

Zieminski puts it in one line:

Prompt engineering is deciding what and how to ask the model. Context engineering is deciding what the model knows when it answers.

Then the asymmetry:

A well-crafted prompt in a poorly engineered context still fails. A poorly crafted prompt in a well-engineered context often succeeds.

That asymmetry is the argument for treating context as the underlying system.

If that asymmetry is real—and a year of using these tools tells me it is—then most teams are still optimizing the wrong layer. The visible artifact is the prompt. The work that actually decides the output is everything around it.

The piece I want to underline is who owns the work:

PMs define what goes in each context layer. Engineers build the infrastructure to fetch and store it… If the PM isn’t doing this, one of two things happens. Either an engineer makes the product decision by default, or nobody makes it and the agent gets every available signal dumped into the window.

Zieminski calls the alternative abdication. I think she’s right and I also think most PM job descriptions in 2026 haven’t caught up. The hiring filter still selects for ticket-shaping and roadmap maintenance, not for “decide what the model should know about the user, what should age out, what should never get re-fetched.” Those are product decisions about how memory is organized, and the people best positioned to make them—PMs who understand the product and the user—are often the ones least equipped to talk about retrieval and eviction. The gap is one of vocabulary and authority.

Both write for PMs, but the work is also design work. The context an agent sees is a designed surface: what gets included, what gets hidden, what should age out, what should persist between sessions. Mehta’s three-layer brief—spec, wireframe, JSON, twenty minutes in Figma, real data—is daily prototyping for designers working with agents now. Zieminski’s architecture is the system those prototypes live inside. If designers don’t show up here, PMs and engineers will design this surface for us.
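To make "designed surface" concrete, here is a minimal sketch of those four decisions (include, hide, age out, persist) as plain data. Everything in it is my own illustration, not Zieminski's or Mehta's architecture: the `ContextItem` fields, the `build_context` function, and the character budget are all assumed names and choices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextItem:
    """One piece of information the agent might see. All field names are illustrative."""
    key: str
    text: str
    created_at: float                    # seconds, on whatever clock the caller uses
    ttl_seconds: Optional[float] = None  # None means the item persists and never ages out
    visible: bool = True                 # hidden items are stored but never shown to the model


def build_context(items, now, max_chars):
    """Return what the model sees: visible, unexpired items, newest first, within a budget."""
    live = [
        item for item in items
        if item.visible
        and (item.ttl_seconds is None or now - item.created_at <= item.ttl_seconds)
    ]
    live.sort(key=lambda item: item.created_at, reverse=True)

    chosen, used = [], 0
    for item in live:
        if used + len(item.text) > max_chars:
            continue  # skip what does not fit rather than truncating mid-item
        chosen.append(item)
        used += len(item.text)
    return "\n".join(f"[{item.key}] {item.text}" for item in chosen)


# Hypothetical items: one persistent, one fresh, one stale, one deliberately hidden.
items = [
    ContextItem("brand_voice", "Plain, direct, no exclamation marks.", created_at=0),
    ContextItem("last_session", "User was editing the checkout flow.", created_at=900, ttl_seconds=3600),
    ContextItem("stale_search", "User searched for 'refund policy'.", created_at=100, ttl_seconds=600),
    ContextItem("internal_score", "Churn risk: high.", created_at=950, visible=False),
]

print(build_context(items, now=1000, max_chars=200))
```

The code is trivial on purpose. The product and design decisions are the `ttl_seconds` and `visible` values, and somebody has to own them.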

Illustrated header for Karo Zieminski's Product with Attitude essay on context engineering for AI PMs.

An Illustrated Guide to Context Engineering, Prompt Engineering, and The Future of Both

Karo Zieminski, an AI PM writing Product with Attitude, draws the line between prompt engineering (what you ask) and context engineering (what the model knows when it answers). She argues PMs—not engineers—own the context architecture.

karozieminski.substack.com

My terminal setup these days is cmux layered on top of Ghostty, so I can run multiple workspaces side by side without losing my place. Most of my actual work happens there now; the terminal is my primary surface.

MC Dean spent real time building UIs for her designpowers agents, looked at what she’d made, and tore them down:

I’d taken something that was direct, alive in a particular way, that would show you its thinking if you let it, and I’d dressed it up. Made it presentable. Wrapped its reasoning inside a crisp ui. In doing that I’d introduced distance between the person using it and the thing they were actually talking to and trying to collaborate with. I’d made it more comfortable, predictable and less true. I killed the delight of using designpowers.

Dean on why she killed it:

The GUI was scaffolding. Brilliant, necessary, world-changing as an innovation, but nonetheless created around a fundamental limitation so invisibly useful that we forgot what it was for.

The scaffolding was for humans who couldn’t speak the machine’s language. When the machine starts speaking ours—even partially, even just inside certain conversations—the scaffolding becomes friction for anyone trying to learn how the underlying system actually thinks.

Dean then connects this back to design craft, instead of letting it become a “designers should learn the terminal” finger-wag:

Designers are really good at this important way of thinking already. Every time you decided how an error message should make a person feel. Every time you chose what to surface and what to hide. Every time you designed for the person who didn’t fit the assumed user, you were encoding values into a system. That work doesn’t go away when the interface does. It moves upstream, into the reasoning layer itself, and it has no canvas. That’s design literacy for the world that’s coming.

Dean’s argument pairs with the other end of the same expansion: designers gaining authorship downstream in the code. The easing curve and the hover state at one end; what the agent gets to surface and who counts as the assumed user at the other. The reasoning layer is the upstream end without a canvas to hide behind.

Nick Babich asks if the old surface still earns its place. Dean is pointing at where the new surface already is.

Dean closes with:

We are early enough that the agents are still legible. You can still watch one think. You can still feel the shape of how it reasons before it’s been wrapped in chrome and shipped with a logo.

That window is why I bother with cmux at all. Ghostty by itself is already a genuine pleasure to use; cmux lets me keep a Claude Code session running per project without losing context when I switch. Just enough plumbing to make watching agents think a habit rather than a stunt. Dean’s right that the legibility won’t last. Worth being here while it does.

Hero image for MC Dean's Substack essay on stripping the UI off her AI agents.

The Terminal Belongs to Designers Too

MC Dean built beautiful UIs for her AI agents, then looked at what she’d made and took them all down. The GUI was scaffolding. The terminal lets you watch the agent think. Design craft moves upstream into the reasoning layer, where it has no canvas.

marieclairedean.substack.com

Most agent-velocity hype rests on one premise: that writing code was the slow part. .txt, the team behind the structured-generation library, takes a saw to that assumption. The point goes back to two foundational software-engineering texts—Fred Brooks’s The Mythical Man-Month (1975) and Gerald Weinberg’s The Psychology of Computer Programming (1971)—and .txt puts it like this:

Software is what’s left over after a group of humans finishes negotiating with each other about what the system should do. The code matters, but it is the residue of the harder work, not the work itself.

Code as residue. That inversion reorganizes the whole conversation. The tools and processes we’ve built around software for fifty years—IDEs, wireframes, mockups, code review, even pair programming—have been about lowering the cost of producing the residue. Once that cost approaches zero, what’s left to slow you down is the negotiation underneath. And that negotiation has not gotten any cheaper.

What that layer actually consists of, in practice:

What slows down a team where agents do the implementation is the production of specifications precise enough for an agent to pick up and run. Roadmap, written down. Acceptance criteria, written down. The “what we actually want” forced into precision, be it via a test suite, a ticket, or a written design.

The bottleneck moves from people writing code to people deciding what code should exist. .txt calls that work management, and I’d put it a little wider; it’s also product, design, and anyone whose job description includes the phrase “what we’re building.” A spec precise enough for an agent is a falsifiable description of the outcome, with the trade-offs already made.
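One way to picture "falsifiable, with the trade-offs already made" is acceptance criteria written as checks that an implementation either passes or fails. This is a rough sketch under my own assumptions, not anything from the .txt post: the shipping rule, the `shipping_cents` function, and the `evaluate` helper are all hypothetical.

```python
def shipping_cents(order_total_cents: int) -> int:
    """A candidate implementation, the kind of thing an agent might hand back."""
    return 0 if order_total_cents >= 5000 else 499


# Hypothetical spec: each entry pairs a plain-language criterion with a check.
# The third line is a trade-off decided up front, not left for the agent to guess.
ACCEPTANCE_CRITERIA = [
    ("orders of 50.00 or more ship free", lambda f: f(5000) == 0),
    ("orders under 50.00 pay a flat 4.99", lambda f: f(4999) == 499),
    ("an empty cart is still quoted the flat rate", lambda f: f(0) == 499),
]


def evaluate(implementation, criteria):
    """Return the description of every criterion the implementation fails."""
    return [description for description, check in criteria if not check(implementation)]


failures = evaluate(shipping_cents, ACCEPTANCE_CRITERIA)
print("accepted" if not failures else f"rejected: {failures}")
```

Writing the three lambdas is the easy part. Agreeing on the three sentences next to them is the negotiation the post is talking about.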

.txt on what runs underneath the spec:

Context is the commodity an organization runs on. It is the shared understanding of what we are building, why it matters, what has been tried, who decided what, what is load-bearing and what is vestigial. Humans on a team accrete it by osmosis. By being in the room, by reading the same Slack channel, by debugging the same outage at two in the morning. Most of it is never written down. When a senior engineer reviews a PR and says “this’ll break the migration,” they are drawing on context that has no document. Agents cannot do osmosis.

“Agents cannot do osmosis” is the line. Specs are the formal surface; context is what’s underneath, and teams absorb it without writing it down. The post closes here:

The companies that win the next decade will not necessarily have the best models or the best agent infrastructure. It will be the companies whose fifty people, then two hundred, then two thousand, can stay aligned on a shrinking set of decisions while shipping more output per head. They will be the ones that already knew, before agents arrived, that their hardest problem was coherence. That is a culture and management problem. Always has been.

Default header image for thetypicalset.com, .txt's company blog.

The bottleneck was never the code

.txt revisits Brooks and Weinberg’s old observation: software is what’s left over after humans negotiate what to build. With agents writing code cheaply, the negotiation is now the bottleneck. Coherence is the moat.

thetypicalset.com

Nick Babich, writing in UX Planet, takes inventory of where Figma still earns its place once teams stop treating the mockup as the deliverable:

One thing is clear: the conventional process in which UI and UX designers spend hours and days pushing pixels to create perfect layouts is no longer the reality for many organizations. The reason is simple: in the AI era, time-to-market has become a critical metric, and most companies would rather ship a “good enough” product quickly than spend extra time perfecting every detail.

The concept of Figma as a design tool originated from the conventional design process. You could say that Figma is an almost perfect design companion for designers who follow a traditional UI/UX workflow.

But the problem is that the conventional design process is no longer the reality for most organizations.

Organizations that embrace rapid prototyping are switching to tools that allow them to build and ship quickly. Instead of starting with static UI mockups in Figma, they jump straight into the prototyping phase using tools like Claude Code. In this phase, teams create coded prototypes that later evolve into fully functional products.

Figma’s role is narrowing from everything-tool to exploration-and-iteration tool, and narrowing is not the same as dying. Babich is now drawing the lines around what that specialized future actually looks like: design systems (especially the ones already living in Figma), complex enterprise workflows with real business logic, and the brand and visual-identity work where taste is the whole point.

On Figma Make, Babich is blunt:

But the problem is that Figma Make is still nowhere near tools like Codex or Claude Code in terms of output quality and overall user experience. Claude Code and Codex are significantly more capable, flexible, and comfortable for rapid product development workflows. Even for simple tasks like creating a prototype of design imported from Figma, Make tends to add a lot of visual defects.

I scored Figma Make 58 out of 100 at launch. It has improved since, but Babich is right about the gap. Make is competing against tools that were born for code generation against a working repo; Make was retrofitted onto a vector editor. That difference shows up in every prototype that looks fine until you zoom in.

On design systems, Babich:

In other words, you don’t necessarily need to maintain your design system in Figma; as long as you can provide access to a GitHub repository containing your design system, you’re in a good position to generate consistent interfaces.

If the design system can live in the repo and the agent can read it directly, the Figma library becomes a mirror rather than the source. That doesn’t kill the Figma file. It does change who has to maintain it and why.

Header illustration for Nick Babich's UX Planet essay on Figma's relevance in the AI design era.

Is Figma Still Relevant in the AI Design Era?

Nick Babich on Figma’s narrowing role: time-to-market killed the mockup-then-handoff workflow Figma was built for. Babich argues it doesn’t die—it specializes into design systems, complex enterprise workflows, and brand work where taste is the whole point.

uxplanet.org

Nearly nine in ten organizations now use AI in at least one business function. Ninety-four percent aren’t seeing significant value from it. Gale Robins, writing for UX Collective, argues that the gap is a framing problem, not an adoption problem. Her earlier piece on discovery judgment made the same case; the new one sharpens it with an anecdote that shows the trap:

A team I spoke with recently had compressed their discovery cycle from six weeks to ten days using AI. They were proud, and the throughput was real. When I asked what the work had taught them that they did not already believe, the answer was: not much. Same questions, faster. Same answers, sooner.

Same questions, faster. Same answers, sooner. Her analogy for the wider pattern is the electrified-factory one I’ve used before:

When factories first installed electricity, productivity barely moved. Manufacturers replaced steam engines with electric motors and kept the line-shaft layout. The breakthrough came later, when they redesigned the factory around what electricity made possible. The technology was only part of the answer.

Robins maps McKinsey’s three waves of AI value—productivity, differentiation, transaction-cost reduction—and finds most teams stuck in the first one. Robins on where they have to go to get out:

These decisions are upstream of every artifact a team produces. They are also where AI productivity gains help least, and where human judgment compounds the most.

Robins’s evidence undersells her own thesis. She leans on Generative AI at Work—the Stanford-and-MIT customer-support study by economists Erik Brynjolfsson, Danielle Li, and Lindsey Raymond that became the canonical citation for “AI helps novices most”—to argue AI raises the floor, not the ceiling. Novices gained 34%; experienced workers, basically zero. That’s why so many designers who have never coded—like me—are now suddenly shipping with this newfound superpower. It’s the same finding behind the junior designer crisis. But LinkedIn’s Full Stack Builder rollout found the opposite: top performers adopted AI fastest and got the most out of it, because they had the judgment to know what to ask for. The floor-not-ceiling story is only true where the questions are fixed. Once the questions are the work, the pattern inverts. That’s exactly the territory Robins is mapping. If AI rewards the experienced most when the work is judgment-shaped, framing is where the gap between teams widens.

Cover illustration for Gale Robins's UX Collective essay on discovery as the work AI gives back.

Discovery is the work AI gives back

Nine in ten organizations use AI. Ninety-four percent see no significant value. Gale Robins says the gap isn’t about adoption: teams use AI to do the same work faster instead of asking what’s worth building.

uxdesign.cc