123 posts tagged with “tools”

In designer and investor Soleio’s South Park Commons interview, Rasmus Andersson, creator of Inter and an early Figma designer, draws a useful distinction between software and type. Regular software decays fast; Andersson says that if he hasn’t touched something in two weeks, he usually trashes it and starts over. But a typeface can keep gaining value years later.

Andersson starts with how complex modern software is:

Today there’s just this towering complexity of making software, and for a very good reason, right? For like 15, maybe 20 years by now, our industry as a whole has been hyper-focused on scale because it’s just been an economic evolution that’s been out of this world. So it makes sense that we’ve been trading off and trading away things like shareware, concepts like that, and things like simplicity and ease for the ability to scale and for the ability to things at scale to actually work and break apart. But in the wake of all of that, which is great in so many ways, we’ve given up these things that are listed, and I feel like at least me and many people like me and who sort of are trained to build this thing, enjoy making software at a smaller scale.

That’s like a metaphor that I think can be helpful is imagine your room, apartment, house, whatever, your dwelling place. It’s probably different from the person next to you. And it’s not so that we go out and we buy the Apple apartment and all the furniture is there and everything is perfect and we can change the carpet, and that’s about it. That’s ridiculous, no one would do that. Yet that’s what people do with software.

Manufacturing software was professionalized and gentrified. The stack got so optimized for global products that it left less room for the small, personal, weird things people used to make for themselves.

Andersson’s Figma story keeps that from becoming nostalgia:

First off, something that was just amazing about working at Figma at the time was that there was just this culture of doing things differently. That was amazing. And every single individual who I worked with were just like, if that person went to start a company now and they put me with it, I’ll come. Every person was like that. Every person had an opinion about something that was really exciting.

Another thing that I think was so fascinating about how things were done at the time at Figma was this deep intention around everything and moving slow. But moving slow with many things in parallel, like staggered, right? So from the outside, Figma would ship something every month, maybe even more often than that. But internally, a person would work on one thing for one year. And sometimes at the end of that one year, it would just go into the bin and never ship. So that I think was really cool to see from the inside and then having a very different effect on the outside.

That distinction matters in an AI tooling moment where output is getting cheap. Figma looked fast from the outside because slow, deliberate work was happening underneath. Speed at the surface depends on judgment below it.

On Inter:

One thing for sure is finding the right ratio between impact and effort. I think Inter is one of those things. Sure, I’ve spent like 10 years on it, and who knows how much of my time on it, right? So there’s a lot of effort behind it. But I think it’s one of those types of efforts that has a very disproportionate impact from the effort.

[…]

I gotta say though, something that is amazing about working on typefaces is, it’s something like, it’s almost like in the field of design, you might think about cars as a unique type of thing to design because it’s both like, and architecture’s a little bit like that too. It’s not strictly design as in signage for roads, right? It’s very technical and it’s not design as in expressing myself through art, like graphic design, like a poster, it’s somewhere magically in between.

And another aspect of it that’s kind of cool is that you can put in 10 minutes here and 10 hours there and they add up over time which I don’t think is true without a software because the rate of decay of any type of regular software is very high. Two weeks is my cutoff. If I have not worked on something for two weeks I have trashed it and start over but with a typeface like that decay is extremely extremely low, in some cases zero.

Lessons from Figma, Software Decay, & the Creation of the Inter Font

Rasmus Andersson—founding designer of Spotify, early Figma designer, creator of Inter—on why most software decays fast while a typeface keeps gaining value, and what Figma’s slow, parallel craft looked like from the inside.

youtu.be

Amber Bouabdallah, writing in UX Collective, gets at the learning problem lots of designers are facing in this AI transition: the tools don’t produce one shared path toward competence.

Bouabdallah draws the line between deterministic software training and relational AI practice here:

Traditional software training works because the tools are deterministic. You learn where the buttons are, what the shortcuts do, how the system behaves when you click the thing. Mastery, in that world, converges — everyone arrives at roughly the same competence, following roughly the same path, and you can write a training deck for it. Mastery means knowing the tool’s correct use.

AI tools break that definition. Maggie Appleton — designer and anthropologist, now at GitHub Next — gave a talk in 2023 called “Squish Meets Structure” about designing products with language models, and the line from it that I love is her description of the magic-input box: it has “no affordances,” “no knobs or door handles.” The interface, she writes, “offloads a ton of cognitive labour to the user.” There is no correct use to learn. The tool meets you where you are. Which means what you bring to it — your instincts, your mental models, your accumulated taste, your willingness to iterate, your custom claude.md files — is the tool, as much as the model is.

So mastery hasn’t disappeared. It has shifted. With deterministic software, mastering the tool meant converging on its logic. With AI tools, mastering one means the opposite: learning to bend it toward your logic. Tailoring it to how you already want to work. Mastering an AI tool is the craft of making it amplify the specific strengths and experience you bring — so the work that comes out is sharper, and unmistakably yours. That kind of mastery is real, and hard-won, and worth teaching toward. It is just personal rather than universal. Divergent rather than convergent. Everyone’s version of it should look different, because everyone’s version is built out of a different person.

Her Salesforce examples keep that from becoming an abstract tool-training claim:

Six months in, Ningdan and I had designed for tool adoption and accidentally created conditions for something more intimate. Seeing each other work. Seeing the specific choices someone makes when the tool doesn’t behave, the workarounds they’ve invented, the mental models they’ve constructed to make sense of something genuinely new. A window into someone’s thinking — and into how each person was mastering these tools in a shape no one else’s would match.

And once you can see the thinking, you can see the worry too. Amanda Harris, a User Experience Architect on our team, named a tension directly in the post-mortem: “I worry that we’ll lose the exploratory aspects of finding what’s wrong with an idea by jumping so quickly into hi-fi prototyping.” That’s not resistance to new tools. That’s a designer protecting something she knows matters. Hearing it voiced — in a room where everyone is nominally learning the same things — is only possible in a setting small enough and safe enough for honest uncertainty.

The anxiety designers are feeling is a signal, not a weakness.

And Bouabdallah closes by naming the training layer her team actually designed:

The tools will keep changing. They will keep arriving faster than any module can be written, any best practice can be documented, any official curriculum can ratify. It is tempting to treat the peer layer as a bridge — something to lean on until the real training arrives. But the real training is not coming, because there is no fixed competence to train people toward. As long as the tools keep moving, the peer layer isn’t the bridge. It’s the ground.

We did design how designers master AI. We just found that mastery wasn’t what we thought it would be. Not a competence everyone arrives at, but a practice each person builds — bending a generic tool toward their own strengths, their own experience, their own way of working, until the work it helps them make is sharper and theirs.

Diagram of five seeds growing into different structures, representing personal AI mastery.

Designing how designers master AI

AI tool mastery is not a universal curriculum. It is a personal practice of bending tools around judgment, workflow, and taste.

uxdesign.cc

Apple’s developer conference WWDC kicked off on Monday with a keynote. They announced various OS improvements, including refinements to Liquid Glass, and most importantly, a revamped AI strategy.

In the days leading up to the keynote, longtime Mac journalist Jason Snell wrote about building his first Mac app with Claude Code in just a couple of hours. We’ve heard this story before. We’ve been talking about it here for months. And yet, here’s a veteran technologist who’s just now discovering Claude Code’s power and building an app.

It’s easy to get caught up in the Silicon Valley AI hype bubble and think the whole world has changed and is using AI for everything. But no, that’s not actually the case.

Snell on what the experience actually required:

The process of building the app reinforced something I’ve been thinking about for quite a while: coding is a specific skill, but it’s only one part of a much larger process. Great developers aren’t necessarily great coders, though they can be. Apps must be envisioned, their specifications defined. The act of trying to describe an app to an AI coding engine is a clarifying one. The more you describe the app, the harder your brain has to work, because it’s always more complicated than you think it’s going to be. The decisions you make determine what the app comes to be. […]

Yup, tell me about it. Tell us product builders about it! The code was never the hard part.

Where I’d push back is on the optimism around it:

We now live in an era where, if you can dream an app, you can probably build it. Especially Mac utilities. And who cares more about native Mac software than Mac users? Certainly not those companies that gave up on Mac development and focused all their energies on giant cross-platform code bases to attract venture investment and big payouts.

Snell himself calls his app “ugly and incomplete” a paragraph earlier, so “if you can dream it, you can build it” is a bit of a stretch. The gap between a thing that runs and a thing you’d ship is where the real work lives: envisioning, deciding, refining.

And it’s a reminder of where the next barrier sits. Snell ends on the tooling:

Which brings me to a final point: Apple’s development tools, most notably Xcode, are nightmarish. My developer friends are used to them, but as someone who has never really used Xcode before, I was shocked at just how deeply unintuitive it is. As in, Claude would tell me to click on things, and I would have to reply, “I have no idea what that is or where it’s supposed to be.” And I’ve been a Mac user for a long time! I’ve gotten very good at intuiting where stuff is in a Mac interface.

[…]

While AI tools have made it more possible to build apps on Apple’s platforms, the developer tools themselves are still a formidable barrier. As the definition of “developer” changes, so, too, must the definition of developer tools.

I wholeheartedly agree with Snell that Xcode is a mess. For those like me who only open it on occasion, it’s baffling that Apple developers live with such a nutty application. Compare it with the best of Apple’s first-party apps, like Keynote, Final Cut, or even Numbers, and Xcode is just…bizarre.

Apple did announce something interesting at WWDC 2026, something that nods to where they could go if they wanted to: users can ask Siri AI to vibecode Shortcuts and Safari extensions. We’ll have to see if that’s the seed for something.

Packed outdoor Apple event audience facing a large stage screen displaying the colorful Apple logo, with a presenter standing at a podium beneath a white canopy.

Road to WWDC 2026: What’s a developer?

Tim Cook and Craig Federighi at WWDC 2024. Next week is WWDC, which has always represented Apple’s connection to its community of third-party developers, and in recent years has also served a…

sixcolors.com

Arpan Patel wrote a nice consolidated Claude Code reference: the directory layout, CLAUDE.md the way Anthropic’s Boris Cherny writes it, skills, subagents, MCPs, the underused commands. The whole guide turns on one shift:

Claude Code clicked for me once I quit treating it like ChatGPT in a terminal. The mental model flipped from “I need to write this code” to “I need to set Claude up to write this code well.” Setup is the work. Execution is verification.

If you use Claude Code daily, bookmark it.

Screenshot of the article page at arps18.github.io.

Beyond the Prompt: Claude Code

A field guide to using Claude Code as an agent, not a chatbot: the .claude directory, CLAUDE.md, skills, subagents, and the verification loops that make delegation work.

arps18.github.io

Emma Webster, writing for Figma, argues that AI tools are pulling prototyping earlier in the product process: teams can validate with more fidelity, then carry design context forward instead of recreating it at handoff.

Product teams are rapidly adapting to the new way of working in the AI era. They’re prototyping before writing specs, testing in code before designing, exploring at unprecedented scale, and shipping with design system context that used to get lost in the handoff. We talked to product builders at FloQast, Merkle, Affirm, and Accor about how that’s playing out in practice.

The shift is about when the hard questions show up. Specs used to be the artifact that let teams pretend they had alignment. The team behind Claude Design skipped the PRD entirely and prototyped its way to the answer instead. With AI tools, a prototype can become the first serious question: does this flow hold up when the data, logic, motion, and system constraints are present?

Webster describes the code-to-canvas loop this way:

Testing an idea against intricate constraints—things like multi-step flows where one action triggers the next, or interfaces that behave differently depending on the data behind them—used to require significant developer investment. Today, AI coding tools have made it possible for more people on a product team to quickly build and test these kinds of interactions before committing to a direction. That’s opened up a new workflow. A product builder can create a working prototype in code, then move it onto the Figma canvas using Codex to Figma to see the full picture and refine it together. From there, if more work needs to happen in code, they can move back via MCP with the design context intact.

This is the version of AI-assisted design I care about. Not “prompt a shiny UI from a blank page.” A working model becomes the place where designers, PMs, and engineers can see the same problem at once. Figma is still the canvas for style exploration and visualizing complex flows, but the decision surface is becoming more product-shaped.

That matters because prototype fidelity changes what a team is allowed to learn. A flat mockup can test preference and comprehension. A working prototype can test sequencing, edge cases, permissions, motion, and whether the idea survives contact with real constraints. Bringing that earlier into the process should make design less speculative, not less thoughtful.

The Accor example shows why this matters before anyone commits to a build:

Justine opened Figma Make and prototyped something she wouldn’t have had time to build by hand—a webpage that reorganizes itself based on what the user types. Search for “golf” and the page reshapes around properties with golf courses, curated outings, and relevant experiences. Make handled the micro-interactions and transitions, and the Figma MCP server kept everything connected to the brand’s design system. Within days, she had a working prototype ambitious enough to show leadership what was possible—and concrete enough to start a real conversation about what to build next.

Webster’s Affirm example carries the same logic all the way into production:

A PM prototyped the badge variations in Figma Make—going from idea to working prototype in two days instead of the usual six weeks. Designers refined the winning direction on the canvas, and when the team was ready to move that design into production, they loaded the design artifacts into the Figma MCP server and connected it to Cursor. MCP passed the components, tokens, and layout structure directly into the coding environment, where an AI agent generated the front-end implementation. Developers used that as their starting point, building production code that already reflected the designs instead of reinterpreting them from scratch.

Preserving components, tokens, and layout structure turns the prototype into a rehearsal for the real build. It has enough fidelity to expose bad directions early and enough context to keep the winning direction from being rebuilt from memory.

Header image for the Figma blog post on AI tools for going from idea to product.

4 New Ways to Go From Idea to Product With AI Tools

AI tools are changing how teams build products—from where they start to what carries through to production.

figma.com

The line running through Tobias Van Schneider’s interview is simple: designers complain about tools all the time, but the better move is to build the environment you wish you had.

Nikolas Wrobel interviews designer Tobias Van Schneider, founder of Semplice and mymind, and the profile traces that pattern directly: Semplice for portfolios that didn’t fit the platform/template world, then mymind for thinking without the performative noise of social media.

Van Schneider:

How do you protect yourself against consuming, draining effects from Social Media, or disenchanting tech-mechanisms?

This question almost too perfectly leads into what I do everyday. In part, I protect myself by using mymind.com (which I created) — there are no ads, no vanity metrics, no social media features, nothing but myself. Only me and the things I care about. Over the years, mymind has become so valuable to me, it’s the first place I go to look for inspiration. In fact, while I am answering this interview, I find myself going back and forth between old notes and musings inside mymind.

Social media still drains me, just like everyone else, but it’s nice to at least have one place just for myself. And that’s mymind.

Aside from that, I just get offline and out into the world. Or I create something.

Van Schneider isn’t pitching another social layer; he’s describing a private room for memory, taste, and reference.

He later explains where that product idea started:

As with many things, it was a total coincidence. When we initially worked on the mymind product and brand, we didn’t even have the name “mymind” yet. The whole thing was called AWMT, which stands for “As We May Think” and is a reference to an old essay by Vannevar Bush from 1945 in which he wrote about a machine called “The Memex” which was some sort of machine that collects and connects your personal knowledge.

Coming off of that inspiring essay, we came up with a slogan called “Think for yourself” which is sort of the antithesis to the cloud/hive mind of what we call social media today. Especially since we position mymind as a private sanctuary, it just made sense to us.

All of this eventually got me into the rabbit hole to search for ideas for our logo. The classic “Thinker” statue immediately came to mind. I always loved that one, a man deep inside his own thoughts, unfazed by the world around him. But the statue was a bit too literal to me, too well known, too sharp and serious. We needed something more abstract, more playful. Eventually I found out about Cycladic art, originating from the Aegean islands during the Early Bronze Age. Very famous for their minimal and stylized marble figurines. Now, the rest is history. I immediately fell in love with the simplicity of it and it felt like a great canvas to build our visual universe on it. The rest is history (:

Wrobel asks Van Schneider about the conditions he works best under:

The creative me enjoys being alone. Completely isolated, nobody even in the other room. It gives me freedom and clarity to think for myself and be myself. My real creative being thrives in these moments, untouched by the opinions and desires of others.

My ultimate solitude tends to arrive at midnight. It almost transforms me. The dark brings focus. Silence brings new ideas. No voices interrupt, no chance of emails, just me and my thoughts. It’s this time I feel most creatively alive.

Now, add a soundtrack to it and I’m in creative heaven (:

I don’t think every designer needs midnight solitude. But design work does need stretches where taste can form before it becomes consensus. In a work culture that treats collaboration as a default good, that distinction matters.

This is also why the tools question matters. A portfolio system, a private reference space, even a type foundry site are not neutral containers. They either protect the conditions where taste can develop, or they pull the work back toward the defaults of the platform. Van Schneider’s career makes that feel less like a manifesto and more like a working habit: when the available environment makes the work smaller, build a different environment, then keep using it until it changes the work.

Van Schneider’s favorite advice turns complaint into output:

“The best way to complain is to create something” by James Murphy, founder of LCD Soundsystem. It has become one of my guiding principles. It turns useless, negative energy into productive, positive energy.

Portrait of designer Tobias Van Schneider.

Tobias Van Schneider creates the things he wish existed

An interview with Tobias Van Schneider on solitude, taste, and why the best answer to bad tools is to build the environment you wish existed: Semplice, then mymind.

nikolastype.com

Fulya Lisa Neubert, writing for the Slack Design blog, starts with a familiar design handoff problem:

For most of my design career, Figma was where the real work happened. I’d design screens, build prototypes, then hand off the designs for someone else to build. If something felt slightly wrong in production, we’d go back and forth trying to articulate what “it should feel like” in words. This process worked well enough, but there was always a gap between what I could show in a static design tool and what someone would actually experience in the product.

The piece gets concrete when Neubert moves from screens to Slack search. She points to a kind of interaction where static prototypes can suggest the flow, but can’t prove whether the experience works under someone’s hands:

The search experience in Slack is a deeply keyboard-driven feature: typing states, focus management, the way content scrolls and reflows as you interact. These are things you can sketch in Figma and even prototype to a degree, but a Figma prototype can’t tell you whether the focus ring moves correctly between elements when you press tab, or whether a scrolling gradient feels right as content overflows. You need to actually use it.

I don’t read this as an argument that designers need to become frontend engineers. Neubert came in with the basics and figured it out:

I came into this with basic HTML and CSS — enough to roughly understand what I was looking at, but not enough to write it myself. That turned out to matter less than I expected. The first time I described a focus interaction in plain language and had a prototype working in minutes, I stopped thinking of code as someone else’s territory. […]

My setup is fairly straightforward. Cursor is my main environment. I use Figma’s MCP integration to pull in components I’ve already designed and Slack’s design tokens, so I don’t have to rebuild spacing, color, and type from scratch every time. I tried working directly in Slack’s codebase — years of accumulated complexity that made every small change feel like a bigger undertaking than it needed to be. […]

The Figma integration matters because the prototype isn’t starting from a blank toy environment. It pulls real components and tokens into a lighter workspace, which makes the thing shareable without pretending to be production. For Slack search, that means the team can review behavior instead of debating screenshots.

Neubert is also clear about the tax:

AI also doesn’t always preserve what you’ve already built. You prompt it to change one thing, and it quietly breaks something else. If you’re not testing after every turn, you won’t notice until you’re sharing the prototype with someone — or worse, presenting it — and that’s when you realize something’s broken.

That is the right caution. AI changes the distance between intent and working behavior, but it doesn’t remove verification. If anything, it makes the habit of testing after every small change part of the design process itself.

Pixel art illustration by Fulya Lisa Neubert representing designing in a live browser environment rather than a static tool.

Designing Where the Pixels Actually Live

A Slack designer on shifting from Figma to AI-assisted code prototyping—and why static tools can’t tell you whether the focus ring moves correctly when you press Tab.

slack.design

Felipe A. Carriço, a UX designer and AI product builder, turns accessibility guidance into context AI coding agents have to follow with A11Y.md:

A11Y.md is not a guideline. It is an accessibility validation protocol and a persistent context architecture for developing accessible software with AI. It is designed to integrate with AI agent systems and human review workflows to ensure certifiable compliance.

By adopting the mental model of Anthropic’s CLAUDE.md—which acts as a system prompt memory for code generation—A11Y.md translates this architecture into a universal, portable governance layer. Instead of generic coding rules, it forces any coding agent (Claude, Cursor, Copilot) to strictly adhere to WCAG 2.2 AA and ADA standards from the very first line of generated UI code.

I appreciate how operational this is. It pairs well with Joost de Valk’s Website Specification, which treats machine-readable standards as part of what a good site does. A11Y.md brings the same idea into the build process: the generator has to carry the accessibility context while it makes the UI. That matters because accessibility failures in generated code are rarely abstract. They show up as broken keyboard paths, silent error states, and interface logic that only works for the person who can see and click everything.

Carriço is blunt about the difference between reading and changing the workflow:

Reading about accessibility is the first step, injecting it into your code is the real goal. Do this right now in your project:

  1. Download the Rules: Copy the A11Y.md file from docs/en/ to the root of your application’s repository.
  2. Inject into the Prompt: If you use Cursor, GitHub Copilot, or Claude, add this to your global rules file (.cursorrules or Context system):

“Strictly follow the development rules defined in the A11Y.md file.”

  3. Use as a Quality Gate: Before merging important PRs, use the checklist in docs/en/templates/REPORT.md.

If you do not perform the steps above, you are not changing your workflow — you are just reading about the subject.

That is the product here: wiring accessibility into the build process so it changes what gets generated.
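As a rough illustration of what "wiring it in" could mean in practice, here is a minimal Python sketch that checks whether the first two steps from the quoted instructions appear to be done in a repository. The file names `A11Y.md` and `.cursorrules` and the instruction sentence come from the excerpt above; the function `check_a11y_wiring` and its return shape are hypothetical and not part of the A11Y.md project.

```python
from pathlib import Path

# The instruction sentence quoted in the A11Y.md excerpt above.
RULE_LINE = "Strictly follow the development rules defined in the A11Y.md file."


def check_a11y_wiring(repo_root: Path, rules_file: str = ".cursorrules") -> dict[str, bool]:
    """Report whether the first two setup steps look done in repo_root.

    Hypothetical helper for illustration only; it is not shipped by A11Y.md.
    """
    protocol = repo_root / "A11Y.md"
    rules = repo_root / rules_file
    # A missing rules file is treated the same as one without the instruction.
    rules_text = rules.read_text(encoding="utf-8") if rules.is_file() else ""
    return {
        "a11y_md_present": protocol.is_file(),
        "rules_reference_a11y_md": RULE_LINE in rules_text,
    }
```

Calling `check_a11y_wiring(Path("."))` from a project root returns two booleans, which could feed a pre-merge script. It only confirms that the files are in place; whether an agent actually honors the rules still needs human review.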

A11Y.md project banner showing the project name and accessibility badges for WCAG 2.2 AA and ADA compliance.

A context system for building accessible software by default — for developers and AI, with enforceable rules aligned to WCAG.

A persistent context architecture that enforces WCAG 2.2 AA and ADA standards from the first line of UI code—a governance layer for AI coding agents built on the CLAUDE.md mental model.

github.com

Joost de Valk, creator of the Yoast SEO plugin for WordPress, has turned the “what should a good website do?” question into The Website Specification: a platform-agnostic checklist that puts HTML basics, SEO, accessibility, security, performance, privacy, internationalization, and agent readiness in one place.

The useful shift is that the AI-facing work is treated as normal website hygiene. Not a separate “AI strategy” project. Not a prompt-engineering side quest. Just another part of making the site understandable to the systems that now read, rank, quote, and retrieve it.

A platform-agnostic specification of the technical features every decent website should have — from <title> to /.well-known/security.txt, from WCAG contrast to llms.txt. Written for humans and agents.

Ten areas, mapped to widely-accepted standards.

Each topic links back to the source standard — WHATWG, W3C, IETF RFCs, WCAG, MDN, and the organisations defining the modern web.

Whether you ship WordPress, Drupal, TYPO3, Next.js, Astro, Hugo, a Django app, or plain HTML, the spec is the spec. Implementation hints follow it, not the other way round.

I like that standards-first posture. A lot of AI advice still treats the web like a pile of pages to be scraped, summarized, and maybe attributed later. De Valk pulls it back toward contracts: stable URLs, explicit policies, structured data, clean source material, and machine-readable ways to discover what matters.

From the Agent Readiness section:

Agent readiness is a loose umbrella term for the choices that make a website legible to AI agents — chat assistants, autonomous browsers, retrieval pipelines, and any other non-human client that reads the web at scale. None of it is a single formal standard. It is a collection of existing web fundamentals plus a few emerging conventions.

Agents read the same HTML as browsers, but they read it differently. They:

  • Fetch a page, often without executing JavaScript.
  • Strip away navigation, ads, and chrome to extract the main content.
  • Follow links, structured data, and well-known endpoints to discover more.
  • Cache and quote your content in answers, with or without a link back.

If your content is locked behind client-side rendering, your URLs change every release, or your robots.txt blocks the assistants your customers use, you are invisible in that surface. The pages that win in agent answers are the ones that are easy to fetch, easy to parse, and easy to trust.

That’s the part designers should pay attention to. We tend to think of the interface as the thing on the screen. But if agents are part of the audience now, the interface also includes off-screen surfaces: metadata that explains the page, feeds and sitemaps that expose what exists, crawler policies that say what can be read, and curated indexes like llms.txt that tell software what matters.
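To make the "fetch without JavaScript, strip the chrome" behavior from De Valk's list concrete, here is a minimal Python sketch of a naive extractor. It is an assumption about how a simple agent might read a page, not how any particular assistant works; the class `MainTextExtractor` and function `extract_main_text` are invented for illustration.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collect text inside <main>, skipping <nav>, <script>, and <style>."""

    SKIP = {"nav", "script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.in_main = 0      # nesting depth of <main>
        self.skip_depth = 0   # nesting depth of skipped elements
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.in_main += 1
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "main":
            self.in_main = max(0, self.in_main - 1)
        elif tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)

    def handle_data(self, data):
        # Keep only non-empty text that sits in <main> and outside skipped tags.
        if self.in_main and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_main_text(html: str) -> str:
    """Return the visible text of <main> without running any JavaScript."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


server_rendered = "<nav>Home</nav><main><h1>Rooms</h1><p>Book a stay.</p></main>"
client_rendered = '<main><div id="app"></div></main><script>render()</script>'

print(extract_main_text(server_rendered))        # Rooms Book a stay.
print(repr(extract_main_text(client_rendered)))  # ''
```

The second page has nothing to extract until a script runs, which is the "invisible in that surface" case the quote describes.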

De Valk again:

There is no single switch. The items in this category each cover one part:

  • Stable URLs so cached answers stay valid.
  • Structured data (JSON-LD) so agents can extract entities without guessing.
  • Clean semantic HTML so content extraction does not pull in navigation.
  • A robots.txt that names AI crawlers explicitly so your policy is unambiguous.
  • /llms.txt as a curated index of your most important content (emerging).
  • Machine-readable endpoints — sitemaps, RSS, JSON feeds — where they fit.
  • MCP server endpoints for sites that expose tools or actions (emerging).

Most of these also benefit traditional search engines and accessibility. Agent readiness rarely conflicts with the rest of the spec; it just raises the priority of things that have always been good practice.

De Valk’s point is simpler: agent readiness mostly means doing the old web discipline well enough that agents can actually read and trust the site.
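As one concrete check from that list, the sketch below uses Python's standard urllib.robotparser to see whether a robots.txt names particular AI crawlers and what it lets them fetch. The crawler tokens, the sample file, the example.com URL, and the helper names named_agents and audit are assumptions for illustration. Check each vendor's documentation for the user-agent strings it actually sends.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler tokens; confirm current names with each vendor.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

# Hypothetical robots.txt, used only to exercise the check.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Disallow: /drafts/

User-agent: *
Disallow: /admin/
"""


def named_agents(robots_text):
    """Return the lowercased user-agent tokens the file names explicitly."""
    names = set()
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("user-agent:"):
            names.add(line.split(":", 1)[1].strip().lower())
    return names


def audit(robots_text, url, crawlers=AI_CRAWLERS):
    """For each crawler, report whether it is named and whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    named = named_agents(robots_text)
    return {
        bot: {"named": bot.lower() in named, "allowed": parser.can_fetch(bot, url)}
        for bot in crawlers
    }


report = audit(SAMPLE_ROBOTS, "https://example.com/drafts/post")
for bot, result in report.items():
    print(bot, result)
```

In this sample, ClaudeBot is named and blocked from /drafts/, while PerplexityBot is not named at all and falls through to the wildcard rule. That second case is the ambiguity an explicit policy removes.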

The Website Specification homepage, a platform-agnostic reference for what every good website should do.

The Website Specification

A platform-agnostic, full specification of the technical features a good website should have. Built in the open under an MIT licence.

specification.website

Dan Carey leads product at Anthropic Labs, the team behind Claude Code and Claude Design. In a talk on how a three-person team shipped Claude Design in ten weeks, he describes what happened to everyone else after their engineers got fast:

And so once Claude Code took off, the bottleneck moved. The bottleneck moved from building the feature to figuring out the right things to be building for your users, in a lot of cases. So the option was either skip those early steps, just try and decide on the fly, and potentially build the wrong thing really fast, or try to find ways for the rest of us to speed up. So our designers, our PMs, were having trouble keeping up. We needed our own accelerator tool.

Carey just relocated the bottleneck onto the exact work designers and PMs own: figuring out what’s worth building. That’s product discovery becoming the real constraint. When building gets cheap, what’s left to get right is the decision about what to build at all.

How does the team make that call? Not by writing it down:

So we like to use prototypes because documents are imprecise. It’s so easy for two people to look at the same doc and have two different products in mind about what the experience should be. […] Prototypes are more concrete, more visceral. They let you get hands on with the thing and really feel the experience yourself.

They skipped the PRD and the vision docs entirely. A working prototype immediately aligns people, and it doubles as the discovery tool: you build the rough thing to find out what the right thing is.

And it helped that the team was small enough to skip coordination entirely. Here’s Carey:

Everyone on the team does everything. The engineers talk to users, PMs write code, designers do data analysis. All of these things are enabled in part with Claude. And the lines between the roles on this team, they have essentially dissolved at this point. You do have your specialization, you do have the unique perspective and diversity that you bring to a team, but at any moment, any one of these people on this team can talk to 10 users, you can realize what the underlying problem is, you can design a solution to it, you can ship it to users, you can listen for feedback, you can keep iterating solo if you need to.

On Carey’s team, the designer who spots the problem also builds the solution and ships it. That’s the kind of role a lot of designers are now being asked to grow into, and it looks less like a handoff between specialists than one person carrying an idea from problem to finished screen.

Speed doesn’t guarantee you build the right thing, though, and Carey is candid about the team’s misses. They built a set of advanced, fine-grained controls for power users. A few vocal testers loved them—I know I would have. But the usage showed everyone else hated them, and the team pulled the controls in a week. Two lessons came out of it:

So this taught us a couple of things. One, this taught us that we should be a tool that lifts the level of craft for everybody, not just the ceiling on power users. It also taught us that we want to be as open as possible, because there will be users that we never meet the full needs of. There’s going to be some power user out there who wants to do something very specific that we’re not going to support. And that’s what convinced us that we wanted this to be a very open tool. That’s why if you export from it, you get HTML, CSS, JavaScript.

Designing with Claude: From prompt to production

Claude Design lets you describe what you want in plain language and get production-quality outputs. Learn how a small team built a design tool that ships in your brand, from prompt to production.

youtube.com

David Pierce, writing for The Verge, dates the inflection precisely: late 2025, when an update to Claude Code crossed the line from “surprising when it worked” to “surprising when it didn’t.” That’s the moment vibe coding stopped being a demo and started being a tool ordinary people could actually use.

In late 2025, an update to Anthropic’s Claude model turned its Claude Code tool from a code generator that was surprising if it worked to one that was surprising when it didn’t. Suddenly, all you needed was $20 a month and a half-formed idea, and an AI model could build you functional software. If you could explain what wasn’t working, Claude Code could probably fix it. Andrej Karpathy, an educator and researcher who was on OpenAI’s founding team, had called this new behavior “vibe coding.” Suddenly the vibes were off the charts.

The reliability threshold matters more than the headline number. Twenty dollars a month was already true. What changed is that the output stopped breaking when you asked it to do something real. That’s what made the personal software lineage—from HyperCard in 1987 through Lee Robinson’s essay, the home-cooked-app idea, micro-apps, and fleeting apps—turn from a niche aesthetic into something a normal person could actually do over a weekend.

Pierce documents his own version of that weekend: building Timetable, abandoning it, building Spring and forgetting what it did, getting stuck on Twilio bills. The realization that pulls him out of the loop is the one that’s worth dwelling on:

What saved my efforts was the realization that personal software doesn’t have to be built from scratch. Knowledgeable developers might be newly capable home cooks, but the rest of us are more like customers at Chipotle. We don’t make the food, we don’t even really assemble it, but we get to decide what goes where and how it’s served to us. For most of us, the future of software is not building our own Excel from scratch, it’s using the models to build spreadsheets wildly more capable than we could create ourselves. It’s building the Chrome extension for your favorite app that is really only missing a Chrome extension. It’s tweaking the way things look to suit your exact taste and needs.

Most coverage of vibe coding implies the future is everyone becoming a one-person engineering team. Pierce’s actual claim is narrower and more useful: like ordering a Chipotle burrito, you’re picking ingredients and toppings, not running the kitchen. The point is not to replace Notion or Obsidian or Todoist. It’s to bend them an inch closer to how you actually work.

My whole publishing workflow for this blog switched from Payload CMS to a custom admin UI and now a custom Obsidian plugin.

Which brings the conversation to where Pierce lands it: taste.

In this new world, the most important thing you’ll need is taste. Not objectively good taste, necessarily, so much as a keen sense of your own. You need to be like Rick Rubin, the famous music producer, who once told 60 Minutes that what made him successful was not any particular technical ability, but “the confidence I have in my taste, and my ability to express what I feel.” Rubin practices that art with A-list celebrities; you need to be able to do it with AI. Otherwise, you’ll land in what Lovin calls “doom loops,” telling your chatbot only what you don’t like and counting on the model to be the creative one. That way lies madness — and bad software.

Yan Liu’s working definition of taste cites the same Rubin formula—sensitivity times standards—and that’s the part of Pierce’s argument that designers should sit with. The $20 vibe-coder has the tool. What they often don’t have is the trained eye to know when Claude’s purple gradient is wrong, or why the icon looks like a butthole instead of a planner. Pierce learned this the hard way and concluded, sensibly, that he didn’t have opinions about databases but did have opinions about typefaces. That’s the right diagnosis. It also undersells what designers actually do—Raj Nandan Sharma’s warning about taste-as-end-of-pipeline selection is the other half of this. If designers don’t show up as authors here—shaping what gets generated, not just thumbs-upping it after the fact—the personal software era will produce a lot of bespoke purple gradients and not much else.

Illustrated hero image for The Verge's feature on the personal software revolution and vibe coding.

Welcome to the personal software revolution

AI is empowering a generation of vibe coders to build exactly what they want. The personal software revolution is here.

theverge.com

Michael Riddering brings Tommy Geoco on Dive Club fresh off field visits to Vercel, Perplexity, Metalab, Ramp, and Snowflake. Geoco and his team are making a documentary after roughly 200 conversations with designers and design leaders this year. The survey finding he leads with is the one I would have least expected: designers who have moved more of their work into AI-assisted prototyping are also more satisfied with their workflows. The hierarchy of who is actually doing that work is the part worth sitting with:

The number one thing that stood out to me was that designers who are currently vibe coding are more satisfied with their workflows. […] And I did not expect that. […] People seem to dig it in this survey. […] It’s the people who are currently doing the majority of their workflow on vibe coding activities. It’s design engineers. That makes sense. Lead principals. [After that] it’s non-designer roles, which might be students and researchers. Then it’s managers. And then it’s your general junior mid-level IC. And that part was fascinating that managers are doing more than junior and mid-level ICs. Either things are trickling down and people are experimenting and then they’re going to pass learnings down, which is kind of what we’ve seen on location. But it also might mean that like some managers or teams haven’t yet made room for the rest of the team.

Design engineers and leads at the top is unsurprising. Managers above juniors and mid-levels is the inversion, and it remains basically unchanged from two years ago, when Geoco’s 2024 survey found the same thing.

Leadership-IC Divide. Leaders adopt AI at a higher rate (29.0%) than ICs (19.9%)

So what’s the read? Geoco gives it the generous read first—learnings cascading down—and then concedes the other possibility: some teams haven’t made room for the rest of the team. Riddering puts it more bluntly: “I’m looking at a bunch of junior and mid designers that are getting cut out of the process.”

The other finding is that 59% of designers have built their own tool for their workflow. The example Geoco brings back from Vercel makes the builder-mode shift concrete:

When I went over to Vercel, they had this brand designer, who had never coded before. And now was vibe coding a tool. Their marketing team would put out blog posts. And they were like, “Why does the design team need to create the OG blog post cards for every page? That’s not a good use of [their time].” So he built a tool that just allowed them to insert any sort of images. And it just already had all of the branding and the sizing baked in. And they just roll these [tools] out quickly. And I’m like, that just became a tool, an internal tool. That’s cool. And so because it was really interesting that they started referring to him as a brand engineer… And I’m like, okay, that kind of qualifies it actually.

A designer who had never coded solves an actual marketing-team problem, ships the tool, and the role title arrives after the work. That is how the next batch of “blank engineer” titles is going to land. Riddering then describes how the orchestrator pattern works in his own day-to-day, giving a working designer’s concrete account of the workflow I have been writing about as orchestration:

Part of me is almost slightly self-conscious about it. But I do the vast, vast majority of my messy explorations with AI now. I feel like I have made the jump to the quote unquote creative director where I’m just working with AI to show me a certain thing 50 different ways. And then I’m pulling the pieces that I like and then combining them again. And finally I get to somewhere where I’m like, yep, that’s good. And then I take that from Paper, run it through Claude Code, and now it exists on localhost. And then I will sweat the details and actually do the precision designing in code, which is, that’s crazy, man. That’s a very, very different workflow than I’ve done at any point in my career.

The orchestrator gap is opening where I thought it would. What I did not account for is who is getting invited into that work first. The data Geoco surfaces points to leads, managers, and design engineers getting more chances to build with AI than junior and mid-level ICs.

Here’s a hypothesis I’ll put out there: leads are more used to directing. I’m personally comfortable orchestrating and being the editor because I’ve been a creative director and leader for so long. The loop is right there: frame, review, direct.

Tommy Geoco - The state of the design industry right now

Tommy Geoco has been visiting today’s top design teams—Vercel, Perplexity, Metalab, Ramp—to study how their workflows are changing with AI. He joins Dive Club to share what he’s learned.

youtube.com

After watching six agents design an app together in Pencil and spending a little time in Paper, I’ve been waiting for Figma to answer. Rodrigo Davies and Tammy Taabassum, writing on the Figma blog, finally announce it: a native design agent on the canvas, not bolted on through a separate app or a third-party MCP client.

Davies and Taabassum open with the pitch:

Designers need purpose-built tools that serve the essentials: exploration, experimentation, collaboration, and precision. Figma was built as a multiplayer canvas to make all of that possible. As teams adopt agentic tools to build products more quickly, false choices are emerging: Speed or precision? AI generation or direct manipulation? You shouldn’t have to choose.

Earlier this year Figma opened the canvas to third-party agents through its MCP server, letting Claude Code, Codex, and other agents push designs into a Figma file. That move covered the integration story. This one covers the in-app story:

That’s why we built the Figma agent. Our goal was to create an agent fluent in Figma and native to the way teams work. That meant making Figma itself legible to a model in ways that aren’t possible with third-party tools—with deep context on your components, tokens, standards, and best practices.

A third-party agent reaching in through MCP has to translate every request through a protocol; a native agent already speaks the file format. It knows your components, your tokens, your variables. That’s the gap between an agent that can edit a Figma file and an agent that lives in one.

The use cases Davies and Taabassum walk through—going wide on style explorations, bulk-updating variables across a design system, distilling comment threads into actionable plans—are the work designers were already paying the tax on. On exploration specifically:

The best designs rarely come from the first idea—or the first prompt. Exploring directions, comparing approaches, and iterating is already core to how designers work. Our agent will help you cover more ground in less time.

Renaming variables across a file, repeating padding changes through an entire flow, swapping one component for another across a dozen screens: that’s the busywork the agent is perfect for. The taste call on which direction to ship stays with the designer.

Davies and Taabassum close with:

Figma’s agent is embedded where the work already happens. There’s no toggle tax, no context switching, no learning curve. You stay in Figma and your team stays in the loop. We built this with one goal: to help you work faster without compromising on quality and craft.

That’s the competitive answer to Paper and Pencil. Agent-native canvases get a head start by not carrying any legacy assumptions; Figma carries millions of files and the design systems inside them. The bet is that the install base plus a fluent-in-Figma agent beats a greenfield canvas plus a generic one. We’ll see who’s right once the beta opens up.

Hero image from Figma's blog announcing the new on-canvas design agent.

The Figma Design Agent is Here

Starting today, work with an agent that is built for Figma—directly on the canvas.

figma.com

My terminal setup these days is cmux layered on top of Ghostty, so I can run multiple workspaces side by side without losing my place. Most of my actual work happens there now, as the primary surface.

MC Dean spent real time building UIs for her designpowers agents, looked at what she’d made, and tore them down:

I’d taken something that was direct, alive in a particular way, that would show you its thinking if you let it, and I’d dressed it up. Made it presentable. Wrapped its reasoning inside a crisp ui. In doing that I’d introduced distance between the person using it and the thing they were actually talking to and trying to collaborate with. I’d made it more comfortable, predictable and less true. I killed the delight of using designpowers.

Dean on why she killed it:

The GUI was scaffolding. Brilliant, necessary, world-changing as an innovation, but nonetheless created around a fundamental limitation so invisibly useful that we forgot what it was for.

The scaffolding was for humans who couldn’t speak the machine’s language. When the machine starts speaking ours—even partially, even just inside certain conversations—the scaffolding becomes friction for anyone trying to learn how the underlying system actually thinks.

Dean then connects this back to design craft, instead of letting it become a “designers should learn the terminal” finger-wag:

Designers are really good at this important way of thinking already. Every time you decided how an error message should make a person feel. Every time you chose what to surface and what to hide. Every time you designed for the person who didn’t fit the assumed user, you were encoding values into a system. That work doesn’t go away when the interface does. It moves upstream, into the reasoning layer itself, and it has no canvas. That’s design literacy for the world that’s coming.

Dean’s argument pairs with the other end of the same expansion: designers gaining authorship downstream in the code. The easing curve and the hover state at one end; what the agent gets to surface and who counts as the assumed user at the other. The reasoning layer is the upstream end without a canvas to hide behind.

Nick Babich asks if the old surface still earns its place. Dean is pointing at where the new surface already is.

Dean closes with:

We are early enough that the agents are still legible. You can still watch one think. You can still feel the shape of how it reasons before it’s been wrapped in chrome and shipped with a logo.

That window is why I bother with cmux at all. Ghostty by itself is already a genuine pleasure to use; cmux lets me keep a Claude Code session running per project without losing context when I switch. Just enough plumbing to make watching agents think a habit rather than a stunt. Dean’s right that the legibility won’t last. Worth being here while it does.

Hero image for MC Dean's Substack essay on stripping the UI off her AI agents.

The Terminal Belongs to Designers Too

MC Dean built beautiful UIs for her AI agents, then looked at what she’d made and took them all down. The GUI was scaffolding. The terminal lets you watch the agent think. Design craft moves upstream into the reasoning layer, where it has no canvas.

marieclairedean.substack.com

Nick Babich, writing in UX Planet, takes inventory of where Figma still earns its place once teams stop treating the mockup as the deliverable:

One thing is clear: the conventional process in which UI and UX designers spend hours and days pushing pixels to create perfect layouts is no longer the reality for many organizations. The reason is simple: in the AI era, time-to-market has become a critical metric, and most companies would rather ship a “good enough” product quickly than spend extra time perfecting every detail.

The concept of Figma as a design tool originated from the conventional design process. You could say that Figma is an almost perfect design companion for designers who follow a traditional UI/UX workflow.

But the problem is that the conventional design process is no longer the reality for most organizations.

Organizations that embrace rapid prototyping are switching to tools that allow them to build and ship quickly. Instead of starting with static UI mockups in Figma, they jump straight into the prototyping phase using tools like Claude Code. In this phase, teams create coded prototypes that later evolve into fully functional products.

Figma’s role is narrowing from everything-tool to exploration-and-iteration tool, and narrowing is not the same as dying. Babich is now drawing the lines around what that specialized future actually looks like: design systems (especially the ones already living in Figma), complex enterprise workflows with real business logic, and the brand and visual-identity work where taste is the whole point.

On Figma Make, Babich is blunt:

But the problem is that Figma Make is still nowhere near tools like Codex or Claude Code in terms of output quality and overall user experience. Claude Code and Codex are significantly more capable, flexible, and comfortable for rapid product development workflows. Even for simple tasks like creating a prototype of design imported from Figma, Make tends to add a lot of visual defects.

I scored Figma Make 58 out of 100 at launch. It has improved since, but Babich is right about the gap. Make is competing against tools that were born for code generation against a working repo; Make was retrofitted onto a vector editor. That difference shows up in every prototype that looks fine until you zoom in.

On design systems, Babich:

In other words, you don’t necessarily need to maintain your design system in Figma; as long as you can provide access to a GitHub repository containing your design system, you’re in a good position to generate consistent interfaces.

If the design system can live in the repo and the agent can read it directly, the Figma library becomes a mirror rather than the source. That doesn’t kill the Figma file. It does change who has to maintain it and why.

Header illustration for Nick Babich's UX Planet essay on Figma's relevance in the AI design era.

Is Figma Still Relevant in the AI Design Era?

Nick Babich on Figma’s narrowing role: time-to-market killed the mockup-then-handoff workflow Figma was built for. Babich argues it doesn’t die—it specializes into design systems, complex enterprise workflows, and brand work where taste is the whole point.

uxplanet.org

Open four agent windows at once and the day disappears in a way that feels productive but isn’t. David Hoang, writing in Proof of Concept, puts it plainly:

At times, HITL [human-in-the-loop] agent orchestration feels addictive like Candy Crush or scrolling social media. Every prompt shows a stream of tokens and visible progress being made. You sit and wait to hit the number 2 or continue prompting. Instead of doom scrolling, you’re doom building; a sense of productivity which leaves you not doing anything else.

To be abundantly clear, I’m not against HITL and it’s a great way to build. What I’m saying is the massive productivity gains take a toll on you. I’ve shipped real work this way; being locked in for entire afternoons and evenings to prompt sessions. Sometimes I get good outputs and other times I don’t get anything valuable.

The orchestration tax is like the coordination tax at work. I’m feeling like I’m building but really air traffic controlling in parallel. You are reading partial outputs, deciding which to merge, which to discard, which to re-prompt. It’s a job, and an important one, but it’s not the deep work in design, writing, or thinking I need to do. That is a real job. It is not, however, the same job as design or writing or thinking. It uses a different part of you and it depletes a different reservoir. By the time I sit down to actually draw something or write a paragraph that matters, the reservoir is empty.

I orchestrated my way out of having anything to say.

Hoang’s analogy to coordination tax—the meeting load that eats the day at any tech company—is exact. Watching a token stream and deciding what to keep is real work. It is not the work you sat down to do. Orchestration spends from the same reservoir or account that making spends from, and you do not feel the withdrawal until the end of the day when you go to write the paragraph and there is nothing in the tank. Hoang’s tactical answer is to switch defaults: human-in-the-loop for the few things that benefit from your synchronous attention, human-on-the-loop for everything else, with a real review block on the calendar.

The shift is from watching to bracketing. Agents need start conditions and end conditions, not a babysitter in between.

Header illustration for the Escape from agentic loop essay on Proof of Concept.

Escape from agentic loop

David Hoang on the cognitive cost of orchestrating four agents at once: the productivity feels real, but it depletes the same reservoir you need for design, writing, and thinking. He calls it the orchestration tax.

proofofconcept.pub

Brandon Harwood opens with Picasso’s Guernica. He asks you to look at the painting, then tells you the story behind it—the bombing of the Basque town, the civilian deaths, Picasso’s intention to communicate that horror—and asks you to look again.

If you didn’t know the story of this painting beforehand, now you do, and it might strike a different chord, if just slightly. The details of the painting now have the context that shows us what Picasso was thinking when he painted Guernica. […] It’s this kind of context that drives meaning in art. Guernica is not just a painting. It’s communication.

Harwood uses it to draw a line between what AI can generate (the aesthetics of a thing) and what humans build (the context that makes a thing communicate). His answer: instead of asking AI to make meaning, design around the fact that it can’t.

Meaning Machines are, at their core, “signifiers, randomized into a fixed grammar, and read for new meaning.” […] The randomized signifiers are the contextual data surrounding our creative pursuit, the data the AI is trained on, and the relationships built on that data through its training. These signifiers, the data, are then placed into a fixed grammar through agentive interaction and/or agentic actions, and the user can then interpret the result to stimulate their creativity, build new meaning, or explore ideas they might not have considered before.

Tarot doesn’t know what your week looks like. Oblique Strategies doesn’t know what song you’re stuck on. The cards work because they hand you raw material and you do the interpretation. Harwood’s claim is that an LLM, used right, can sit in that same chair. Provoke the human. Dr. Maya Ackerman calls this same arrangement “humble creative machines”: the AI is not the creator, it’s the prompt the creator responds to.

Harwood breaks co-creative AI into three roles:

The Puller: The AI system gathers information about the context the user is working in through active question generation and passive information collection on the works. […] The Pusher: The AI system uses some/none of this context to synthesize considerations for the user to employ throughout their creative journey. […] The Producer: The AI system creates artifacts for use as elements of the users’ larger creative output.

The Puller / Pusher / Producer vocabulary is what I wish more design teams had before they shipped their first AI feature. Each role is a constraint, a way to keep the human in the chair the work actually belongs in. Most AI tools for creatives flatten all three into one button that produces a finished thing. Harwood’s whole argument is that the finished thing is where the meaning has to originate; it can’t be the destination.

Pablo Picasso's *Guernica*, the black-and-white anti-war mural depicting a bull, a screaming horse, a fallen warrior, and figures in anguish.

Collected consciousness

Brandon Harwood opens with Guernica and argues that AI cannot carry meaning or intention—but constrained to three supporting roles (Puller, Pusher, Producer), it functions as a ‘meaning machine’ that amplifies creative judgment instead of replacing it.

doc.cc

Peter Yang spent the last few months running OpenClaw, Hermes, Claude Code, Codex, and Gemini through ten capabilities he thinks a personal AI agent needs to handle. The headline is in his subtitle: nobody has won yet.

Yang on OpenClaw, an open-source personal-agent platform:

I estimate that 10% of my time with OpenClaw is spent fixing it instead of using it. Examples: It forgot it had access to edit Google Docs. It randomly started using a robot voice instead of the one I like. It breaks half the time after every update.

He switched to Hermes (a newer personal-agent platform from Nous Research) anyway:

If OpenClaw’s maintenance tax is wearing you down, give Hermes a try. A week in, it’s been more reliable for me.

Yang’s full comparison of Claude Code, Codex, and Gemini—plus the stack he ends up running—is in the post. His advice for the rest of us:

Pick one or two agents that work for you based on the pros and cons above and just commit.

His promise to anyone who picks one and stays:

Once you have an agent that’s available 24/7 and can actually get work done for you, you’ll never go back to a regular AI chat interface again.

Promotional hero illustration for an article comparing OpenClaw, Hermes, Claude Code, Codex, and Gemini as personal AI agents.

The Race to Build a Personal AI Agent (And Why Nobody Has Won Yet)

Everyone wants to build an AI chief of staff. Here’s my honest take on the pros and cons of OpenClaw, Hermes, Claude Code, Codex, and Gemini.

creatoreconomy.so

Luke Wroblewski shared his notes from the Design Futures Assembly, a gathering of about a hundred senior designers and leaders from AI labs, big tech, and startups in San Francisco:

When everyone can ship, you get a different kind of problem. One design leader described it perfectly: they let everyone build and push whatever they wanted. And you could feel it in the product, because nothing made sense together.

This is the part of the AI-in-design story that the toolkit numbers obscure. Wroblewski reports roughly half of designers had shipped AI-generated code to production this year, and that the typical designer’s toolkit had doubled in size over twelve months. Those are real numbers. But once production stops being the bottleneck, the bottleneck moves. A single word surfaced repeatedly:

Several people at the assembly used the word “editorial” to describe where design leadership is heading. Less about making the thing, more about deciding what gets made and ensuring it all holds together. The skill of saying no is becoming one of the most important skills in the profession.

The “saying no” line echoes something Chad Johnson wrote a few weeks back: the designers who shape direction “learn to say no with evidence and to disagree without drama.” The Assembly’s framing makes that posture mandatory at a portfolio level, not just on individual features. One tool company founder, Wroblewski notes, preferred “coherence”: the sense that a product came from one shared point of view. I like that word better too. Coherence describes the thing the user actually feels.

Design Futures Assembly event header image from Luke Wroblewski's notes on the San Francisco gathering.

Design Futures Assembly

Half of designers ship AI-generated code to production. Wroblewski’s notes from the Design Futures Assembly land on a new role: editorial leadership.

lukew.com

Thariq Shehzad, on the Claude Code team at Anthropic, has switched from markdown to HTML as his default agent output format. The reasoning is more honest than a format-war argument would suggest, because it’s about what humans will actually read. He opens by acknowledging what markdown was for:

Markdown has become the dominant file format used by agents to communicate with us. It’s simple, portable, has some rich text capability and is easy for you to edit. Claude has even gotten surprisingly good at using ASCII to make diagrams inside of markdown files. But as agents have become more and more powerful, I have felt that markdown has become a restricting format.

Then the pivot:

As Claude is able to do more complex work, it is also writing larger and larger specs and plans. In practice, I’ve found I tend to not actually read more than a 100-line markdown file, and I certainly am not able to get anyone else in my organization to read it. But HTML documents are much easier to read, Claude can organize the structure visually to be ideal to navigate with tabs, illustrations, links, etc.

When the spec gets long enough that you stop reading it, you’ve quietly moved from review to rubber-stamp. Shehzad’s answer isn’t to ask Claude for shorter specs. It’s to make the artifact something a human will actually open, scroll, and share. A controllable, shareable artifact is most of what made personal computing legible in the first place; HTML is the format that already does it.

He puts the trade-off honestly when the obvious objection comes up:

While markdown often uses fewer tokens, I’ve found that the added expressiveness of HTML and the much higher likelihood of me reading it means I get overall better output. With the 1MM context window in Opus 4.7, the increased token usage is not really noticeable in the context window.

And the close is the real argument:

The real reason I use HTML is that I feel much more in the loop with Claude. I had begun to fear that because I had stopped reading plans in depth I would simply have to leave Claude to make its choices. But I am happy to say instead that I feel more in the loop than ever before when using HTML.

Header image accompanying Thariq Shehzad's post on switching from markdown to HTML for Claude Code agent outputs.

Using Claude Code: The Unreasonable Effectiveness of HTML

Thariq Shehzad on Anthropic’s Claude Code team switched his agent output from markdown to HTML — because what keeps Claude honest is what humans actually read.

x.com

Owen Williams, a design manager at Stripe, sat down with Claire Vo on How I AI to walk through Protodash, the internal prototyping tool he has spent the last eighteen months building. What sticks is what Protodash has done to the handoff. Williams, describing the Radar fraud-detection team:

They literally have a pull request of a prototype that I had I see an engineer working on and I’m like this has never happened ever in my career as a design manager. They’re like “I’ll just use the prototype as the source of truth” and they can just take it and do that. There’s a huge change — not having to red line a Photoshop file or all of that stuff.

That’s the part that matters. The prototype is the code, in the same components, ready to be picked up. Protodash gets there by constraining generation: a bundle of Cursor rules, a router and chrome scaffold, and Stripe’s design system (Sail) exposed via an MCP server. The off-the-shelf tools—v0, Cursor by itself, Claude Design—produce what Williams calls “blurple slop” because they hallucinate components. Wire the generator to the actual system and the output stops looking like a Tailwind demo and starts looking like Stripe.
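One way to picture what "wire the generator to the actual system" buys you is a check that flags components the generator invented. This is a minimal sketch, not how Protodash works internally (the interview doesn't go into that). The component names are hypothetical stand-ins, not Sail's real exports, and the generated output is assumed to be JSX-like markup.

```python
import re

# Hypothetical component names standing in for a real design system's exports.
KNOWN_COMPONENTS = {"Button", "Card", "TextField", "Table"}


def find_unknown_components(source: str, known: set[str]) -> list[str]:
    """Return capitalized JSX-style tag names that are not in the allowlist."""
    used = set(re.findall(r"<([A-Z][A-Za-z0-9]*)", source))
    return sorted(used - known)


# Illustrative generator output containing one invented component.
generated = """
<Card>
  <FancyGradientHero title="Payments" />
  <Button>Continue</Button>
</Card>
"""

unknown = find_unknown_components(generated, KNOWN_COMPONENTS)
print(unknown)  # ['FancyGradientHero']
```

A check like this only catches made-up names after the fact. Exposing the real component list to the generator up front, as Williams describes, attacks the same problem from the other side.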

The fidelity jump changes the room, too:

It’s sort of been this very transformative thing because all of a sudden I’m sitting in these design reviews and it’s so convincing that I’m like, is this the real product or am I looking at something fake?

This is what Tara Tan predicted: the moat in AI design tooling is the design-system graph, and whoever makes that graph machine-readable for agents wins the enterprise. Stripe did exactly that internally, with a homemade stack, which makes it an uphill battle for anyone trying to sell a generic tool for this use case.

The interesting thing is who shows up to use it. Williams says Protodash is now used more by PMs than designers; PMs paste a PRD from Google Docs and get back a working flow before designers are pulled in. That tracks with the Figma Make case studies — PM-led prototyping isn’t theoretical anymore.

Williams is clear-eyed about what the tool can’t do:

How can I make sure that the tool knows enough to be dangerous? It gets to 80%. But like that taste, that craft is like, that’s why designers will always exist, in my opinion. Like they know how to elevate the experience. Like this thing knows how to use the components. The components are well designed, but it’s not going to be perfect. And we are here to steer them.

The internal AI tool that’s transforming how Stripe designs products

How Stripe’s internal AI prototyping tool, Protodash, ties generation to the design system and turns the design-to-engineering handoff into a pull request.

youtube.com

Talking to Peter Yang, Ravi Mehta, former CPO of Tinder and now teaching AI prototyping at Reforge, walks through a live demo of building the same Spotify-style genre page three different ways. The first attempt uses a short functional prompt and produces something that, in Mehta’s words, kind of feels like AI slop. The third uses what he calls a full-stack context bundle: a functional spec, a 20-minute Figma wireframe, and a JSON file of real album data pulled together in Claude with an MCP server. The difference between the first output and the third is night and day.

His definition of the shift:

Context engineering is designing and building systems that provide an AI model with the right information and tools to accomplish the task. And I think a lot of the common mistake I see with prototyping is people don’t think about context within that 360 degree way. And as a result, people just, you know, write a quick prompt or a quick little mini spec and expect the prototype tool to be able to create something as high fidelity as what they used to create before when they had all of these different artifacts that are a critical part of the product lifecycle.

That definition will sound familiar to anyone who saw Philipp Schmid’s framing of context engineering when it first circulated. Same emphasis on “right information and tools.” It’s the working definition the field has settled on. What Mehta adds is the concrete answer to “okay, what are the three things you actually have to assemble?” Functional context (a spec), visual context (a wireframe), and data context (real structured JSON, not lorem ipsum). Skip any of them and the prototype either looks generic, behaves wrong at edge cases, or breaks suspension of disbelief the moment a real customer touches it.
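The three-part bundle is easy to treat as a checklist. Here is a rough sketch, assuming a hypothetical `ContextBundle` type of my own invention (Mehta doesn't show anything like it), that reports which kind of context is missing before you hand the bundle to a prototyping tool. The album record is a placeholder to keep the example self-contained; the whole point of data context is that you would use real records.

```python
import json
from dataclasses import dataclass


@dataclass
class ContextBundle:
    functional_spec: str  # functional context: what the page should do
    wireframe_path: str   # visual context: path to an exported wireframe
    data_json: str        # data context: real records as a JSON string

    def missing(self) -> list[str]:
        """List which of the three context types are absent."""
        gaps = []
        if not self.functional_spec.strip():
            gaps.append("functional")
        if not self.wireframe_path.strip():
            gaps.append("visual")
        try:
            records = json.loads(self.data_json)
        except json.JSONDecodeError:
            records = None
        if not records:  # unparseable or empty data counts as missing
            gaps.append("data")
        return gaps


bundle = ContextBundle(
    functional_spec="Genre page: list albums, filter by decade.",
    wireframe_path="",  # no wireframe attached yet
    data_json=json.dumps([{"album": "Example Album", "year": 1999}]),
)
print(bundle.missing())  # ['visual']
```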

The piece I want to underline is his defense of visual thinking, because the “designers are obsolete” takes haven’t stopped, and Mehta gives them a clean rebuttal:

So if you start to think differently about the different types of context that are available, you can actually get much more specific and have a lot more control over what gets built and build something that’s a lot more robust. This is functional context. The next level that is really important is visual context. […] And so here, I very quickly in Figma, just taking 20 minutes, done a wireframe, and sort of outlined what I want this interface to look like. […] The prototype needs to have a level of fidelity that’s hard to get with sort of traditional prompting techniques.

Twenty minutes in Figma, then a short prompt that says “use the attached wireframe.” A wireframe does what a 17-page PRD and three rounds of trying to describe a layout in English to the model can’t. The wireframe is part of the input to the deliverable now.

The corollary cuts the other way too. If the wireframe is now an AI briefing document, the people who can produce a decent one in twenty minutes have a real edge over the people who can’t. That’s still designers, still us. It’s just that the wireframe now feeds the model directly, not only the engineer reading the spec next sprint.

Everything You Need to Know About Context Engineering in 40 Minutes

Ravi Mehta builds the same Spotify-style page three times to show how functional spec, visual wireframe, and real data each level up an AI prototype.

youtube.com

I wrote about this whole family of files in my recent newsletter: DESIGN.md, SKILL.md, SOUL.md, the markdown artifacts you write so an agent can read them. Nick Babich has the practitioner walkthrough for the DESIGN.md flavor of it, specifically the version that Google Stitch reads when it generates a screen. He describes the format directly:

DESIGN.md is a markdown file with two layers: YAML front matter that contains machine-readable design tokens (exact hex values, font properties, spacing scales) and Body that features a human-readable design rationale.

The two-layer split is right. The YAML is the part the agent can’t argue with: primary: "#d97706" is #d97706. The body is where you tell the agent why, and it has to be written like prose, not a config file. Babich’s philosophy section is where I’d point a designer who’s about to write their first one:

Unlike a traditional specification that often has very specific details that designers should follow when crafting a new design, DESIGN.md is less prescriptive in its nature. It creates a solution foundation for AI tools (colors, typography, corner radius) while providing enough freedom to alter the format for domain-specific needs. Another thing is that DESIGN.md is a living artifact, not a static config file. It should evolve as your design evolves.

The “less prescriptive” line is counterintuitive. You’d think the whole point of feeding rules to an agent is to be more prescriptive, not less. But Babich is right about the shape: pin down the tokens, leave the application loose, refine the file as the agent surfaces edge cases you didn’t think about. These files hold what we used to keep in our heads and call taste, and you don’t write taste like a requirements doc. You write it like a brief, and you keep editing it.
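To make the two-layer split concrete, here is a small sketch that separates the front matter from the prose body. It assumes flat `key: "value"` tokens only; a real DESIGN.md may nest its tokens, and you would reach for a proper YAML parser rather than this hand-rolled one. Only `primary` comes from the example above. The `radius` token and the rationale sentence are invented for illustration.

```python
DESIGN_MD = '''---
primary: "#d97706"
radius: "8px"
---
Use the primary color sparingly, for the one action that matters on a screen.
'''


def split_design_md(text: str) -> tuple[dict[str, str], str]:
    """Split a DESIGN.md-style document into flat tokens and a prose body."""
    # The text starts with "---", so the first piece is empty.
    _, front, body = text.split("---", 2)
    tokens = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        tokens[key.strip()] = value.strip().strip('"')
    return tokens, body.strip()


tokens, rationale = split_design_md(DESIGN_MD)
print(tokens["primary"])  # #d97706
print(rationale)
```

The tokens come back as exact values, and the body comes back as a paragraph you can keep rewriting.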

Article header illustration for Nick Babich's UX Planet piece on the DESIGN.md format.

What is DESIGN.md and How To Use It

One of the biggest challenges with AI design generators is producing consistent output. Even with detailed instructions, AI can drift away from the spec.

uxplanet.org

PJ Onori built a tool that A/B tests his design system against AI agents, and he’s careful to say it isn’t impressive:

Two groups of agents get spun up, and both are given the same prompt to make an interface. One group’s given the old design system. The other is given our new one. Each agent provides feedback on problems faced after it’s done. Once all agents finish, the builds are evaluated on a bunch of crap and a report is generated.

The list of what the tool measures is long: timing, lines of code, code variance, fix attempts, components used, accessibility, performance, inline styles, visual diff, token usage, agent feedback. Onori, on the test he ran when he wasn’t sure his documentation was actually doing the work:

I was starting to question if documentation was making things better. Maybe component improvements was doing the heavy lifting–who knows? So, I ran a couple tests without documentation… The documentation was clearly the heavy lifter. […] Documentation is essential for systems that agents don’t have a lot of reps with. I’ve started to add a “For agents” section in the docs. That section is the dumpster for “get it in your silicon head” training.

The “For agents” section is a small idea with a real implication. Documentation has historically been written for one audience. Now there are two, and as Onori says elsewhere in the post, the second one needs “the same damned point” repeated five or six times and doesn’t care if the prose is ugly. His instinct is to wall that off so humans don’t have to read it.

Onori is publishing measurements where most people are publishing takes. That’s the missing piece in the design-system-as-moat argument: somebody actually testing whether agents do better with a well-built system than a worse one, and showing the numbers. Onori, on the closing caution:

There’s a lot of noise in the output, feedback, and analysis–otherwise know as everything. That noise compounds fast. Think of the telephone game–then think about what that’d do to a design system. […] Feedback needs to go through a BS filter. […] The feedback part of the analysis is helpful. Make no mistake. But it needs to heavy interpretation.

The telephone game is the right picture. A design system that updates itself based on agent feedback that’s been generated by other agents and analyzed by a third agent is going to drift somewhere strange in a small number of iterations, and nobody on the team will be able to reconstruct why. Onori’s tool stops short of that on purpose: it produces measurements, and a person reads them.
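The shape of the comparison Onori describes is simple enough to sketch: two groups of builds, the same metrics for each, and an averaged report for a person to read. This is my own illustration, not Onori's tool. The `BuildResult` type is hypothetical, it covers only two of the metrics he lists, and the numbers are made up purely to exercise the code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BuildResult:
    group: str          # which design system the agent was given: "old" or "new"
    fix_attempts: int   # how many times the agent had to repair its build
    inline_styles: int  # inline styles found in the output


def summarize(results: list[BuildResult]) -> dict[str, dict[str, float]]:
    """Average each metric per group so a person can compare them."""
    report = {}
    for group in sorted({r.group for r in results}):
        rows = [r for r in results if r.group == group]
        report[group] = {
            "fix_attempts": mean(r.fix_attempts for r in rows),
            "inline_styles": mean(r.inline_styles for r in rows),
        }
    return report


# Made-up numbers, not Onori's data.
results = [
    BuildResult("old", fix_attempts=4, inline_styles=10),
    BuildResult("old", fix_attempts=2, inline_styles=6),
    BuildResult("new", fix_attempts=1, inline_styles=2),
    BuildResult("new", fix_attempts=3, inline_styles=0),
]
report = summarize(results)
print(report)
```

The sketch stops at the report on purpose. Nothing in it feeds results back into the design system, which matches where Onori draws the line.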

Stippled illustration of a person sitting at a desk, leaning forward and writing or working on something.

Testing agents on design systems

It’s really easy to say agents are able to use a design system. It’s another thing to prove it.

pjonori.blog

Nick Babich writes about agents in UX Planet. The piece pairs well with his earlier writeup on Claude skills, since the two words get used interchangeably and they are not the same thing. Babich opens with the plain-language version:

Think of an AI agent as a program you run when you need to solve a particular problem in design. For example, you can create an AI agent that helps you with usability testing, code review, UI/UX audit, etc.

A program you run is the right mental model. A skill, the way Babich described it in his earlier piece, is a recipe: a markdown file Claude reaches for when a task matches. An agent is what runs once Claude has the recipe in hand. It carries state across steps, picks tools, reports back.

Babich’s four attributes of a well-designed agent get at that distinction without saying it out loud:

  1. Good clarity (intent alignment). A strong agent understands what success looks like, not just the task. This understanding helps it translate vague prompts into clear objectives.
  2. Context awareness. Good agents maintain and use context effectively. Not only do they remember previous steps, constraints, and user preferences (which is well-expected behavior nowadays), but they also adapt output based on the environment (tools, data, stage of workflow).
  3. Tool orchestration. Agents can perform the workflow autonomously and they have the ability to use the right tools for a task at hand is what makes an agent so powerful. Well-crafted agents can chain tools together into workflows, and they don’t overuse tools when simple reasoning is enough.
  4. Explainability (transparent reasoning). When you interact with an AI agent, you need to understand why something happened. Thus, an AI agent should provide a rationale behind decisions surface assumptions, and trade-offs.

Context awareness and tool orchestration are what separate an agent from a prompt template. A skill can ship intent alignment and explainability in plain markdown, but state across steps and the ability to chain tools require a runtime. That’s why Babich’s specs include Boundaries sections and “When Not To Use It” blocks: a stateful, tool-using program needs guardrails that a one-shot prompt does not.
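A toy runtime shows what a markdown file alone cannot do. In this sketch the tools, the fixed plan, and the `run_agent` function are all hypothetical; a real agent would have a model choose each step rather than follow a hard-coded list. What it illustrates is state carried across steps, tools chained together, a boundary that refuses anything outside the toolset, and a log that explains what happened.

```python
from typing import Callable


# Hypothetical tools; a real agent would call external services here.
def count_words(text: str) -> str:
    return str(len(text.split()))


def shout(text: str) -> str:
    return text.upper()


TOOLS: dict[str, Callable[[str], str]] = {
    "count_words": count_words,
    "shout": shout,
}


def run_agent(plan: list[str], user_input: str) -> dict:
    """Run tools in order, threading each result into the next step."""
    state = {"value": user_input, "log": []}
    for step in plan:
        if step not in TOOLS:  # boundary: refuse anything outside the toolset
            state["log"].append(f"refused: {step}")
            continue
        state["value"] = TOOLS[step](state["value"])
        state["log"].append(f"ran: {step}")
    return state


result = run_agent(["shout", "delete_files", "count_words"], "ship the small thing")
print(result["value"])  # 4
print(result["log"])    # ['ran: shout', 'refused: delete_files', 'ran: count_words']
```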

If you haven’t built one yet, his five specs—Research Synthesizer, Competitor Intelligence, Problem Definition, Idea Generation, UX Flow Designer—are a clean starter pack. Pick the one closest to a workflow you already do by hand, and notice how much of the spec is about what the agent will not do.

3D illustration of an orange robot head with a maze inside its open skull, glowing circuit lines extending outward to orange cube nodes.

Agentic Product Design

5 design tasks you can automate with AI today

uxplanet.org