175 posts tagged with “process”

March 10, 2026 at 8:00 PM

Notion built a prototype playground for their designers. It’s a single Next.js repo with shared styles and slash commands for deployment. The infrastructure is solid. The adoption question is harder.

Brian Lovin, talking to Claire Vo on How I AI:

It’s still a Next.js app. It’s still React and TypeScript and Git and branches and it’s just a lot of concepts to throw at someone who maybe is used to only prototyping in Figma or they’re intimidated by a terminal or code. And so I’m trying to figure out like how do we make this thing more approachable? How do we make it easier to onboard but also not dumbed down, right? I want people to learn how to use computers. I want people to even subconsciously absorb the ideas of git and branching and pull requests and merging.

“Make it easier but not dumbed down” is the tension every team building AI design tooling is going to hit. Lovin wants designers to actually learn Git, not just have it abstracted away. That’s a bet on long-term capability over short-term adoption. If Notion, with its engineering culture and resources, is still working through this, the rest of the industry has a longer road than the demos suggest.

But Lovin makes a sharp case for why the effort is worth it, especially for AI product design:

I don’t think you can design a good chat experience in Figma. You can design what the chat input looks like. You could design a little chat bubble and a send button and a dropdown for model picker. I think all that’s fine in Figma, but what you can’t design in Figma is what it actually will feel like to use that thing. I probably should have said this at the very beginning, but the reason Prototype Playground existed is because when I started working on Notion AI, I was literally designing conversations in Figma — the user’s going to say this, and then the AI is going to say this, and then it’s going to work perfectly and create a page or a database. You mock these golden paths in Figma and then the engineers go and they build it. And it just doesn’t work that way, right? You send a message, the AI gets stuck, or asks a follow-up question, or does the wrong thing and you need to correct it.

This is the strongest argument I’ve heard for code-first prototyping of AI features. Static mocks enforce golden-path thinking. Real models surface the messy middle: the weird follow-ups, the latency that changes how an interaction feels, the error states you’d never think to mock up.

And yet:

I still use Figma. I probably still spend 60 to 70% of my time in Figma. There’s just certain things that you’re making that don’t need to be in the browser. They don’t need to be coded up. You can just look at it and be like, “Yeah, that’s roughly right. We should just ship that.”

So even the person who built the Prototype Playground still spends most of his time in Figma. Figma isn’t dying just yet, but its scope is narrowing. For AI features specifically, though, Lovin’s case is hard to argue with: you need the real model running to know if the design works.

The interview gets most interesting when Lovin describes his operating philosophy for AI agents and how to get them to run longer:

My philosophy on this has been anytime the AI asks you to do something, you should, before responding, try your best to see if you could teach the AI to answer that question for itself. […] So, for example, I’ve taught Claude, “Hey, check your work. One, you can run commands like eslint, right? And like look for actual TypeScript errors.” The second is you can give it access to MCP tools. […] Before installing this, Claude would say to you, “Hey, I’ve implemented your feature. Go take a look at it and let me know what you think.” And remember, our rule is anytime Claude tells you to do something? Ask if you can teach it to do that thing for itself. So, I don’t want to have to look at the browser every time to see if I did it correctly. So, instead, I teach Claude, “Actually, you should be the one to go and open the browser.”

Every interruption from the AI breaks your flow state. That’s orchestration in practice: building infrastructure that lets the AI handle its own quality checks so you, the designer, stay in the flow of deciding what to build and whether it’s right.
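Lovin’s rule can be pictured as a gate the agent has to pass before it reports back. What follows is a minimal sketch, not his actual setup: `Check`, `CheckResult`, and `verifyBeforeReporting` are hypothetical names, and the three checks are stubs standing in for a linter, a type checker, and a browser check so the example runs without any tooling installed.

```typescript
// Hypothetical sketch of "check your own work before asking the human".
// All names here are illustrative; the checks are stubs, not real tool calls.

interface CheckResult {
  name: string;
  ok: boolean;
  detail: string;
}

// A check is any function that inspects the work and reports back.
type Check = () => CheckResult;

function verifyBeforeReporting(checks: Check[]): {
  readyForHuman: boolean;
  failures: CheckResult[];
} {
  const failures = checks
    .map((check) => check())
    .filter((result) => !result.ok);
  // Only hand off to the human once every self-check passes.
  return { readyForHuman: failures.length === 0, failures };
}

// Stubbed checks standing in for a linter, a type checker, and a browser check.
const lint: Check = () => ({ name: "lint", ok: true, detail: "no lint errors" });
const types: Check = () => ({ name: "types", ok: false, detail: "one type error (stubbed)" });
const browser: Check = () => ({ name: "browser", ok: true, detail: "page rendered" });

const report = verifyBeforeReporting([lint, types, browser]);
```

Here `report.readyForHuman` is false and `report.failures` holds only the stubbed type error, so the agent would go back and fix that instead of interrupting the designer.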

Lovin again:

You want your designs to encounter reality as early as possible. And if you imagine this gradient of like I’m scribbling on a napkin on one side to I’m shipping to production and showing customers on the other side, our goal as designers is to move up that gradient towards prod as quickly as possible. […] I just find that when you’re designing something in Figma and then you actually try it in the browser, in the browser you notice a ton of problems. All of a sudden you’re clicking things, you notice loading states, you notice “ah, that didn’t quite work on this screen size.”

Encounter reality as early as possible. That’s the whole argument in six words. There’s a lot more in this conversation, and it’s worth the full watch.

How Notion designers ship live prototypes in minutes | Brian Lovin (Product designer)

Brian Lovin is a designer at Notion AI who has transformed how the design team builds prototypes, by creating a shared code environment powered by Claude Code. Instead of designers working in isolated repositories or limited to static Figma designs, Brian built a collaborative “prototype…

youtube.com

March 10, 2026 at 3:00 PM

AI tools made designers faster. The question nobody’s answering is whether their organizations can keep up.

Cameron Worboys, head of product design at Cash App, talking to Michael Riddering on Dive Club:

I think the biggest blockers across all of the tech industry in the next 2 years will not be the speed of building. It’s going to be the operational side and being able to move something from like we have built this thing. How does it move through the operational cogs of product development in order to like get it live to customers? So my view is like how do we set ourselves up for the new world? You have to make sure that your organization is capable at running at the same speed as the AI tools. And these AI tools move fucking fast.

The bottleneck migrated. Building isn’t the constraint anymore. Getting what you’ve built through approvals, reviews, compliance, and deployment is. Cash App’s response has been radical: they’ve flattened to three management layers (they call it “core plus three”), deleted design crits, and are pushing every designer to ship production code.

Worboys on what quality actually looks like at this speed:

The quality piece, there’s a misconception that it comes from a designer sitting in some cave for 3 months and pontificating about the future of software. It literally doesn’t. It comes from reps and the speed which you can be wrong and the speed that you can go again and experiment and experiment and experiment. And I think that’s what we’ve seen change, is the amount designers can produce has exponentially increased and the amount of like bureaucracy and layers you need to run an organization has changed a lot as well.

Quality through iteration, not pontification. That’s always been true, but when each iteration takes minutes instead of days, the gap between teams that ship and teams that sit in review becomes enormous.

Worboys on where this leads:

I believe one of the primary ways which you will create lock-in in the new world is creating apps that feel completely one of one. […] When you think about the future of software development and where it’s going with generative UI, there is nothing in the future that’s going to prevent us from creating these completely one of one experiences. So that’s what is top of mind for me at the moment. And I do think we will get there relatively quickly, that every Cash App does feel unique and completely designed around the person. And then from a business perspective, it creates this deeper, harder to quantify emotional connection with a product that is the same as like your wardrobe. Clothes are by and large like an expression of personal identity.

This is the most concrete product bet I’ve seen on generative UI. Not widgets inside a chat window. Entire apps shaped around the individual. I still think core app chrome should stay stable. But Worboys is betting that consumer fintech is where that line starts to blur.

Cameron Worboys - Inside an AI-native design org

Today’s episode with Cameron Worboys (https://x.com/camworboys) (Head of Product Design at Cash App) is an inside look at how an AI-native design org operates and the ways designers can thrive in this new world.

youtube.com

Essays

March 4, 2026 at 4:00 PM 6 min read

A red-crowned crane soaring over misty mountain waterfalls in a Japanese ink-wash style illustration with pink-blossomed trees and teal rocky cliffs.

Spec-Driven Development: It Looks Like Waterfall (And I Feel Fine)

We’ve been talking a lot about agentic engineering, how software is now getting built with AI. As I look to see how design can complement this new development paradigm, a newish methodology called spec-driven development caught my eye. The idea is straightforward: you write a detailed specification first, then AI agents generate the code from it. The specification becomes the source of truth, not the code.

My first reaction when I started reading about SDD was: wait, isn’t this just waterfall?

Seriously. You gather requirements. You write them down in a structured document. You hand that document to someone (or something) that builds to spec. That’s the waterfall pattern. We spent two decades running away from it, and now it’s back wearing a blue Patagonia vest and calling itself a methodology.

March 3, 2026 at 10:00 PM

The design process isn’t dead. It’s changing. My belief is that the high-level steps are exactly the same, but where designers spend their time is being redistributed.

Jenny Wen, head of design for Claude at Anthropic (formerly at Figma), on Lenny’s Podcast:

This design process that designers have been taught, we sort of treat it as gospel. That’s basically dead. I think it was sort of dying before the age of AI, but given now that engineers can go off and spin off their seven Claudes, I think as designers, we really have to let go of that process.

It’s a strong headline. But Wen then describes her actual day-to-day, and it sounds familiar:

We are still prototyping stuff. I’m still mocking stuff up. I think it’s just I have a wider set of tools now, and I think the proportion of time I spend doing each thing just has changed.

So the process isn’t dead. The proportions shifted. Wen breaks it down:

A few years ago, 60 to 70% of it was mocking and prototyping, but now I feel the mocking up part of it is 30 to 40%. And then there’s that other 30 to 40% there that is now jamming and pairing directly with engineers. And then there’s a slice of it that is now implementation as well.

What’s missing from that breakdown is user research and discovery. Wen mentions having a researcher on the team, mentions reading studies and feedback, but those activities don’t factor into the breakdown at all. For a team building products where, by Wen’s own admission, “you can’t mock up all the states” and “you actually discover use cases as you see people using them,” you’d think research would be eating a larger share of the pie, not disappearing from the conversation entirely. In my day-to-day, the designers on my team spend 30–40% on discovery and flows. Maybe 40–50% on mockups and prototypes. We’re basically already at her breakdown.

There’s also a context problem. Wen’s “ship fast, iterate publicly, build trust through speed” approach makes sense for Anthropic. They’re building greenfield AI products where nobody knows the right interaction patterns yet. The models are non-deterministic. Labeling something a “research preview” and iterating in public is the right call when the design space is that undefined.

That approach gets harder with a product that has an established install base. When you’re updating features that millions of people depend on, “ship it and iterate” has real costs. Sonos learned this. So did Figma, whose product is mission-critical for its users, when it shipped UI3 and designers revolted. The stakes are higher still for an essential service like a CRM or operational software. The slow, unglamorous work of discovery and user testing exists because breaking what already works is expensive. Wen has the advantage of building greenfield: there’s no install base to protect. Not every team has that luxury.

The interview gets more interesting when Wen turns to hiring. She describes three archetypes: the “block-shaped” strong generalist who’s 80th percentile across multiple skills, the deep T-shaped specialist who’s in the top 10% of their area, and then a third she says the industry is overlooking:

My last one is probably the one that I think we’re all overlooking, which is what I call the crack new grad. It’s just somebody who’s early career and feels, like, wise and experienced beyond their years, but is also just very humble and very eager to learn. I think this person is really interesting right now because I think most companies are just hiring senior talent, folks that have done things before, are super experienced, but given how much the roles are changing and what we’re expected to do is changing, I think having somebody who almost has a blank slate, and is just a really quick learner and is really eager to learn new tactics and stuff like that, and doesn’t have all these baked in processes and rituals in their mind, that’s super valuable.

Wen’s “crack new grad” maps closely to the strategies I wrote for entry-level designers: build things, get comfortable with AI tools, be what Josh Silverman calls the “dangerous generalist.” Someone without baked-in rituals who learns fast and ships. That a design leader at a frontier lab is actively looking for this profile matters, because most of the industry is still filtering for ten years of experience.

The design process is dead. Here’s what’s replacing it. | Jenny Wen (head of design at Claude)

Jenny Wen leads design for Claude at Anthropic. Prior to this, she was Director of Design at Figma, where she led the teams behind FigJam and Slides. Before that, she was a designer at Dropbox, Square, and Shopify.

youtube.com

February 26, 2026 at 10:00 PM

The instinct when working with AI agents is to write more. More instructions, more constraints. Turns out that’s exactly wrong.

Addy Osmani, writing for O’Reilly, digs into the research:

Research has confirmed what many devs anecdotally saw: as you pile on more instructions or data into the prompt, the model’s performance in adhering to each one drops significantly. One study dubbed this the “curse of instructions”, showing that even GPT-4 and Claude struggle when asked to satisfy many requirements simultaneously. In practical terms, if you present 10 bullet points of detailed rules, the AI might obey the first few and start overlooking others.

So the answer is a smarter spec, not a longer one. Osmani pulls from GitHub’s analysis of over 2,500 agent configuration files and finds that effective specs cover six areas: commands, testing, project structure, code style, git workflow, and boundaries.

The boundaries piece is worth lingering on. Osmani recommends a three-tier system:

Always do: Actions the agent should take without asking. “Always run tests before commits.” “Always follow the naming conventions in the style guide.”
Ask first: Actions that require human approval. “Ask before modifying database schemas.” “Ask before adding new dependencies.”
Never do: Hard stops. “Never commit secrets or API keys.” “Never edit node_modules/ or vendor/.” “Never remove a failing test without explicit approval.”

That framing—always, ask first, never—gives the AI a decision framework instead of a wall of instructions. It maps to how you’d manage a person, too. Osmani quotes Simon Willison on the comparison: getting good results from a coding agent feels “uncomfortably close to managing a human intern.”
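To make the “decision framework” idea concrete, the three tiers can be written as data plus a lookup that an agent harness might consult before acting. This is a rough sketch, not Osmani’s implementation: the `Tier` type, the `BoundaryRule` shape, the `classify` function, and the naive substring matching are all assumptions for illustration. Only the rule descriptions come from the examples quoted above.

```typescript
// Hypothetical sketch: the always / ask first / never tiers as data plus a lookup.
// Tier, BoundaryRule, and classify are illustrative names, not from the article.

type Tier = "always" | "ask-first" | "never";

interface BoundaryRule {
  tier: Tier;
  // Keyword that identifies the kind of action (assumed, deliberately simplistic).
  match: string;
  description: string;
}

const rules: BoundaryRule[] = [
  { tier: "never", match: "node_modules/", description: "Never edit node_modules/ or vendor/." },
  { tier: "never", match: "vendor/", description: "Never edit node_modules/ or vendor/." },
  { tier: "ask-first", match: "schema", description: "Ask before modifying database schemas." },
  { tier: "ask-first", match: "add dependency", description: "Ask before adding new dependencies." },
  { tier: "always", match: "commit", description: "Always run tests before commits." },
];

// When several rules match, the strictest tier wins.
const strictness: Record<Tier, number> = { never: 2, "ask-first": 1, always: 0 };

function classify(action: string): BoundaryRule | undefined {
  const lowered = action.toLowerCase();
  const hits = rules.filter((rule) => lowered.includes(rule.match));
  hits.sort((a, b) => strictness[b.tier] - strictness[a.tier]);
  return hits[0]; // undefined means no rule covers this action
}
```

Under these assumptions, `classify("Commit a change to vendor/lib.js")` resolves to the “never” rule even though a “commit” rule also matches, which is the point of tiers: the agent gets an ordering to reason with, not ten equally weighted bullets.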

Klaassen’s compound engineering is one version of this. Osmani’s spec framework is another. The principle underneath both: teach fewer things well rather than everything at once.

Two humanoid robots inspect a giant iridescent aqua scroll unrolling from a metal roller in a sunlit hall.

How to Write a Good Spec for AI Agents

This post first appeared on Addy Osmani’s Elevate Substack newsletter and is being republished here with the author’s permission. TL;DR: Aim for a clear

oreilly.com

February 26, 2026 at 6:00 PM

Most people using AI to write code are still reviewing every line. Kieran Klaassen stopped doing that months ago.

Kieran Klaassen, CTO of Cora at Every, on Peter Yang’s channel. He calls his approach compound engineering:

AI can learn. If you invest time to have the AI learn what you like and learn what it does wrong, it won’t do it the next time. So that’s the seed for compound engineering. There are four steps: planning first, working—which is just doing the work from the plan—then assessing and reviewing, making sure the work that’s done is correct, and then taking the learnings from that process and codifying them. So the next time you create a plan, it’s there. It learned.

Plan, build, review, codify. Each cycle teaches the AI something it keeps. You hit a problem, you capture the fix, and that fix lives in your repo as documentation the AI reads next time. The learnings compound across sessions.
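The codify step is what makes the loop compound, and it can be sketched in a few lines. This is an illustration under assumptions, not Klaassen’s system: `Learning`, `LearningsLog`, and `plan` are hypothetical names, and the in-memory log stands in for documentation files that would live in a repo.

```typescript
// Hypothetical sketch of the codify step in a plan, work, review, codify loop.
// LearningsLog stands in for documentation files kept in the repo.

interface Learning {
  problem: string;
  fix: string;
}

class LearningsLog {
  private entries: Learning[] = [];

  codify(learning: Learning): void {
    // Skip problems already recorded so the log stays short.
    const known = this.entries.some((entry) => entry.problem === learning.problem);
    if (!known) this.entries.push(learning);
  }

  // Render the learnings as text that gets included in the next plan.
  asPlanContext(): string {
    return this.entries
      .map((entry) => `- When ${entry.problem}: ${entry.fix}`)
      .join("\n");
  }
}

function plan(task: string, log: LearningsLog): string {
  const context = log.asPlanContext();
  return context.length > 0
    ? `Task: ${task}\nKnown lessons:\n${context}`
    : `Task: ${task}`;
}

// First cycle: nothing learned yet. Second cycle: the plan carries the lesson.
const log = new LearningsLog();
const firstPlan = plan("Add a settings page", log);
log.codify({ problem: "forms lack error states", fix: "add inline validation messages" });
const secondPlan = plan("Add a billing page", log);
```

The second plan starts with the lesson from the first cycle already in it, which is the compounding: the fix is captured once and read every time after.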

The result: Klaassen says 100% of his code is now AI-written. He hasn’t opened Cursor in three months. But he’s not winging it. On what that trust actually requires:

It’s a little bit more of like, I trust you. I don’t need to look at all the code. I don’t need to read all the code, but I have systems and ways I work with AI that I trust, and through that I can let AI do things.

That trust is earned through the loop. Mistakes get caught, codified, and they don’t happen twice. Klaassen compares it to onboarding:

It’s similar to onboarding a person on your team. You need to get them on board, get them used to your code. But once that is done, you can let them go and really just run with it.

How to Make Claude Code Better Every Time You Use It (50 Min Tutorial) | Kieran Klaassen

Kieran is my favorite Claude Code power user and teacher. In our interview, he walked through his Compound Engineering system that makes Claude Code better every time you use it. This same system has been embraced by the Claude Code team and others. Kieran is like Morpheus introducing me to the matrix, so don’t miss this episode 🙂

youtube.com

February 25, 2026 at 6:00 PM

Why isn’t AI showing up in productivity data? Chetan Dube offers one answer in Fast Company: most companies are bolting AI onto existing roles instead of redesigning the work.

Most managers are using AI the same way they use any productivity tool: to move faster. It summarizes meetings, drafts responses, and clears small tasks off the plate. That helps, but it misses the real shift. The real change begins when AI stops assisting and starts acting. When systems resolve issues, trigger workflows, and make routine decisions without human involvement, the work itself changes. And when the work changes, the job has to change too.

McKinsey data backs this up—78% of organizations now use AI in at least one function, “though some are still applying it on top of existing roles rather than redesigning work around it.” That’s the Solow paradox in one sentence.

Dube’s lost luggage example is a good one:

Generative AI can explain what steps to take to recover a lost bag. Agentic AI aims to actually find the bag, reroute it, and deliver it. The person that was working in lost luggage, doing these easily automated tasks, can now be freed to become more of a concierge for these disgruntled passengers.

The job goes from processing to judgment. And if leaders don’t get ahead of it:

If leaders don’t redesign the job intentionally, it will be redesigned for them, by the technology, by urgent failures, and by the slow erosion of clarity inside their teams.

That slow erosion of clarity is already visible. People are less and less sure what they’re supposed to be doing because the tasks they were hired for are quietly handled by a system nobody put in charge.

Four-person open-plan desk with monitors, keyboards, office chairs and potted plants on a white oval amid colorful isometric cubes

If AI is doing the work, leaders need to redesign jobs

AI is taking a lot of work off of employees’ plates, but that doesn’t mean work has vanished. Now, there’s different work, and leaders need to craft jobs to match this new reality.

fastcompany.com

February 24, 2026 at 4:00 PM

The software development process has accumulated decades of ceremony. Boris Tane argues AI agents are collapsing the whole thing.

On engineers who started their careers after Cursor:

They don’t know what the software development lifecycle is. They don’t know what’s DevOps or what’s an SRE. Not because they’re bad engineers. Because they never needed it. They’ve never sat through sprint planning. They’ve never estimated story points. They’ve never waited three days for a PR review.

I read that and thought about design. How much of our process is ceremony too? The Figma-to-developer handoff. The pixel-perfect QA pass. The design review where six people debate border radius. If an AI agent can generate working UI from a design system in three prompts—which I’ve done—a lot of what we treat as process is friction we’ve institutionalized.

Tane’s conclusion:

The quality of what you build with agents is directly proportional to the quality of context you give them. Not the process. Not the ceremony. The context.

For engineering, context means specs, tests, architectural constraints. For design, it means your design system—the component docs and the rules for how things fit together. If that context is thin, the agent produces garbage. If it’s thorough and machine-readable, the output lands close to production-ready.

Tane again:

Requirements aren’t a phase anymore. They’re a byproduct of iteration.

Same for mockups. When you can generate and evaluate working UI faster than you can annotate a Figma frame, the mockup stops being a deliverable and becomes a sketch you might skip entirely. The design system becomes the spec. Context engineering becomes the job.

The Software Development Lifecycle Is Dead

AI agents didn’t make the SDLC faster. They killed it.

boristane.com

February 23, 2026 at 9:00 PM

I’ve been arguing that the designer’s job is shifting from execution to orchestration—directing AI agents rather than pushing pixels. I made that case from the design side. Addy Osmani just made it from engineering based on what he’s seeing.

Osmani draws a hard line between vibe coding and what he calls “agentic engineering.” On vibe coding:

Vibe coding means going with the vibes and not reviewing the code. That’s the defining characteristic. You prompt, you accept, you run it, you see if it works. If it doesn’t, you paste the error back and try again. You keep prompting. The human is a prompt DJ, not an engineer.

“Prompt DJ” is good. But Osmani’s description of the disciplined version is what caught my attention—it’s the same role I’ve been arguing designers need to grow into:

You’re orchestrating AI agents - coding assistants that can execute, test, and refine code - while you act as architect, reviewer, and decision-maker.

Osmani again:

AI didn’t cause the problem; skipping the design thinking did.

An engineer wrote that. The spec-first workflow Osmani describes is design process applied to code. Designers have been saying “define the problem before you jump to solutions” for decades. AI just made that advice load-bearing for engineers too.

The full piece goes deep on skill gaps, testing discipline, and evaluation frameworks—worth a complete read.

Agentic Engineering

Agentic Engineering is a disciplined approach to AI-assisted software development that emphasizes human oversight and engineering rigor, distinguishing it fr...

addyosmani.com

February 19, 2026 at 7:00 PM

Tommaso Nervegna, a Design Director at Accenture Song, gives one of the clearest practitioner accounts I’ve seen of what using Claude Code as a designer looks like day to day.

The guide is detailed—installation steps, terminal commands, deployment. This is essential reading for any designer interested in Claude Code. But for me, the interesting part isn’t the how-to. It’s his argument that raw AI coding tools aren’t enough without structure on top:

Claude Code is powerful, but without proper context engineering, it degrades as the conversation gets longer.

Anyone who’s used these tools seriously has experienced this. You start a session and the output is sharp. Forty minutes in, it’s forgotten your constraints and is hallucinating component names. Nervegna uses a meta-prompting framework called Get Shit Done that breaks work into phases with fresh contexts—research, planning, execution, verification—each getting its own 200K token window. No accumulated garbage.
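The “fresh context per phase” idea can be sketched as a pipeline where only a short handoff crosses each boundary. This is not how Get Shit Done is implemented; `Phase`, `PhaseContext`, and `runPipeline` are hypothetical names, and the working notes and summaries are placeholder strings standing in for real model output.

```typescript
// Hypothetical sketch: each phase starts with a fresh context and receives
// only a short handoff from the previous phase, never its full history.

type Phase = "research" | "planning" | "execution" | "verification";

interface PhaseContext {
  phase: Phase;
  messages: string[];
}

const phases: Phase[] = ["research", "planning", "execution", "verification"];

function runPipeline(brief: string): PhaseContext[] {
  const contexts: PhaseContext[] = [];
  let handoff = brief;
  for (const phase of phases) {
    // Fresh context: it contains the handoff and nothing else from earlier phases.
    const context: PhaseContext = { phase, messages: [handoff] };
    context.messages.push(`${phase}: working notes`); // stand-in for the real work
    contexts.push(context);
    handoff = `summary of ${phase}`; // only this line crosses into the next phase
  }
  return contexts;
}
```

In this sketch every context ends up the same size regardless of how many phases ran before it, which is the property that keeps a long session from degrading.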

The framework ends up looking a lot like good design process applied to AI:

Instead of immediately generating code, it asks:
“What happens when there’s no data to display?”
“Should this work on mobile?”
“What’s the error state look like?”
“How do users undo this action?”

Those are the questions a senior designer asks in a review. Nervegna calls it “spec-driven development,” but it’s really the discipline of defining the problem before jumping to solutions—something our profession has always preached and often ignored when deadlines hit.

Nervegna again:

This is spec-driven development, but the spec is generated through conversation, not written in Jira by a project manager.

The specification work that used to live in PRDs and handoff docs is happening conversationally now, between a designer and an AI agent. The designer’s value is in the questions asked before any code gets written.

Claude Code for Designers: A Practical Guide

A Step-by-Step Guide to Designing and Shipping with Claude Code

nervegna.substack.com

February 18, 2026 at 7:00 PM

Steve Yegge has been talking to nearly 40 people at Anthropic over the past four months. What he describes looks nothing like the feature factory world that NN/g catalogs. No 47-page alignment documents. No 14-meeting coordination cycles. Instead, campfires:

Everyone sits around a campfire together, and builds. The center of the campfire is a living prototype. There is no waterfall. There is no spec. There is a prototype that simply evolves, via group sculpting, into the final product: something that finally feels right. You know it when you finally find it.
As evidence of this, Anthropic, from what I’m told, does not produce an operating plan ahead more than 90 days, and that is their outermost planning cycle. They are vibing, on the shortest cycles and fastest feedback loops imaginable for their size.

No roadmap beyond 90 days. They group-sculpt a living prototype. Someone told Yegge that Claude Cowork shipped 10 days after the idea first came up. Ten days. A small team with real ownership, shipping at the speed the tools now allow.

Yegge argues this works partly because of a cultural requirement most companies would struggle with. He describes a three-person startup called SageOx that operates the same way:

A lot of engineers like to work in relative privacy, or even secrecy. They don’t want people to see all the false starts, struggles, etc. They just want people to see the finished product. It’s why we have git squash and send dignified PRs instead of streaming every compile error to our entire team.
But my SageOx friends Ajit and Ryan actually want the entire work stream to be public, because it’s incredibly valuable for forensics: figuring out exactly how and why a teammate, human or agent, got to a particular spot. It’s valuable because merging is a continuous activity and the forensics give the models the tools and context they need to merge intelligently.
So at SageOx they all see each other’s work all the time, and act on that info. It’s like the whole team is pair programming at once. They course-correct each other in real time.

Yegge calls this “the death of the ego.” Everyone sees your mistakes, your wrong turns, how fast you work. Nothing to hide. Most designers and engineers I know would be deeply uncomfortable with that. We like to polish before we share. We present finished comps, not the 13 variations we tried and abandoned.

But if the campfire model is where things are heading—and the speed advantage over the feature factory is hard to argue with—then the culture has to change before the process can. That’s the part nobody wants to talk about.

Five bees in goggles on a wooden stage assembling a glowing steampunk orb, surrounded by tools, blueprints, gears and theater seats

The Anthropic Hive Mind

As you’ve probably noticed, something is happening over at Anthropic. They are a spaceship that is beginning to take off.

steve-yegge.medium.com

February 12, 2026 at 6:00 PM

I recently spent some time moving my entire note-taking system from Notion to Obsidian because the latter runs on Markdown files, which are text files. Why? Because AI runs on text.

And that is also the argument from Patrick Morgan. Your notes, your documented processes, your collected examples of what “good” looks like—if those live in plain text, AI can actually work with them. If they live in your head, or scattered across tools that don’t export, they’re invisible.

There’s a difference between having a fleeting conversation and collaborating on an asset you both work on. When your thinking lives in plain text — especially Markdown — it becomes legible not just to you, but to an AI that can read across hundreds of files, notice patterns, and act at scale.

I like that he frames this as scaffolding rather than some elaborate knowledge management system. He’s honest about the PKM fatigue most of us share:

Personal knowledge management is far from a new concept. Honestly, it’s a topic I started to ignore because too many people were trying to sell me on yet another “life changing” system. Even when I tried to jump through the hoops, it was all just too much for me for too little return. But now that’s changed. With AI, the value is much greater and the barrier to entry much lower. I don’t need an elaborate system. I just need to get my thinking in text so I can share it with my AI.

This is the part that matters for designers. We externalize visual thinking all the time—moodboards, style tiles, component libraries. But we rarely externalize the reasoning behind those decisions in a format that’s portable and machine-readable. Why did we choose that pattern? What were we reacting against? What does “good” look like for this particular problem?

Morgan’s practical recommendation is dead simple: three markdown files. One for process, one for taste, one for raw thinking. That’s it.

This is how your private thinking becomes shared context.

The designers who start doing this now will have documented judgment that AI can actually use.

AI Runs on Text. So Should You.

Where human thinking and AI capability naturally meet

open.substack.com

February 12, 2026 at 3:00 PM

Many designers I’ve worked with want to get to screens as fast as possible. Open Figma, start laying things out, figure out the structure as they go. It works often enough that nobody questions it. But Daniel Rosenberg makes a case for why it shouldn’t be the default.

Rosenberg, writing for the Interaction Design Foundation, argues that the conceptual model—the objects users manipulate, the actions they perform, and the attributes they change—should be designed before anyone touches a screen:

Even before you sketch your first screen it is beneficial to develop a designer’s conceptual model and use it as the baseline for guiding all future interaction design decisions.

Rosenberg maps this to natural language. Objects are nouns. Actions are verbs. Attributes are adjectives. The way these elements relate to each other is the grammar of your interface. Get the grammar wrong and no amount of visual polish will save you.

His example is painfully simple. A tax e-sign system asked him to “ENTER a PIN” when he’d never used the system before. There was no PIN to enter. The action should have been “CREATE.” One wrong verb and a UX expert with 40 years of experience couldn’t complete the task. His accountant confirmed that dozens of clients had called thinking the system was broken.
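One way to make that grammar concrete is to write down the object, its attributes, and the verbs that are valid in each state. The following is a minimal sketch, not something from Rosenberg's article: the names `Pin`, `valid_actions`, and `button_label` are hypothetical, and the "reset" verb is an assumption added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pin:
    """The object (noun). `exists` is an attribute describing its state."""
    exists: bool = False


def valid_actions(pin: Pin) -> list[str]:
    """Return the verbs that make sense for the object's current state."""
    if pin.exists:
        # Hypothetical verbs for a PIN that has already been set up.
        return ["enter", "reset"]
    # A first-time user has nothing to enter yet.
    return ["create"]


def button_label(pin: Pin) -> str:
    # The first valid verb becomes the primary call to action.
    return f"{valid_actions(pin)[0].upper()} a PIN"


print(button_label(Pin(exists=False)))  # CREATE a PIN
print(button_label(Pin(exists=True)))   # ENTER a PIN
```

Writing the states down first forces the verb question to be answered before any screen exists, which is the order Rosenberg argues for.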

Rosenberg on why this cascades:

A suboptimal decision on any lower layer will cascade through all the layers above. This is why designing the conceptual model grammar with the lowest cognitive complexity at the very start… is so powerful.

This is the part I want my team to internalize. When you jump straight to screens, you’re making grammar decisions implicitly—choosing verbs for buttons, deciding which objects to surface, grouping attributes in panels. You’re doing conceptual modeling whether you know it or not. The question is whether you’re doing it deliberately.

The MAGIC of Semantic Interaction Design

Blame the user: me, a UX expert with more than 40 years of experience, who has designed more than 100 successful commercial products and evaluated the inadequate designs of nearly 1,000 more.

interaction-design.org

February 11, 2026 at 9:00 PM

Everyone wants to talk about the AI use case. Nobody wants to talk about the work that makes the use case possible.

Erika Flowers, who led NASA’s AI readiness initiative, has a great metaphor for this on the Invisible Machines podcast. Her family builds houses, and before they could install a high-tech steel roof, they spent a week building scaffolding, setting up tarps, rigging safety harnesses, positioning dumpsters for debris. The scaffolding wasn’t the job. But without it, the job couldn’t happen.

Flowers on where most organizations are with AI right now:

We are trying to just climb up on these roofs with our most high tech pneumatic nail gun and we got all these tools and stuff and we haven’t clipped off to our belay gear. We don’t have the scaffolding set up. We don’t have the tarps and the dumpsters to catch all the debris. We just want to get up there. That is the state of AI and transformation.

The scaffolding is the boring stuff: data integration, governance, connected workflows, organizational readiness. It’s context engineering at the enterprise level. Before any AI feature can do real work, someone has to make sure it has the right data, the right permissions, and the right place in a process. Nobody wants to fund that part.

But Flowers goes further. She argues we’re not just skipping the scaffolding—we’re automating the wrong things entirely. Her example: accounting software uses AI to help you build a spreadsheet faster, then you email it to someone who extracts the one number they actually needed. Why not just ask the AI for the number? We’re using new technology to speed up old workflows instead of asking whether the workflow should exist at all.

Then she gets to the interesting question—who’s supposed to design all of this?

I don’t think it exists necessarily with the roles that we have. It’s going to be a lot closer to Hollywood… producer, director, screenwriter. And I don’t mean as metaphors, I mean literally those people and how they think and how they do it because we’re in a post software era.

She lists therapists, psychologists, wedding planners, dance choreographers. People who know how to choreograph human interactions without predetermined inputs. That’s a different skill set than designing screens, and I think she’s onto something.

Why AI Scaffolding Matters More than Use Cases ft Erika Flowers

We’re in a moment when organizations are approaching agentic AI backwards, chasing flashy use cases instead of building the scaffolding that makes AI agents actually work at scale. Erika Flowers, who led NASA’s AI Readiness Initiative and has advised Meta, Google, Netflix, and Intuit, joins Robb and Josh for a frank and funny conversation about what’s broken in enterprise AI adoption. She dismantles the myth of the “big sexy AI use case” and explains why most AI projects fail before they start. The trio makes the case that we’re entering a post-software world, whether organizations are ready or not.

Chapters:
0:09 - NASA AI Readiness Explained | Erika Flowers on Agentic AI & Runtimes
1:48 - Why the “Big Sexy AI Use Case” Is a Lie
2:42 - AI Didn’t Start with ChatGPT: What NASA Has Been Doing for 30 Years
4:24 - Why AI Runtimes Matter More Than Any Single Use Case
5:21 - The Hidden AI Problem: Legacy Data, Silos & Organizational Reality
7:13 - The Boring AI That Actually Works (And Why Enterprises Ignore It)
8:10 - The AI Arms Race Nobody Understands
9:22 - AI Scaffolding Explained: The Metaphor Every Leader Needs to Hear
12:12 - AI Readiness Is Cultural Change, Not Just Technology
14:38 - From Parking Lots to Companies: How Simple AI Agents Quietly Scale
17:01 - Why Most AI Features Feel Useless in Real Products
19:08 - Stop Automating Spreadsheets: Ask AI the Question Instead
25:06 - The Post-Software Era: Why Designers Aren’t Enough Anymore
28:33 - UI Is a Medium: How AI Will Absorb Interfaces Entirely
46:24 - Infinite Content, Human Creativity, and the Future After AI

Listen and check out Erika’s podcast, “Flower Power Hour”: https://open.spotify.com/show/15BTSl9fWiH3QTmVAYj6Fd
Learn more about Erika at www.helloerikaflowers.com/

youtu.be

February 11, 2026 at 3:00 PM

Daniel Miessler pulls an idea from a recent Karpathy interview that’s been rattling around in my head since I read it:

Humans collapse during the course of their lives. Children haven’t overfit yet. They will say stuff that will shock you because they’re not yet collapsed. But we [adults] are collapsed. We end up revisiting the same thoughts, we end up saying more and more of the same stuff, the learning rates go down, the collapse continues to get worse, and then everything deteriorates.

Miessler’s description of what this looks like in practice is uncomfortable:

How many older people do you know who tell the same stories and jokes over and over? Watch the same shows. Listen to the same five bands, and then eventually two. Their aperture slowly shrinks until they die.

I’ve seen this in designers. The ones who peaked early and never pushed past what worked for them. Their work from five years ago looks exactly like their work today. Same layouts, same patterns, same instincts applied to every problem regardless of context. They collapsed and didn’t notice.

Then Miessler, almost in passing:

This was a problem before AI. And now many are delegating even more of their thinking to a system that learns by crunching mediocrity from the internet. I can see things getting significantly worse.

If collapse is what happens when you stop seeking new inputs, then outsourcing your thinking to AI is collapse on fast-forward. You’re not building pattern recognition, you’re borrowing someone else’s average. The outputs look competent. They pass a first glance. But nothing in there surprises anyone, because the model optimizes for the most statistically probable next token.

Use AI to accelerate execution, not to replace the part where you actually have an idea.

Childhood → reading/exposure/tools/comedy → Renewal → Sustained Vitality. Side: Adult Collapse (danger: low entropy, repetition).

Humans Need Entropy

On Karpathy

danielmiessler.com

February 10, 2026 at 7:00 PM

There’s a version of product thinking that lives in frameworks and planning docs. And then there’s the version that shows up when someone looks at a screen and immediately knows something is off. That second version (call it product sense, taste, or judgment) comes from doing the work, not reading about it.

Peter Yang, writing in his Behind the Craft newsletter, shares 25 product beliefs from a decade at Roblox, Reddit, Amazon, and Meta. The whole list is worth reading, but a few items stood out.

On actually using your own product:

I estimate that less than 10% of PMs actually dogfood their product on a weekly basis. Use your product like a first-time user and write a friction log of how annoying the experience is. Nobody is too senior to test their own shit.

Ten percent. If that number is even close to accurate, it’s damning. You can’t develop good product judgment if you’re not paying attention to the thing you ship. And this applies to designers just as much as PMs.

Yang again, on where that judgment actually shows up:

Default states, edge cases, and good copy — these details are what separates a great product from slop. It doesn’t matter how senior you are, you have to give a damn about the tiniest details to ship something that you can be proud of.

Knowing that default states matter, knowing which edge cases to care about, knowing when copy is doing too much or too little—you can’t learn that from a framework. That’s pattern recognition from years of seeing what good looks like and what falls apart.

And on what qualifies someone to do this work:

Nobody cares about your FAANG pedigree or AI product certificate. Hire high agency people who have built great side projects or demonstrated proof of work. The only credential that matters is what you’ve shipped and your ideas to improve the product.

Reps and shipped work, not reading and credentials. The people who’ve done the reps are the ones who can see the details everyone else misses.

Person with glasses centered, hands clasped; red text reads "10 years of PM lessons in 12 minutes"; logos for Meta, Amazon, Reddit, Roblox.

25 Things I Believe In to Build Great Products

What I believe in is often the opposite of how big companies like to work

creatoreconomy.so

February 6, 2026 at 12:00 AM

Earlier this week I linked to Gale Robins’ argument that AI makes execution cheap but doesn’t help you decide what to build. Christina Wodtke is making the same case from the research side.

Wodtke opens with a designer who spent two weeks vibe-coding a gratitude journaling app. Beautiful interface, confetti animations, gentle notifications. Then she showed it to users. “I don’t really journal,” said the first one. “Gratitude journaling felt performative,” said the second. Two weeks building the wrong thing. Wodtke’s diagnosis:

That satisfaction is a trap. You’re accumulating artifacts that may have nothing to do with what anyone needs.

Wodtke draws a line between need-finding and validation that I think a lot of teams blur. Skipping the first and jumping to the second means you’re testing your guess, not understanding the problem:

Need-finding happens before you have a solution. You’re listening to people describe their lives, their frustrations, their workarounds. You’re hunting for problems that actually exist—problems people care enough about that they’re already trying to solve them with spreadsheets and sticky notes and whatever else they’ve cobbled together.

Wodtke’s version of fast looks different from what you’d expect:

The actual fast path is unsexy: sit down with five to ten people. Ask them about their lives. Shut up and listen. Use those three magic words—“tell me more”—every time something interesting surfaces. Don’t show them anything. Don’t pitch. Just listen.

“You’ll build less. It’ll be the right thing.” When building is cheap, the bottleneck moves upstream to judgment, knowing what to build. That judgment comes from listening, not prompting.


Vibe-Coding Is Not Need-Finding

Last month a product designer showed me her new prototype. She’d spent two weeks vibe-coding a tool for tracking “gratitude journaling streaks.” The interface was beautiful. Confe…

eleganthack.com

February 4, 2026 at 9:00 PM

Every team I’ve ever led has had one of these people. The person who writes the doc that gives the project its shape, who closes context gaps in one-on-ones before they turn into conflicts, who somehow keeps six workstreams from drifting apart. They rarely get the credit they deserve because the work, when it’s done well, looks like it just happened on its own.

Hardik Pandya writes about this on his blog. He shares a quote from a founder friend describing his most valuable employee:

“She’s the reason things actually work around here. She just… makes sure everything happens. She writes the docs. She runs the meetings that matter. She talks to people. Somehow everything she touches stays on track. I don’t know how I’d even describe what she does to a person outside the company. But if she left, we’d fall apart in a month. Maybe less.”

I’ve known people like this at every company I’ve worked at. And I’ve watched them get passed over because the performance system couldn’t see them. Pandya nails why:

When a project succeeds, credit flows to the people whose contributions are easy to describe. The person who presented to the board. The person whose name is on the launch email. The person who shipped the final feature. These contributions are real, I’m not diminishing them. But they’re not more real than the work that made them possible. They’re just easier to point at.

Most organizations try to fix this by telling the invisible workers to “be more visible”—present more, build your personal brand internally. Pandya’s advice goes the other direction, and I think he’s right:

If you’re good at the invisible work, the first move isn’t to get better at visibility. It’s to find the leader who doesn’t need you to be visible.

As a leader, I take this as a challenge. If someone on my team is doing the work that holds everything together, it’s my job to make sure the organization sees it too—especially when it doesn’t announce itself.

The Invisible Work

The coordination work that holds projects together disappears the moment it works. On the unfairness of recognition and finding leaders who see it anyway.

hvpandya.com

January 26, 2026 at 9:00 PM

When I managed over 40 creatives at a digital agency, the hardest part wasn’t the work itself—it was resource allocation. Who’s got bandwidth? Who’s blocked waiting on feedback? Who’s deep in something and shouldn’t be interrupted? You learn to think of your team not as individuals you assign tasks to, but as capacity you orchestrate.

I was reminded of that when I read about Boris Cherny’s approach to Claude Code. Cherny is a Staff Engineer at Anthropic who helped build Claude Code. Karo Zieminski, writing in her Product with Attitude Substack, breaks down how Cherny actually uses his own tool:

He keeps ~10–15 concurrent Claude Code sessions alive: 5 in terminal (tabbed, numbered, with OS notifications). 5–10 in the browser. Plus mobile sessions he starts in the morning and checks in on later. He hands off sessions between environments and sometimes teleports them back and forth.

Zieminski’s analysis is sharp:

Boris doesn’t see AI as a tool you use, but as a capacity you schedule. He’s distributing cognition like compute: allocate it, queue it, keep it hot, switch contexts only when value is ready. The bottleneck isn’t generation; it’s attention allocation.

Most people treat AI assistants like a single very smart coworker. You give it a task, wait for the answer, evaluate, iterate. Cherny treats Claude like a team—multiple parallel workers, each holding different context, each making progress while he’s focused elsewhere.

Zieminski again:

Each session is a separate worker with its own context, not a single assistant that must hold everything. The “fleet” approach is basically: don’t make one brain do all jobs; run many partial brains.

I’ve been using Claude Code for months, but mostly one session at a time. Reading this, I realize I’ve been thinking too small. The parallel session model is about working efficiently. Start a research task in one session, let it run while you code in another, check back when it’s ready.

Looks like the new skill on the block is orchestration.

How Boris Cherny Uses Claude Code

An in-depth analysis of how Boris Cherny, creator of Claude Code, uses it — and what it reveals about AI agents, responsibility, and product thinking.

open.substack.com

January 22, 2026 at 3:00 PM

If design’s value isn’t execution—and AI is making that argument harder to resist—then what is it? Dan Ramsden offers a framework I find useful.

He breaks thinking into three types: deduction (drawing conclusions from data), induction (building predictions from patterns), and abduction—generating something new. Design’s unique contribution is abductive thinking:

When we use deduction, we discover users dropping off during a registration flow. Induction might tell us why. Abduction would help us imagine new flows to fix it.

Product managers excel at sense-making (aka “Why?”). Engineers build the thing. Design makes the difference—moving from “what is” to “what could be.”

On AI and the temptation to retreat to “creativity” or “taste” as design’s moat, Ramsden is skeptical:

Some might argue that it comes down to “taste”. I don’t think that’s quite right — taste without a rationale is just an opinion. I think designers are describers.

I appreciate that distinction. Taste without rationale is just preference. Design’s value is translating ideas through increasing levels of fidelity—from sketch to prototype to tested solution—validating along the way.

His definition of design in a product context:

Design is a set of structured processes to translate intent into experiments.

That’s a working definition I can use. It positions design not as the source of ideas (those can come from anywhere, including AI), but as the discipline that manages ideas through validation. The value isn’t in generating the concept—it’s in making it real while managing risk.

Two overlapping blue circles: left text "Making sense to generate a problem"; right text "Making a difference to generate value".

The value of Design in a product organisation

Clickbait opening: There’s no such thing as Product Design

medium.com

January 19, 2026 at 3:00 PM

I’ve spent a lot of my product design career pushing for metrics: proving ROI, showing impact, making the case for design in business terms. But I’ve also seen how metrics become the goal rather than a signal pointing toward the goal. When the number goes up, we celebrate. When it doesn’t, we tweak the collection process. Meanwhile, the user becomes secondary. Last week’s big idea was around metrics, and this piece piles on.

Pavel Samsonov calls this out:

Managers can only justify their place in value chains by inventing metrics for those they manage to make it look like they are managing.

I’ve sat in meetings where we debated which numbers to report to leadership—not which work to prioritize for users. The metrics become theater. So-called “vanity metrics” that always go up and to the right.

But here’s where Pavel goes somewhere unexpected. He doesn’t let designers off the hook either:

Defining success by a metric of beauty offers a useful kind of vagueness, one that NDS seems to hide behind despite the slow loading times or unnavigability that seem to define their output; you can argue with slow loading times or difficulty finding a form, but you cannot meaningfully argue with “beautiful.”

“Taste” and “beauty” are just another avoidance strategy. That’s a direct challenge to the design discourse that’s been dominant lately—the return to craft, the elevation of aesthetic judgment. Pavel’s saying it’s the same disease, different symptom. Both metrics obsession and taste obsession are ways to avoid the ambiguity of actually defining user success.

So what’s the alternative? Pavel again:

Fundamentally, the work of design is intentionally improving conditions under uncertainty. The process necessarily involves a lot of arguments over the definition and parameters of “improvement”, but the primary barrier to better is definitely not how long it takes to make artifacts.

The work is the argument. The work is facing the ambiguity rather than hiding behind numbers or aesthetics. Neither Figma velocity nor visual polish is a substitute for the uncomfortable conversation about what “better” actually means for the people using your product.

Bold "Product Picnic" text over a black-and-white rolling hill and cloudy sky, with a large outlined "50" on the right.

Your metrics are an avoidance strategy

Being able to quantify outcomes doesn’t make them meaningful. Moving past artificial metrics requires building shared intention with colleagues.

productpicnic.beehiiv.com

January 15, 2026 at 9:00 PM

It’s January and by now millions of us have made resolutions and probably broken them already. The second Friday of January is known as “Quitter’s Day.”

OKRs (objectives and key results) are a method businesses use to set and align company goals. The objective is the goal; the key results are the measurable outcomes that tell you whether you’re getting there. Venture capitalist John Doerr learned about OKRs while working at Intel, brought them to Google, and later became the framework’s leading evangelist.

Christina Wodtke talks about how to use OKRs for your personal life, and maybe as a way to come up with better New Year’s resolutions. She looked at her past three years of personal OKRs:

Looking at the pattern laid out in front of me, I finally saw what I’d been missing. My problem wasn’t work-life balance. My problem was that I didn’t like the kind of work I was doing.
The key results kept failing because the objective was wrong. It wasn’t about balance. It was about joy.
This is the second thing key results do for you: when they consistently fail, they’re telling you something. Not that you lack discipline—that you might be chasing the wrong goal entirely.

And I love Wodtke’s line here: “New Year’s resolutions fail because they’re wishes, not plans.” She continues:

They fail because “eat better” and “be healthier” and “find balance” are too vague to act on and too fuzzy to measure.
Key results fix this. Not because measurement is magic, but because the act of measuring forces clarity. It makes you confront what you actually want. And sometimes, when the data piles up, it reveals that what you wanted wasn’t the thing you needed at all.

Your Resolution Isn’t the Problem. Your Measurement Is.

It’s January, and millions of people have made the same resolution: “Eat better.” By February, most will have abandoned it. Not because they lack willpower or discipline. Because …

eleganthack.com

January 15, 2026 at 6:00 PM

Building on our earlier link about measuring the impact of features, how can we keep track of the overall health of the product? That’s where a North Star Metric comes in.

Julia Sholtz writes an introduction to North Star Metrics on the blog of analytics provider Amplitude:

Your North Star Metric should be the key measure of success for your company’s product team. It defines the relationship between the customer problems your product team is trying to solve and the revenue you aim to generate by doing so.

How is it done? The first step is to figure out which “game” your business is playing, that is, how it engages with customers:

The Attention Game: How much time are your customers willing to spend in your product?
The Transaction Game: How many transactions does your user make on your platform?
The Productivity Game: How efficiently and effectively can someone get their work done in your product?

They have a whole resource section on this topic that’s worth exploring.

Every Product Needs a North Star Metric: Here’s How to Find Yours

Get an introduction to product strategy with examples of North Star Metrics across industries.

amplitude.com

January 15, 2026 at 3:00 PM

How do we know what we designed is working as intended? We measure. Vitaly Friedman shares something called the TARS framework to measure the impact of features.

We need UX metrics to understand and improve user experience. What I love most about TARS is that it’s a neat way to connect customers’ usage and customers’ experience with relevant product metrics.

Here’s TARS in a nutshell:

Target Audience (%): Measures the percentage of all product users who have the specific problem that a feature aims to solve.
Adoption (%): Tracks the percentage of the target audience that successfully and meaningfully engages with the feature.
Retention (%): Assesses how many users who adopted the feature continue to use it repeatedly over time.
Satisfaction Score (CES, or Customer Effort Score): Gauges the level of satisfaction, specifically how easy it was for retained users to solve their problem after using the feature.

Friedman has more details in the article, including how to use TARS to measure how well a feature is performing for your intended target audience.
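To see how the four measures chain together, here is a rough sketch in Python. It is not Friedman's implementation: the `TarsInput` fields, the `tars` function, and the sample numbers are all made up for illustration, and it assumes the satisfaction score is a plain average of survey answers on whatever scale your CES survey uses.

```python
from dataclasses import dataclass


@dataclass
class TarsInput:
    total_users: int            # all product users in the period
    target_users: int           # users who have the problem the feature solves
    adopters: int               # target users who meaningfully used the feature
    retained: int               # adopters who kept using it over time
    effort_scores: list[float]  # CES survey answers from retained users


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, guarding against zero."""
    return round(100 * part / whole, 1) if whole else 0.0


def tars(data: TarsInput) -> dict:
    # Each percentage uses the previous stage as its denominator.
    scores = data.effort_scores
    return {
        "target_audience_pct": pct(data.target_users, data.total_users),
        "adoption_pct": pct(data.adopters, data.target_users),
        "retention_pct": pct(data.retained, data.adopters),
        "satisfaction_ces": round(sum(scores) / len(scores), 2) if scores else None,
    }


# Hypothetical numbers: 10,000 users, 4,000 with the problem,
# 1,000 adopters, 600 retained, four survey responses.
print(tars(TarsInput(10_000, 4_000, 1_000, 600, [6, 5, 7, 6])))
```

With these sample inputs the target audience is 40% of users, adoption is 25% of the target audience, retention is 60% of adopters, and the average effort score is 6.0. The point of the funnel shape is that a feature with low overall usage can still be performing well if its target audience is small.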

How To Measure The Impact Of Features

Meet TARS: a simple, repeatable, and meaningful UX metric designed specifically to track the performance of product features.

smashingmagazine.com

January 14, 2026 at 3:00 PM

Here is a good reminder from B. Prendergast to “stop asking users what they want—and start watching what they do.”

Asking people what they want is one of the most natural instincts in product work. Surveys, interviews, and feature wish lists feel accessible, social, and collaborative. They open channels to understand and empathise with the user base. They help teams feel closer to the people they serve. For teams under pressure, a stack of opinions can feel like solid data.
But this breaks when we compare what users say to what they actually do (say-do gap).
We all want to present ourselves a certain way. We want to seem more competent than confused (social desirability bias). Our memories can be fuzzy, especially about routine tasks (recall bias). Standards for what feels “easy” or “intuitive” can vary wildly between people (reference bias).

And of course, as soon as we start to ask users to imagine what they’d want, they’ll solve based on their personal experiences—which might be the right solution for them, but might not be for other users in the same situation.

Prendergast goes on to suggest “watch what people do, measure what matters, and use what they say to add context.” This approach involves watching user interactions, analyzing real behaviors through analytics, and treating feature requests as signals of underlying problems to uncover genuine needs. Prioritizing decisions based on observed patterns and desired outcomes leads to more effective solutions than relying on user opinions alone.

Stop asking users what they want — and start watching what they do.

People’s opinions about themselves and the things they use rarely match real behaviour.

renderghost.leaflet.pub