162 posts tagged with “process”

Everyone wants to talk about the AI use case. Nobody wants to talk about the work that makes the use case possible.

Erika Flowers, who led NASA’s AI readiness initiative, has a great metaphor for this on the Invisible Machines podcast. Her family builds houses, and before they could install a high-tech steel roof, they spent a week building scaffolding, setting up tarps, rigging safety harnesses, positioning dumpsters for debris. The scaffolding wasn’t the job. But without it, the job couldn’t happen.

Flowers on where most organizations are with AI right now:

We are trying to just climb up on these roofs with our most high tech pneumatic nail gun and we got all these tools and stuff and we haven’t clipped off to our belay gear. We don’t have the scaffolding set up. We don’t have the tarps and the dumpsters to catch all the debris. We just want to get up there. That is the state of AI and transformation.

The scaffolding is the boring stuff: data integration, governance, connected workflows, organizational readiness. It’s context engineering at the enterprise level. Before any AI feature can do real work, someone has to make sure it has the right data, the right permissions, and the right place in a process. Nobody wants to fund that part.

But Flowers goes further. She argues we’re not just skipping the scaffolding—we’re automating the wrong things entirely. Her example: accounting software uses AI to help you build a spreadsheet faster, then you email it to someone who extracts the one number they actually needed. Why not just ask the AI for the number? We’re using new technology to speed up old workflows instead of asking whether the workflow should exist at all.

Then she gets to the interesting question—who’s supposed to design all of this?

I don’t think it exists necessarily with the roles that we have. It’s going to be a lot closer to Hollywood… producer, director, screenwriter. And I don’t mean as metaphors, I mean literally those people and how they think and how they do it because we’re in a post software era.

She lists therapists, psychologists, wedding planners, dance choreographers. People who know how to choreograph human interactions without predetermined inputs. That’s a different skill set than designing screens, and I think she’s onto something.

Why AI Scaffolding Matters More than Use Cases ft Erika Flowers

We’re in a moment when organizations are approaching agentic AI backwards, chasing flashy use cases instead of building the scaffolding that makes AI agents actually work at scale. Erika Flowers, who led NASA’s AI Readiness Initiative and has advised Meta, Google, Netflix, and Intuit, joins Robb and Josh for a frank and funny conversation about what’s broken in enterprise AI adoption. She dismantles the myth of the “big sexy AI use case” and explains why most AI projects fail before they start. The trio makes the case that we’re entering a post-software world, whether organizations are ready or not.

Chapters

  • 0:09 - NASA AI Readiness Explained | Erika Flowers on Agentic AI & Runtimes
  • 1:48 - Why the “Big Sexy AI Use Case” Is a Lie
  • 2:42 - AI Didn’t Start with ChatGPT: What NASA Has Been Doing for 30 Years
  • 4:24 - Why AI Runtimes Matter More Than Any Single Use Case
  • 5:21 - The Hidden AI Problem: Legacy Data, Silos & Organizational Reality
  • 7:13 - The Boring AI That Actually Works (And Why Enterprises Ignore It)
  • 8:10 - The AI Arms Race Nobody Understands
  • 9:22 - AI Scaffolding Explained: The Metaphor Every Leader Needs to Hear
  • 12:12 - AI Readiness Is Cultural Change, Not Just Technology
  • 14:38 - From Parking Lots to Companies: How Simple AI Agents Quietly Scale
  • 17:01 - Why Most AI Features Feel Useless in Real Products
  • 19:08 - Stop Automating Spreadsheets: Ask AI the Question Instead
  • 25:06 - The Post-Software Era: Why Designers Aren’t Enough Anymore
  • 28:33 - UI Is a Medium: How AI Will Absorb Interfaces Entirely
  • 46:24 - Infinite Content, Human Creativity, and the Future After AI

Listen and check out Erika’s podcast, “Flower Power Hour”: https://open.spotify.com/show/15BTSl9fWiH3QTmVAYj6Fd
Learn more about Erika at www.helloerikaflowers.com/

youtu.be

Daniel Miessler pulls an idea from a recent Karpathy interview that’s been rattling around in my head since I read it:

Humans collapse during the course of their lives. Children haven’t overfit yet. They will say stuff that will shock you because they’re not yet collapsed. But we [adults] are collapsed. We end up revisiting the same thoughts, we end up saying more and more of the same stuff, the learning rates go down, the collapse continues to get worse, and then everything deteriorates.

Miessler’s description of what this looks like in practice is uncomfortable:

How many older people do you know who tell the same stories and jokes over and over? Watch the same shows. Listen to the same five bands, and then eventually two. Their aperture slowly shrinks until they die.

I’ve seen this in designers. The ones who peaked early and never pushed past what worked for them. Their work from five years ago looks exactly like their work today. Same layouts, same patterns, same instincts applied to every problem regardless of context. They collapsed and didn’t notice.

Then Miessler, almost in passing:

This was a problem before AI. And now many are delegating even more of their thinking to a system that learns by crunching mediocrity from the internet. I can see things getting significantly worse.

If collapse is what happens when you stop seeking new inputs, then outsourcing your thinking to AI is collapse on fast-forward. You’re not building pattern recognition, you’re borrowing someone else’s average. The outputs look competent. They pass a first glance. But nothing in there surprises anyone, because the model optimizes for the most statistically probable next token.

Use AI to accelerate execution, not to replace the part where you actually have an idea.

Childhood → reading/exposure/tools/comedy → Renewal → Sustained Vitality. Side: Adult Collapse (danger: low entropy, repetition).

Humans Need Entropy

On Karpathy

danielmiessler.com

There’s a version of product thinking that lives in frameworks and planning docs. And then there’s the version that shows up when someone looks at a screen and immediately knows something is off. That second version (call it product sense, call it taste or judgment) comes from doing the work, not reading about it.

Peter Yang, writing in his Behind the Craft newsletter, shares 25 product beliefs from a decade at Roblox, Reddit, Amazon, and Meta. The whole list is worth reading, but a few items stood out.

On actually using your own product:

I estimate that less than 10% of PMs actually dogfood their product on a weekly basis. Use your product like a first-time user and write a friction log of how annoying the experience is. Nobody is too senior to test their own shit.

Ten percent. If that number is even close to accurate, it’s damning. You can’t develop good product judgment if you’re not paying attention to the thing you ship. And this applies to designers just as much as PMs.

Yang again, on where that judgment actually shows up:

Default states, edge cases, and good copy — these details are what separates a great product from slop. It doesn’t matter how senior you are, you have to give a damn about the tiniest details to ship something that you can be proud of.

Knowing that default states matter, knowing which edge cases to care about, knowing when copy is doing too much or too little—you can’t learn that from a framework. That’s pattern recognition from years of seeing what good looks like and what falls apart.

And on what qualifies someone to do this work:

Nobody cares about your FAANG pedigree or AI product certificate. Hire high agency people who have built great side projects or demonstrated proof of work. The only credential that matters is what you’ve shipped and your ideas to improve the product.

Reps and shipped work, not reading and credentials. The people who’ve done the reps are the ones who can see the details everyone else misses.

Person with glasses centered, hands clasped; red text reads "10 years of PM lessons in 12 minutes"; logos for Meta, Amazon, Reddit, Roblox.

25 Things I Believe In to Build Great Products

What I believe in is often the opposite of how big companies like to work

creatoreconomy.so

Earlier this week I linked to Gale Robins’ argument that AI makes execution cheap but doesn’t help you decide what to build. Christina Wodtke is making the same case from the research side.

Wodtke opens with a designer who spent two weeks vibe-coding a gratitude journaling app. Beautiful interface, confetti animations, gentle notifications. Then she showed it to users. “I don’t really journal,” said the first one. “Gratitude journaling felt performative,” said the second. Two weeks building the wrong thing. Wodtke’s diagnosis:

That satisfaction is a trap. You’re accumulating artifacts that may have nothing to do with what anyone needs.

Wodtke draws a line between need-finding and validation that I think a lot of teams blur. Skipping the first and jumping to the second means you’re testing your guess, not understanding the problem:

Need-finding happens before you have a solution. You’re listening to people describe their lives, their frustrations, their workarounds. You’re hunting for problems that actually exist—problems people care enough about that they’re already trying to solve them with spreadsheets and sticky notes and whatever else they’ve cobbled together.

Wodtke’s version of fast looks different from what you’d expect:

The actual fast path is unsexy: sit down with five to ten people. Ask them about their lives. Shut up and listen. Use those three magic words—“tell me more”—every time something interesting surfaces. Don’t show them anything. Don’t pitch. Just listen.

“You’ll build less. It’ll be the right thing.” When building is cheap, the bottleneck moves upstream to judgment, knowing what to build. That judgment comes from listening, not prompting.

Vibe-Coding Is Not Need-Finding

Last month a product designer showed me her new prototype. She’d spent two weeks vibe-coding a tool for tracking “gratitude journaling streaks.” The interface was beautiful. Confe…

eleganthack.com

Every team I’ve ever led has had one of these people. The person who writes the doc that gives the project its shape, who closes context gaps in one-on-ones before they turn into conflicts, who somehow keeps six workstreams from drifting apart. They rarely get the credit they deserve because the work, when it’s done well, looks like it just happened on its own.

Hardik Pandya writes about this on his blog. He shares a quote from a founder friend describing his most valuable employee:

“She’s the reason things actually work around here. She just… makes sure everything happens. She writes the docs. She runs the meetings that matter. She talks to people. Somehow everything she touches stays on track. I don’t know how I’d even describe what she does to a person outside the company. But if she left, we’d fall apart in a month. Maybe less.”

I’ve known people like this at every company I’ve worked at. And I’ve watched them get passed over because the performance system couldn’t see them. Pandya nails why:

When a project succeeds, credit flows to the people whose contributions are easy to describe. The person who presented to the board. The person whose name is on the launch email. The person who shipped the final feature. These contributions are real, I’m not diminishing them. But they’re not more real than the work that made them possible. They’re just easier to point at.

Most organizations try to fix this by telling the invisible workers to “be more visible”—present more, build your personal brand internally. Pandya’s advice goes the other direction, and I think he’s right:

If you’re good at the invisible work, the first move isn’t to get better at visibility. It’s to find the leader who doesn’t need you to be visible.

As a leader, I take this as a challenge. If someone on my team is doing the work that holds everything together, it’s my job to make sure the organization sees it too—especially when it doesn’t announce itself.

Sketch portrait, title "THE INVISIBLE WORK" and hvpandya.com/notes on pale blue left; stippled open book and stars on black right.

The Invisible Work

The coordination work that holds projects together disappears the moment it works. On the unfairness of recognition and finding leaders who see it anyway.

hvpandya.com

When I managed over 40 creatives at a digital agency, the hardest part wasn’t the work itself—it was resource allocation. Who’s got bandwidth? Who’s blocked waiting on feedback? Who’s deep in something and shouldn’t be interrupted? You learn to think of your team not as individuals you assign tasks to, but as capacity you orchestrate.

I was reminded of that when I read about Boris Cherny’s approach to Claude Code. Cherny is a Staff Engineer at Anthropic who helped build Claude Code. Karo Zieminski, writing in her Product with Attitude Substack, breaks down how Cherny actually uses his own tool:

He keeps ~10–15 concurrent Claude Code sessions alive: 5 in terminal (tabbed, numbered, with OS notifications). 5–10 in the browser. Plus mobile sessions he starts in the morning and checks in on later. He hands off sessions between environments and sometimes teleports them back and forth.

Zieminski’s analysis is sharp:

Boris doesn’t see AI as a tool you use, but as a capacity you schedule. He’s distributing cognition like compute: allocate it, queue it, keep it hot, switch contexts only when value is ready. The bottleneck isn’t generation; it’s attention allocation.

Most people treat AI assistants like a single very smart coworker. You give it a task, wait for the answer, evaluate, iterate. Cherny treats Claude like a team—multiple parallel workers, each holding different context, each making progress while he’s focused elsewhere.

Zieminski again:

Each session is a separate worker with its own context, not a single assistant that must hold everything. The “fleet” approach is basically: don’t make one brain do all jobs; run many partial brains.

I’ve been using Claude Code for months, but mostly one session at a time. Reading this, I realize I’ve been thinking too small. The parallel session model is about working efficiently. Start a research task in one session, let it run while you code in another, check back when it’s ready.
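To make the “capacity you schedule” idea concrete, here is a minimal sketch of the pattern in Python. It is not how Claude Code works internally, and it calls no Claude API: `run_session` is a hypothetical stand-in that just sleeps, and the job names and durations are made up for illustration. The point is only the shape of the workflow: start everything, then pick results up as they become ready.

```python
import asyncio


async def run_session(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one long-running session; it only sleeps."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def orchestrate(jobs: dict[str, float]) -> list[str]:
    """Start every job at once, then collect results in completion order."""
    tasks = [
        asyncio.create_task(run_session(name, seconds))
        for name, seconds in jobs.items()
    ]
    finished = []
    # as_completed hands back whichever task finishes next, so attention
    # goes to the session that is ready rather than the one started first.
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
    return finished


# Made-up workloads: the shortest job is reviewed first.
results = asyncio.run(
    orchestrate({"research": 0.03, "refactor": 0.01, "tests": 0.02})
)
print(results)
```

Nothing here waits on the slowest job before looking at the fastest one, which is the difference between one assistant in a queue and a small fleet.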

Looks like the new skill on the block is orchestration.

Cartoon avatar in an orange cap beside text "I'm Boris and I created Claude Code." with "6.4M Views" in a sketched box.

How Boris Cherny Uses Claude Code

An in-depth analysis of how Boris Cherny, creator of Claude Code, uses it — and what it reveals about AI agents, responsibility, and product thinking.

open.substack.com

If design’s value isn’t execution—and AI is making that argument harder to resist—then what is it? Dan Ramsden offers a framework I find useful.

He breaks thinking into three types: deduction (drawing conclusions from data), induction (building predictions from patterns), and abduction—generating something new. Design’s unique contribution is abductive thinking:

When we use deduction, we discover users dropping off during a registration flow. Induction might tell us why. Abduction would help us imagine new flows to fix it.

Product managers excel at sense-making (aka “Why?”). Engineers build the thing. Design makes the difference—moving from “what is” to “what could be.”

On AI and the temptation to retreat to “creativity” or “taste” as design’s moat, Ramsden is skeptical:

Some might argue that it comes down to “taste”. I don’t think that’s quite right — taste without a rationale is just an opinion. I think designers are describers.

I appreciate that distinction. Taste without rationale is just preference. Design’s value is translating ideas through increasing levels of fidelity—from sketch to prototype to tested solution—validating along the way.

His definition of design in a product context:

Design is a set of structured processes to translate intent into experiments.

That’s a working definition I can use. It positions design not as the source of ideas (those can come from anywhere, including AI), but as the discipline that manages ideas through validation. The value isn’t in generating the concept—it’s in making it real while managing risk.

Two overlapping blue circles: left text "Making sense to generate a problem"; right text "Making a difference to generate value".

The value of Design in a product organisation

Clickbait opening: There’s no such thing as Product Design

medium.com

I’ve spent a lot of my product design career pushing for metrics: proving ROI, showing impact, making the case for design in business terms. But I’ve also seen how metrics become the goal rather than a signal pointing toward the goal. When the number goes up, we celebrate. When it doesn’t, we tweak the collection process. Meanwhile, the user becomes secondary. Last week’s big idea was around metrics, and this piece piles on.

Pavel Samsonov calls this out:

Managers can only justify their place in value chains by inventing metrics for those they manage to make it look like they are managing.

I’ve sat in meetings where we debated which numbers to report to leadership—not which work to prioritize for users. The metrics become theater. So-called “vanity metrics” that always go up and to the right.

But here’s where Pavel goes somewhere unexpected. He doesn’t let designers off the hook either:

Defining success by a metric of beauty offers a useful kind of vagueness, one that NDS seems to hide behind despite the slow loading times or unnavigability that seem to define their output; you can argue with slow loading times or difficulty finding a form, but you cannot meaningfully argue with “beautiful.”

“Taste” and “beauty” are just another avoidance strategy. That’s a direct challenge to the design discourse that’s been dominant lately—the return to craft, the elevation of aesthetic judgment. Pavel’s saying it’s the same disease, different symptom. Both metrics obsession and taste obsession are ways to avoid the ambiguity of actually defining user success.

So what’s the alternative? Pavel again:

Fundamentally, the work of design is intentionally improving conditions under uncertainty. The process necessarily involves a lot of arguments over the definition and parameters of “improvement”, but the primary barrier to better is definitely not how long it takes to make artifacts.

The work is the argument. The work is facing the ambiguity rather than hiding behind numbers or aesthetics. Neither Figma velocity nor visual polish is a substitute for the uncomfortable conversation about what “better” actually means for the people using your product.

Bold "Product Picnic" text over a black-and-white rolling hill and cloudy sky, with a large outlined "50" on the right.

Your metrics are an avoidance strategy

Being able to quantify outcomes doesn’t make them meaningful. Moving past artificial metrics requires building shared intention with colleagues.

productpicnic.beehiiv.com

It’s January and by now millions of us have made resolutions and probably broken them already. The second Friday of January is known as “Quitter’s Day.”

OKRs (objectives and key results) are a method for businesses to set and align company goals. The objective is your goal, and the key results are the measurable outcomes that show whether you’re reaching it. Venture capitalist John Doerr learned about OKRs while working at Intel, brought them to Google, and later became the framework’s leading evangelist.

Christina Wodtke talks about how to use OKRs for your personal life, and maybe as a way to come up with better New Year’s resolutions. She looked at her past three years of personal OKRs:

Looking at the pattern laid out in front of me, I finally saw what I’d been missing. My problem wasn’t work-life balance. My problem was that I didn’t like the kind of work I was doing.

The key results kept failing because the objective was wrong. It wasn’t about balance. It was about joy.

This is the second thing key results do for you: when they consistently fail, they’re telling you something. Not that you lack discipline—that you might be chasing the wrong goal entirely.

And I love Wodtke’s line here: “New Year’s resolutions fail because they’re wishes, not plans.” She continues:

They fail because “eat better” and “be healthier” and “find balance” are too vague to act on and too fuzzy to measure.

Key results fix this. Not because measurement is magic, but because the act of measuring forces clarity. It makes you confront what you actually want. And sometimes, when the data piles up, it reveals that what you wanted wasn’t the thing you needed at all.

Your Resolution Isn’t the Problem. Your Measurement Is.

It’s January, and millions of people have made the same resolution: “Eat better.” By February, most will have abandoned it. Not because they lack willpower or discipline. Because …

eleganthack.com

Building on our earlier link about measuring the impact of features, how can we keep track of the overall health of the product? That’s where a North Star Metric comes in.

Julia Sholtz writes an introduction to North Star Metrics on the blog of analytics provider Amplitude:

Your North Star Metric should be the key measure of success for your company’s product team. It defines the relationship between the customer problems your product team is trying to solve and the revenue you aim to generate by doing so.

How is it done? The first step is to figure out the “game” your business is playing, that is, how it engages with customers:

  1. The Attention Game: How much time are your customers willing to spend in your product?
  2. The Transaction Game: How many transactions does your user make on your platform?
  3. The Productivity Game: How efficiently and effectively can someone get their work done in your product?

They have a whole resource section on this topic that’s worth exploring.

Every Product Needs a North Star Metric: Here’s How to Find Yours

Get an introduction to product strategy with examples of North Star Metrics across industries.

amplitude.com

How do we know what we designed is working as intended? We measure. Vitaly Friedman shares something called the TARS framework to measure the impact of features.

We need UX metrics to understand and improve user experience. What I love most about TARS is that it’s a neat way to connect customers’ usage and customers’ experience with relevant product metrics.

Here’s TARS in a nutshell:

  • Target Audience (%): Measures the percentage of all product users who have the specific problem that a feature aims to solve.
  • Adoption (%): Tracks the percentage of the target audience that successfully and meaningfully engages with the feature.
  • Retention (%): Assesses how many users who adopted the feature continue to use it repeatedly over time.
  • Satisfaction Score (CES): Gauges the level of satisfaction, specifically how easy it was for retained users to solve their problem after using the feature.
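
Each of the four measures above is a ratio of the one before it, which is easy to see in a few lines of Python. This is only a sketch of the arithmetic as I read the summary, not Friedman’s own tooling: the `TarsInput` fields, the `tars` function, and the sample numbers are all hypothetical, and I’m assuming the CES figure is a plain average of effort scores collected from retained users.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TarsInput:
    total_users: int         # all product users
    target_users: int        # users who have the problem the feature solves
    adopters: int            # target users who meaningfully used the feature
    retained: int            # adopters who kept using it over the chosen window
    ces_scores: list[float]  # effort scores from retained users (assumed scale)


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, or 0.0 when `whole` is zero."""
    return 100.0 * part / whole if whole else 0.0


def tars(data: TarsInput) -> dict[str, float]:
    """Compute the four TARS figures; each percentage uses the prior group as its base."""
    return {
        "target_audience_pct": pct(data.target_users, data.total_users),
        "adoption_pct": pct(data.adopters, data.target_users),
        "retention_pct": pct(data.retained, data.adopters),
        "satisfaction_ces": mean(data.ces_scores) if data.ces_scores else 0.0,
    }


# Made-up numbers for illustration only.
example = TarsInput(
    total_users=10_000,
    target_users=2_500,
    adopters=1_000,
    retained=400,
    ces_scores=[6.0, 5.0, 7.0, 6.0],
)
report = tars(example)
print(report)
```

The thing to notice is the denominators: adoption is measured against the target audience, not all users, so a feature built for a quarter of your users isn’t penalized for the other three quarters ignoring it.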

Friedman has more details in the article, including how to use TARS to measure how well a feature is performing for your intended target audience.

How To Measure The Impact Of Features

Meet TARS: a simple, repeatable, and meaningful UX metric designed specifically to track the performance of product features.

smashingmagazine.com

Here is a good reminder from B. Prendergast to “stop asking users what they want—and start watching what they do.”

Asking people what they want is one of the most natural instincts in product work. Surveys, interviews, and feature wish lists feel accessible, social, and collaborative. They open channels to understand and empathise with the user base. They help teams feel closer to the people they serve. For teams under pressure, a stack of opinions can feel like solid data.

But this breaks when we compare what users say to what they actually do (say-do gap).

We all want to present ourselves a certain way. We want to seem more competent than confused (social desirability bias). Our memories can be fuzzy, especially about routine tasks (recall bias). Standards for what feels “easy” or “intuitive” can vary wildly between people (reference bias).

And of course, as soon as we start to ask users to imagine what they’d want, they’ll solve based on their personal experiences—which might be the right solution for them, but might not be for other users in the same situation.

Prendergast goes on to suggest “watch what people do, measure what matters, and use what they say to add context.” This approach involves watching user interactions, analyzing real behaviors through analytics, and treating feature requests as signals of underlying problems to uncover genuine needs. Prioritizing decisions based on observed patterns and desired outcomes leads to more effective solutions than relying on user opinions alone.

Stop asking users what they want — and start watching what they do.

People’s opinions about themselves and the things they use rarely match real behaviour.

renderghost.leaflet.pub

Echoing my series on the design talent crisis and other articles warning against the practice of cutting back on junior talent, Vaughn Tan offers yet another dimension: subjective decision-making skills are only honed through practice. But the opportunities given to junior staff for this type of decision-making are slim.

But to back up, here’s Tan explaining what subjective decision-making is:

These are decisions where there’s no one “correct” answer and the answers that work can’t be known in advance. Subjective decisionmaking requires critical thinking skills to make strongly reasoned arguments, identify appropriate evidence, understand the tradeoffs of different arguments, make decisions that may (or may not) be correct, and develop compelling narratives for those decisions.

While his article isn’t about AI, or about companies not hiring juniors, it is about companies failing to develop the juniors they have by letting them practice this type of decision-making in low-stakes situations.

Critical thinking and judgment require practice. Practice needs to be frequent, and needs to begin at a low level with very few consequences that are important. This small-scale training in subjective decisionmaking and critical thinking is the best way to learn how to do it properly in more consequential situations.

If you wait until someone is senior to teach judgment, their first practice attempts have serious consequences. High-stakes decisionmaking pressure cannot be simulated realistically; learning how to deal with it requires actual practice with real consequences that progressively increases in scope and consequentiality.

And why is this all important? Not developing junior staff creates a bottleneck, because only seniors can make these judgment calls, and eventually a succession problem: who takes over when the seniors leave or retire?

Judgment from the ground up

tl;dr: Critical thinking is foundational for making decisions that require subjective judgment. People learn how to do subjective decisionmaking

vaughntan.org

This short article could easily fall under the motivational category, but I couldn’t help but draw parallels to what we do as designers when working as part of a product team.

Hazel Weakly says that people who see systems tend to end up in charge of them, sooner or later. And to be a leader is to “understand that you’ll find yourself stranded in the middle of the ocean one day.”

Not just you, but everyone you lead. And you’ll need to chart a course. In the ever-changing winds, the ever-shifting tides, the unknown weather, and with an inability to see up or down or basically anywhere except a few minutes away. You won’t have the time to find your bearings even if you could. Yet, somehow, in this sea of swirling and infinite complexities and probabilities, in the midst of incalculable odds, you will find yourself needing to have simultaneously several different things…

The first thing is that you will fail. I equate this to knowing that design is about trial and error, testing and measuring, and then adjusting.

The second thing is unshakable conviction that “you will succeed.” I see success as solving the problem, coming up with a solution that helps users do what they need. And you know what? Designers will succeed when they follow the design process.

Finally, the third thing is to “prepare and make ready everyone around you.” That means persuading your product and engineering counterparts and other stakeholders that the solution you’re advocating for is the right one.

To Be a Leader of Systems

Picture with me, if you will, the absurdity of finding yourself swimming in the middle of the ocean. First think about the ocean and how deep and infinitely…

hazelweakly.me
Storyboard grid showing a young man and family: kitchen, driving, airplane, supermarket, night house, grill, dinner.

Directing AI: How I Made an Animated Holiday Short

My first taste of generating art with AI was back in 2021 with Wombo Dream. I even used it to create very trippy illustrations for a series I wrote on getting a job as a product designer. To be sure, the generations were weird, if not even ugly. But it was my first test of getting an image by typing in some words. Both Stable Diffusion and Midjourney gained traction the following year and I tried both as well. The results were never great or satisfactory. Years upon years of being an art director had made me very, very picky—or put another way, I had developed taste.

I didn’t touch generative AI art again until I saw a series of photos by Lars Bastholm playing with Midjourney.

Child in yellow jacket smiling while holding a leash to a horned dragon by a park pond in autumn.

Lars Bastholm created this in Midjourney, prompting “What if, in the 1970s, they had a ‘Bring Your Monster’ festival in Central Park?”

That’s when I went back to Midjourney and started to illustrate my original essays with images generated by it, but usually augmented by me in Photoshop.

In the intervening years, generative AI art tools had developed a common set of functionality that was all very new to me: inpainting, style, chaos, seed, and more. Beyond closed systems like Midjourney and OpenAI’s DALL-E, open source models from Stable Diffusion, Flux, and now a plethora of Chinese models offer even better prompt adherence and controllability via even more opaque-sounding functionality like control nets, LoRAs, CFG, and other parameters. It’s funny to me that for a very artistic field, the associated products to enable these creations are very technical.

My Site Stats for 2025

In 2025, I published 328 posts with a total of 118,445 words on this blog. Of course, in most of the posts, I’m quoting others, so excluding block quotes—those quoted passages greater than a sentence—I’m down to 76,226 words. Still pretty impressive, I’d say.

Post analysis 2025 - 328 posts. Top months: Oct 45, Jul 42, Mar 4. Link posts 283 (86%). Total words 118,445, avg 361.

I used Claude Code to write a little script that analyzed my posts from last year.
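The script isn’t reproduced here, but a minimal sketch of that kind of analysis might look like the following. It assumes each post is available as a Markdown string and that block quotes are lines beginning with `>`; the function names `count_words` and `summarize` are hypothetical and not taken from the actual script.

```python
def count_words(markdown):
    """Count words in one post, with and without block-quoted lines.

    Assumption: block quotes are Markdown lines that start with '>'.
    """
    total = 0
    own = 0
    for line in markdown.splitlines():
        stripped = line.lstrip()
        is_quote = stripped.startswith(">")
        # Drop the '>' markers so they are not counted as words.
        text = stripped.lstrip(">").strip() if is_quote else stripped
        n = len(text.split())
        total += n
        if not is_quote:
            own += n
    return {"total": total, "own": own}


def summarize(posts):
    """Aggregate word counts across a list of Markdown posts."""
    counts = [count_words(p) for p in posts]
    total = sum(c["total"] for c in counts)
    own = sum(c["own"] for c in counts)
    return {
        "posts": len(posts),
        "total_words": total,
        "own_words": own,
        "avg_words": round(total / len(posts)) if posts else 0,
    }
```

Calling `summarize` on a list of post bodies returns the post count, the total word count, the count excluding quoted lines, and the average words per post. With the figures above, 118,445 words across 328 posts rounds to an average of 361.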

In reviewing data from my analytics package Umami, it is also interesting which posts received the most views. By far it was “Beyond the Prompt,” my AI prompt-to-code shootout article. The others in the top five were:

That last one has always surprised me. I must’ve hit the Google lottery on it for some reason.

Speaking of links, since April—no data before—visitors clicked on links mentioned on this blog 2,949 times. I also wanted to see which linked items were most popular, by outbound clicks:

  1. AI 2027, naturally
  2. Smith & Diction’s catalog of “Usable Google Fonts”
  3. Matt Webb’s post on Do What I Mean
  4. A visualization called “The Authoritarian Stack” that shows how power, money, and companies connect
  5. The New York Times list of the “25 Most Influential Magazine Covers of All Time” (sadly, the gift link has since expired)

And finally, the totals of the year for views were 58,187, with 42,075 visitors. That works out to be an average of about 3,500 visitors per month. Tiny compared with other blogs out there. But my readers mean the world to me.

Anyway, some interesting stats, at least to me. Here’s to more in 2026.

I spend a lot of time neither talking about design nor hanging out with other designers. I suppose I do a lot of reading about design to write this blog, and I am talking with the designers on my team, but I see Design as the output of a lot of input that comes from the rest of life.

Hardik Pandya agrees and puts it much more elegantly:

Design is synthesizing the world of your users into your solutions. Solutions need to work within the user’s context. But most designers rarely take time to expose themselves to the realities of that context.

You are creative when you see things others don’t. Not necessarily new visuals, but new correlations. Connections between concepts. Problems that aren’t obvious until someone points them out. And you can’t see what you’re not exposed to.

Improving as a designer is really about increasing your exposure. Getting different experiences and widening your input of information from different sources. That exposure can take many forms. Conversations with fellow builders like PMs, engineers, customer support, sales. Or doing your own digging through research reports, industry blogs, GPTs, checking out other products, YouTube.

Male avatar and text "EXPOSURE AS A DESIGNER" with hvpandya.com/notes on left; stippled doorway and rock illustration on right.

Exposure

For equal amount of design skills, your exposure to the world determines how effective of a designer you can be.

hvpandya.com

Scott Berkun enumerates five habits of the worst designers in a Substack post. The most obvious is “pretentious attitude.” It’s the stereotype, right? But in my opinion, the most damaging and potentially fatal habit is a designer’s “lack of curiosity.” Berkun explains:

Design dogma is dangerous and if the only books and resources you read are made by and for designers, you will tend to repeat the same career mistakes past designers have made. We are a historically frustrated bunch of people but have largely blamed everyone else for this for decades. The worst designers are ignorant, and refuse to ask new questions about their profession. They repeat the same flawed complaints and excuses, fueling their own burnout and depression. They resist admitting to their own blindspots and refuse to change and grow.

I’ve worked with designers who have exhibited one or more of these habits at one time or another. Heck, I probably have as well.

Good reminders all around.

Bold, rough brush-lettered text "WHY DESIGN IS HARD" surrounded by red handwritten arrows, circles, Xs and critique notes.

The 5 habits of the worst designers

Avoid these mistakes and your career will improve

whydesignishard.substack.com

Critiques are the lifeblood of design. Anyone who went to design school has participated in and has been the focus of a crit. It’s “the intentional application of adversarial thought to something that isn’t finished yet,” as Fabricio Teixeira and Caio Braga, the editors of DOC, put it.

A lot of solo designers, whether they’re a design team of one or a freelancer, don’t have the luxury of critiques. In my view, they’re handicapped. There are workarounds, of course, such as critiques with cross-functional peers, but it’s not the same. One designer on my team, who was a design team of one at her previous company, came up to me and said she’d learned more in a month than in a year at her former job.

Further down, Teixeira and Braga say:

In the age of AI, the human critique session becomes even more important. LLMs can generate ideas in 5 seconds, but stress-testing them with contextual knowledge, taste, and vision, is something that you should be better at. As AI accelerates the production of “technically correct” and “aesthetically optimized” work, relying on just AI creates the risks of mediocrity. AI is trained to be predictable; crits are all about friction: political, organizational, or strategic.

Critique

On elevating craft through critical thinking.

doc.cc

He told me his CEO - who’s never written a line of code - was running their company from an AI code editor.

I almost fell out of my chair.

OF COURSE. WHY HAD I NOT THOUGHT OF THAT.

I’ve since gotten rid of almost all of my productivity tools.

ChatGPT, Notion, Todoist, Airtable, Google Keep, Perplexity, my CRM. All gone.

That’s the lede for a piece by Derek Larson on running everything from Claude Code. I’ve covered how Claude Code is pretty brilliant and there are dozens more use cases than just coding.

But getting rid of everything and using just text files and the terminal window? Seems extreme.

Larson uses a skill in Claude Code called “/weekly” to do a weekly review.

  1. Claude looks at every file change since last week
  2. Claude evaluates the state of projects, tasks, and the roadmap
  3. We have a conversation to dig deeper, and make decisions
  4. Claude generates a document summarizing the week and plan we agreed on

Then Claude finds items he’s missed or is procrastinating on, and “creates a space to dump everything” on his mind.
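Larson’s skill itself isn’t shown, so the following is only a rough sketch of what the first step, looking at every file change since last week, could reduce to if the notes live as plain files in a folder. The function name `files_changed_since` and the seven-day default are assumptions for illustration, not part of his setup.

```python
import time
from pathlib import Path


def files_changed_since(root, days=7, now=None):
    """Return files under `root` modified within the last `days` days.

    Assumption: "changed" means the file's modification time is recent.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime >= cutoff
    )
```

Calling `files_changed_since(".")` from the notes folder lists everything touched in the past week. A folder kept in version control could use its commit history instead, which would also catch deletions.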

Blue furry Cookie Monster holding two baking sheets filled with chocolate chip cookies.

Feed the Beast

AI Eats Software

dtlarson.com

Design Thinking has gotten a bad rap in recent years. It was supposed to change everything in the corporate world but ended up changing very little. While Design Thinking may not be the darling anymore, designers still need time to think, which is, for the sake of argument, time away from Figma and pushing pixels.

Chris Becker argues in UX Collective:

However, the canary in the coalmine is that Designers are not being used for their “thinking” but rather their “repetition”. Much of the consternation we feel in the UX industry is catapulted on us from this point of friction.

He says that agile software development and time for designers to think aren’t incompatible:

But allowing Designers to implement their thinking into the process is about trust. When good software teams collaborate effectively, there are high levels of trust and autonomy (a key requirement of agile teams). Designers must earn that trust, of course, and when we demonstrate that we have “done the thinking,” it builds confidence and garners more thinking time. Thinking begets thinking. So, Designers, let’s continue to work to maximise our “thinking” faculties.

Hand-drawn diagram titled THINKING: sensory icons and eyeballs feed a brain, plus a phone labeled "Illusory Truth Effect," leading to outputs labeled "Habits."

Let designers think

How “Thinking” + “Designing” need to be practiced outside AI.

uxdesign.cc

Game design is fascinating to me. Among designers, “gamification” was all the rage a few years back, inspired by apps like Duolingo that made it fun to progress in a product. Raph Koster outlines a twelve-step, systems-first framework for game design, complete with illustrations. Notice how he uses UX terms like “affordance” because ultimately, game design is UX.

In step five, “Feedback,” Koster provides an example:

[The player] can’t learn and get better unless [they] get a whole host of information.

  • You need to know what actions – we usually call them verbs — are even available to you. There’s a gas pedal.
  • You need to be able to tell you used a verb. You hear the engine growl as you press the pedal.
  • You need to see that the use of the verb affected the state of the problem, and how it changed. The speedometer moved!
  • You need to be told if the state of the problem is better for your goal, or worse. Did you mean to go this fast?

Sound familiar? It’s Jakob Nielsen’s “Visibility of System Status.”

White-bordered hex grid with red, blue, yellow and black hex tiles marked by dot patterns, clustered on a dark tabletop

Game design is simple, actually

So, let’s just walk through the whole thing, end to end. Here’s a twelve-step program for understanding game design. One: Fun There are a lot of things people call “fun.” But most of them are not u…

raphkoster.com

I think the headline is a hard stance, but I appreciate the sentiment. All the best designers and creatives—including developers—I’ve ever worked with do things on the side. Or in Rohit Prakash’s words, they tinker. They’re always making something, learning along the way.

Prakash, writing in his blog:

Acquiring good taste comes through using various things, discarding the ones you don’t like and keeping the ones you do. if you never try various things, you will not acquire good taste.

It’s important for designers to see other designs and use other products—if you’re a software designer. It’s equally important to look up from Dribbble, Behance, Instagram, and even this blog and go experience something unrelated to design. Art, concerts, cooking. All of it gets synthesized through your POV and becomes your taste.

Large white text "@seatedro on x dot com" centered on a black background.

If you don’t tinker, you don’t have taste

programmer by day, programmer by night.

seated.ro

Ethan Mollick, a professor of entrepreneurship at the Wharton School, says that AI has gotten so good that our relationship with it is changing. “We’re moving from partners to audience, from collaboration to conjuring,” he says.

He fed NotebookLM his book and 140 Substack posts and asked for a video overview. AI famously hallucinates. But Mollick found no factual errors in the six-minute video.

We’re shifting from being collaborators who shape the process to being supplicants who receive the output. It is a transition from working with a co-intelligence to working with a wizard. Magic gets done, but we don’t always know what to do with the results. This pattern — impressive output, opaque process — becomes even more pronounced with research tasks.

Mollick believes that the most wizard-like model today is GPT-5 Pro. He uploaded an academic paper that took him a year to write, which was peer-reviewed, and was then published in a major journal…

Nine minutes and forty seconds later, I had a very detailed critique. This wasn’t just editorial criticism, GPT-5 Pro apparently ran its own experiments using code to verify my results, including doing Monte Carlo analysis and re-interpreting the fixed effects in my statistical models. It had many suggestions as a result (though it fortunately concluded that “the headline claim [of my paper] survives scrutiny”), but one stood out. It found a small error, previously unnoticed. The error involved two different sets of numbers in two tables that were linked in ways I did not explicitly spell out in my paper. The AI found the minor error, no one ever had before.

Later in his post, Mollick says that there’s a problem with this wizardry—it’s too opaque. So what can we do?

First, learn when to summon the wizard versus when to work with AI as a co-intelligence or to not use AI at all. AI is far from perfect, and in areas where it still falls short, humans often succeed. But for the increasing number of tasks where AI is useful, co-intelligence, and the back-and-forth it requires, is often superior to a machine alone. Yet, there are, increasingly, times when summoning a wizard is best, and just trusting what it conjures.

Second, we need to become connoisseurs of output rather than process. We need to curate and select among the outputs the AI provides, but more than that, we need to work with AI enough to develop instincts for when it succeeds and when it fails.

And lastly, trust it. Trust the technology, he suggests. “The question isn’t ‘Is this completely correct?’ but ‘Is this useful enough for this purpose?’”

I think we’re in that transition period. AI is indeed dastardly great at some things and constantly getting better at the tasks where it isn’t. But we all know where this is headed.

Witch hat hovering over a desktop monitor with circuit-like lines flowing into the screen, small coffee mug on the desk.

On Working with Wizards

Verifying magic on the jagged frontier

oneusefulthing.org

Remote work really exploded when the Covid-19 pandemic hit. Everyone had to adjust to working from home, relying on Zoom and Slack and other collaborative tools much more. But beyond tooling, there’s also process. Matt Mullenweg, CEO of Automattic, has famously been a proponent of distributed work for a while.

Paolo Belcastro peels back the curtain to share how the 1,500 or so global employees of Automattic stay connected via two core principles:

There are two ideas that define our communication culture:

  • Radical Transparency: we default to openness, with every conversation accessible to everyone in the company.
  • Asynchronous by Design: we don’t expect everyone to be “on” at the same time.

Everything is written down:

Our internal platform, P2, started life as a WordPress theme (it was called Prologue, later updated to version 2 and eventually shortened to P2) that lets people post directly on the front end of a site—fast, simple, and visible to everyone. Over time it evolved into a network of thousands of P2s for teams, projects, and watercooler chats (couch surfing, classified ads, house renovations, babies, pets, music, or games, we kind of have it all).

Every post, every comment, every decision ever made in the history of Automattic is preserved there.

As you can imagine, it soon becomes a volume problem. There’s too much stuff.

No one can read everything.

That’s why onboarding is designed to help people adapt:

  • Each newcomer is paired with a mentor from a different team, to give them a cross-company perspective.
  • They receive a curated list of “milestone posts” that map the history of Automattic, along with role-specific threads relevant to their work.
  • The Field Guide offers principles, templates, and advice about how to handle communication.

Somehow, they make it work.

Using chaos to communicate order

How we communicate at Automattic

ttl.blog