When I managed over 40 creatives at a digital agency, the hardest part wasn’t the work itself—it was resource allocation. Who’s got bandwidth? Who’s blocked waiting on feedback? Who’s deep in something and shouldn’t be interrupted? You learn to think of your team not as individuals you assign tasks to, but as capacity you orchestrate.
I was reminded of that when I read about Boris Cherny’s approach to Claude Code. Cherny is a Staff Engineer at Anthropic who helped build Claude Code. Karo Zieminski, writing in her Product with Attitude Substack, breaks down how Cherny actually uses his own tool:
He keeps ~10–15 concurrent Claude Code sessions alive: 5 in terminal (tabbed, numbered, with OS notifications). 5–10 in the browser. Plus mobile sessions he starts in the morning and checks in on later. He hands off sessions between environments and sometimes teleports them back and forth.
Zieminski’s analysis is sharp:
Boris doesn’t see AI as a tool you use, but as a capacity you schedule. He’s distributing cognition like compute: allocate it, queue it, keep it hot, switch contexts only when value is ready. The bottleneck isn’t generation; it’s attention allocation.
Most people treat AI assistants like a single very smart coworker. You give it a task, wait for the answer, evaluate, iterate. Cherny treats Claude like a team—multiple parallel workers, each holding different context, each making progress while he’s focused elsewhere.
Zieminski again:
Each session is a separate worker with its own context, not a single assistant that must hold everything. The “fleet” approach is basically: don’t make one brain do all jobs; run many partial brains.
I’ve been using Claude Code for months, but mostly one session at a time. Reading this, I realize I’ve been thinking too small. The parallel session model is about working efficiently. Start a research task in one session, let it run while you code in another, check back when it’s ready.
Looks like the new skill on the block is orchestration.


