Jason Spielman put up a case study on his site for his work on Google’s NotebookLM:
The mental model of NotebookLM was built around the creation journey: starting with inputs, moving through conversation, and ending with outputs. Users bring in their sources (documents, notes, references), then interact with them through chat by asking questions, clarifying, and synthesizing before transforming those insights into structured outputs like notes, study guides, and Audio Overviews.
And yes, he includes a sketch he did on the back of a napkin.
I’ve always wondered about the UX of NotebookLM. It’s not typical and, if I’m being honest, not exactly super intuitive. But after a while, it does make sense. Maybe I’m the outlier though, because Spielman’s grandmother found it easy. In an interview last year on Sequoia Capital’s Training Data, he recalls:
I actually do think part of the explosion of audio overviews was the fact it was a simple one click experience. I was on the phone with my grandma trying to explain her how to use it and it actually didn't take any explanation. I'm like, “Drop in a source.” And she’s like, “Oh! I see. I click this button to generate it.” And I think that the ease of creation is really actually what catalyzed so much explosion. So I think when we think about adding these knobs [for customization] I think we want to do it in a way that’s very intentional.