Live AI is the highest-risk recurring moment in any major company’s annual calendar today. And almost no one outside a handful of production teams has a clear framework for how to think about it.
Most of the conversation about AI demos focuses on what the AI does. The harder question is how the production around the demo reinforces its value. The old playbook was designed for deterministic software – scripted, rehearsed, locked down. And, simply put, agentic AI doesn’t work that way.
The product is unpredictable. The demo runs once – with the share price moving in real time. The production team can’t fully constrain the output. And the audience is watching with their finger on the share button. Get it right and you reset the equity narrative. Get it wrong and you become the case study in someone else’s analysis.


Google has been the case study before. In February 2023, a single Bard promotional demo answered one question about the James Webb Space Telescope incorrectly. By the next day’s close, Alphabet had lost roughly $100 billion in market capitalization.
In a few days, Sundar Pichai will walk back on stage at Shoreline Amphitheatre for Google I/O 2026. (The keynote begins at 10:00 AM PT on Tuesday, May 19.) Reports point to a major Gemini model update, a new agentic AI assistant called Gemini Spark, Android 17, and the unveiling of Aluminium OS. Some of these will demo well in a controlled environment. Others will require live, multi-step, agentic AI performance in front of a global audience. Fortunately, the production discipline that distinguishes a credibility-building demo from a market-moving disaster has been developed largely in public, by Google itself, over the past eight years.
And to interpret Google I/O 2026, you have to understand the framework Google has been building since 2018.
For decades, corporate live demos followed a simple production logic. You wrote the script, ran the rehearsals, and planned for what could go wrong. A Salesforce dashboard demo. An iPhone software walkthrough. An Adobe Creative Cloud feature reveal. The product behaved predictably because the demo was, in essence, a high-fidelity rehearsal performed live.
Live AI demos break that model in four specific ways.
Agentic AI works differently. You give the system a goal, and it decides how to get there. The production team can guide the prompt but not the path. How and what the model produces between input and output isn’t fully predictable. That’s both the promise of agentic AI – and the production risk.
Even with the same prompt, the model can produce a different response. Sometimes the difference is small. Sometimes it’s significant. Sometimes the answer is just wrong. Until the demo actually runs in front of the audience, the production team can’t know exactly what the model will say.
A multi-step AI agent might take 45 to 90 seconds to finish a task. On a live stage, 45 seconds of an AI “thinking” is a production crisis. The audience disengages. The camera has nothing to cut to. The speaker has to fill the silence. A traditional software demo can choreograph every second. An AI demo can’t.
A traditional software demo failure is embarrassing. A live AI demo failure is a stock-moving event. The Bard incident wasn’t a fluke – it was the first clear signal of a new category of risk. Since 2023, more than one AI company has watched its share price move on the strength of a single live demo.
For us, these aren’t quirks to manage. They’re a structurally new production category. And they require a fresh approach.

No company has confronted the live AI demo problem longer or more publicly than Google. The framework isn’t written down anywhere, but it is present in the production decisions Google has made over the past eight years – including the ones it learned the hard way.
Five moments define that evolution.
At I/O 2018, Google played pre-recorded phone calls in which its Duplex AI assistant booked a hair appointment and called a restaurant. Sundar Pichai introduced the recordings as “the Google Assistant actually calling a real salon.” The AI sounded so human, complete with “um” and “uh” verbal tics, that the demo went viral within hours.
Then came the questions, first raised by Axios: why didn’t the businesses identify themselves when they picked up? Why was there no ambient noise? Were these genuine real-world interactions or carefully staged recordings? Google declined to provide the names of the businesses or confirm whether the calls had been edited. Alongside the credibility question came a separate ethics issue: should an AI identify itself as AI when calling a human?
The lesson: When an audience can’t verify whether what they’re seeing is real, the demo creates more doubt than it resolves. Specificity, transparency, and verifiable detail are essential to earning trust and demo credibility.
In February 2023, Google held a small launch event for Bard featuring a promotional clip in which the AI gave a factually incorrect answer about the James Webb Space Telescope. Astronomers noticed within hours. Alphabet shares dropped about 7.7% the following day, erasing roughly $100 billion in market value.
The lesson: Every public AI output is a public statement. Fact-checking demo content isn’t a marketing task – it’s a risk management one.
In December 2023, Google released a six-minute video billed as “Hands-on with Gemini: Interacting with multimodal AI.” It appeared to show the model engaging with images, drawings, and a continuous spoken conversation in real time.
Within 24 hours, Bloomberg’s Parmy Olson reported that the demo had been constructed differently than the video implied: Gemini wasn’t responding to spoken voice or live video at all. Google’s team had fed the model still image frames and text prompts, then added the voice narration afterward in post.
Google’s disclaimer in the video description noted only that “latency has been reduced and outputs have been shortened.” The narrative shifted from “Gemini is remarkable” to “Google misrepresented Gemini.”
The lesson: The line between live demo and marketing video has to be transparent. The credibility hit when an edit gets exposed is worse than the lift from a polished demo.
At I/O 2024, Google unveiled Project Astra – its real-time multimodal assistant – with a deliberate production move that broke from prior patterns. The demo aired as two continuous takes, one on a Pixel phone and one on a prototype pair of smart glasses.
The signal to the audience was clear: Google hadn’t cherry-picked the responses. The model was handling a stream of inputs in real time, rough edges and all. Around that demo, Google also began explicitly labeling other AI segments as recorded or aspirational rather than implying everything was live. The change was subtle in execution but marked a significant milestone. Google had stopped trying to make everything feel live and had started telling the audience exactly what they were watching.
The lesson: Labeling the kind of demo you’re showing is the first rule of demo credibility. The audience will forgive almost any production choice if they know what choice you made.
At I/O 2025, Project Astra moved from research demo to shipping product, powering new experiences in Search Live, the Gemini app, and third-party developer tools. The production decision here was as significant as the engineering one.
Having spent 2024 carefully framing Astra as a live, unscripted experience, Google could now invite the audience to use the same capability themselves. The demo and the product had become indistinguishable, which is the highest form of credibility a live AI demo can earn.
The lesson: When the AI demo eventually becomes a product launch, the production discipline that surrounds it becomes the foundation for long-term trust.
That’s eight years of drafting the playbook, paid for in public embarrassment, market cap, and corrective communication. It distills into a single principle: the audience doesn’t need the demo to be perfect – but they need to know exactly what kind of demo they’re watching.
Which brings us to the framework.
Most companies treat a live AI demo as a binary – either it’s live or it isn’t. The actual production reality is a spectrum, and almost every public AI demo controversy comes down to a mismatch between the category the audience thought it was watching and the one actually being staged.
There are five distinct ways to stage a live AI demo. We call them the Demo Modes – a five-category framework for live AI production.
Mode 1: Verified Live. The AI runs in real time during the event. No pre-staging. No predetermined output. The speaker delivers an input, and the audience watches the response unfold as it happens. Highest credibility, highest production risk. The two continuous-take Project Astra demos at I/O 2024 were the clearest recent example of Mode 1 done well.
Mode 2: Constrained Live. The AI is running in real time, but inside a controlled environment. The prompts are curated, the use cases are scoped. The model is genuinely working, but the production team has narrowed what it might be asked to do. Mid-high credibility when the framing is transparent, mid risk. Most enterprise software AI demos today are Mode 2 whether the company says so or not.
Mode 3: Pre-flight Live. The AI completed the task minutes or hours before the event. The audience watches the playback of an actual run – including any imperfections – with the speaker explicitly framing it as such: “We ran this just before walking on stage. Here’s what it produced.” Mid credibility when disclosed, low risk. This mode is dramatically under-used. Done well, it captures most of the trust of a live demo while significantly lowering the risk of failing on stage.
Mode 4: Pre-Recorded. A polished video of the AI performing a task, clearly labeled as recorded. Low credibility for capability claims but high credibility for visual production quality. The Gemini Hands-On video would have qualified as Mode 4 if Google had labeled it that way. The controversy emerged because it wasn’t.
Mode 5: Aspirational. Explicitly framed as “what’s possible,” “what we’re building toward,” or “where this is headed.” It’s a preview of where the product is going – not proof of what it can do today. Lowest credibility for capability claims, but useful for setting vision. Google should have framed the Duplex demo at I/O 2018 this way. Instead it was framed ambiguously enough to read as Mode 1.
The Demo Modes aren’t a ranking. They’re a set of choices. A keynote can deliberately mix categories – Verified Live for the headline demonstration, Constrained Live for the enterprise capability, Pre-flight Live for the agentic workflow, Pre-Recorded for the partner integration, Aspirational for the long-term roadmap.
The discipline isn’t picking the “best” mode. The discipline is making sure the audience knows which one they’re watching.

The framework becomes most useful as a real-time reading tool. Here’s how it applies on Tuesday, May 19.
Reports point to a major Gemini model update at the keynote – whether labeled Gemini 4 or a 3.x successor. The production question is which mode Google chooses for the headline demonstration. Verified Live (Mode 1) would be the most confident move – signaling that Google trusts the new model to perform outside of a controlled environment. Constrained Live (Mode 2) would be the more cautious choice. If Google frames the demo as anything other than Mode 1 or 2, that’s a signal worth noting.
Leaks point to a new agentic AI assistant called Gemini Spark – designed to work autonomously across apps, emails, calendars, and websites. Booking flights. Managing email. Filling out forms. This is the highest-risk kind of demo a company can stage today, because every action depends on the one before it, and a single failure cascades through every step that follows. The production decision is whether to demo Spark’s full workflow live (Mode 1 – high risk, high reward), to scope it tightly (Mode 2 – safer, less impressive), or to compress the experience via Pre-flight Live (Mode 3 – the team runs it just before the event and acknowledges it openly). Watch for the speaker’s framing language at the moment of the reveal. If they say “we ran this just before walking out,” that’s Mode 3 done well. If the demo cuts cleanly between steps without acknowledgment, the production team has chosen polish over transparency.
Every live AI demo at I/O 2026 will have some friction. Latency. A response that lands awkwardly. A model output that’s correct but visually unimpressive. The production decision is whether to absorb that friction visibly (the Astra 2024 approach) or to edit it out (the pre-2024 approach). The Astra approach is the more mature move. Watch for it.
I/O 2024 introduced explicit labeling. I/O 2025 refined it. If I/O 2026 makes the live vs. recorded distinction even cleaner – graphics, lower-thirds, verbal framing – that’s Google institutionalizing its framework. If the line blurs again, that’s a regression worth flagging.
The most consequential audience at I/O 2026 isn’t the developers at Shoreline. It’s the institutional investors evaluating Google Cloud’s AI revenue trajectory. Pichai disclosed at Cloud Next 2026 that just over half of 2026 ML compute investment will go to the Cloud business. The Cloud demos at I/O have to translate that capex into a credible product story. Watch how those demos are categorized. Constrained Live with enterprise customer logos as visible validation carries weight. Aspirational framing doesn’t.
Anyone who watches I/O 2026 with the 5 Demo Modes framework in hand will likely walk away from the keynote with a deeper understanding of these products and features than the viewer who watches for product news alone.

The 5-Mode framework names the demo categories, but the execution lives in the production decisions that distinguish one category from another. Four of them carry disproportionate weight.
Mode 1 (Verified Live) requires real-time agility. The speaker has to be ready to narrate whatever the model produces – including responses they’ve never seen before. That’s a different kind of prep than walking through a rehearsed click sequence. Pichai’s comfort with live AI moments is a production advantage Google has built over years. Most CEOs aren’t there yet.
Every live AI demo needs a written set of fallbacks: if the model produces a problematic response, what does the speaker say next? If latency drags on, where does the camera cut? If the demo fails entirely, how does the show move on without acknowledging it? The audience never sees the contingency. They only see the recovery. The Bard launch failure wasn’t a demo failure – it was a contingency-planning failure. The factual error was visible in promotional materials before the event. Better fact-checking should have caught it.
When a live AI demo is processing, the camera has to go somewhere. A cut to the speaker carries one signal. A cut to crowd reaction carries another. A cut to a product graphic carries a third. Each choice tells the home audience something different about whether to trust what’s happening. At I/O scale, this requires a director, multiple operators, and pre-planned camera blocking for every demo segment.
Mode 3 demos don’t happen by accident. They require the production team to actually run the demo backstage, capture the output, and have it ready to play back within minutes of the live moment. That’s a second production happening at the same time as the live event. Most companies don’t budget for it. The ones that do have a tool the others don’t.
These decisions aren’t decorative. They’re the difference between a demo that builds credibility and one that costs market cap.
Google I/O 2026 isn’t an isolated event. It’s the first in a three-week window that includes Microsoft Build (June 2–3) and Apple WWDC (June 8–12). All three companies will stage live AI demos. All three will face the same production decisions. And by the end of June, the industry will have its first complete data set for how the leading public AI companies are navigating the new production risk.
And others are watching them closely. Salesforce Dreamforce in September will stage Agentforce demos. Workday, ServiceNow, Adobe, and every major enterprise software company will demo agentic capabilities at investor moments over the next year. The companies that have a clear live AI demo framework will appear more credible than those with technically superior AI but worse production discipline.
That’s the broader implication. When every public company is staging live AI, the production discipline around the demo becomes part of the equity story itself. Not just for the AI labs – for any company whose narrative depends on showing product capability and evolution.
The 5 Demo Modes aren’t a prescription. Different companies, different audiences, different products will call for different combinations. What every company needs is the vocabulary to make those choices on purpose – not by accident.
Google has been learning that vocabulary in public for eight years. The lesson the rest of the industry has yet to fully absorb is that the question isn’t whether to demo live. It’s whether the production team is ready to handle what happens when you do.
The high-wire act is permanent. The model for walking it is still taking shape. Google I/O 2026 is the next big stage.
That’s the work worth investing in. It’s also the work Cardboard Spaceship builds for clients navigating the moments that matter.
The most consequential moments in modern corporate communications now run on live AI. Whether you’re preparing for an Investor Day, a product launch, a developer event, or an investor moment that includes an agentic demonstration, the production decisions you make now will define how the market reads your capability when the moment arrives.
No slides. No teleprompter. No walk-on music. No rehearsed opening. No cocktail reception. No video package. No panel of executives. No celebrity moderator.
Just one man, Warren Buffett, in a chair for five hours, answering whatever he’s asked.
By every modern standard of event production, the Berkshire Hathaway annual shareholder meeting should be a snoozer. It violates every rule of contemporary audience engagement, and it’s the exact opposite of what IR consultants typically recommend.
Yet it draws 40,000 people to Omaha every May. It commands global media coverage. It moves markets. And it’s routinely cited by institutional investors as the single most valuable investor communication event in the world.
So what’s going on?
The short answer: the Berkshire meeting isn’t boring. It’s disciplined. And it works for the exact reasons most shareholder meetings don’t.
Here’s what 60 years of the “Woodstock for Capitalists” teaches about effective investor communication – and what any IR team can apply to their next event.
Most corporate events are built around a fundamental assumption: investors need to be entertained, impressed, or both.
So the production gets heavier every year. More video. More stagecraft. More choreographed executive walk-ons. More polished messaging. The implicit message: we don’t trust you to stay interested unless we keep the energy up.
Berkshire does something different, and it takes more production skill, not less, to pull off. The format assumes the audience is smart enough to stay engaged without decorative flourishes. Every production decision – staging, lighting, camera work, broadcast design – is built to disappear so the content can carry the room.
That’s the hardest thing to do in event production. Making the work invisible requires a team that understands exactly what to amplify and when to pull back.
The takeaway for your next event: production isn’t about adding spectacle. It’s about designing every element – visible or invisible – to serve the core moment. The most confident companies communicate through production that feels effortless and intentional, which is almost always the product of the most thoughtful production design.
Every production decision at a Berkshire meeting communicates something:
None of those are obvious production choices. Each one was deliberate. Each one required significant production planning and infrastructure to execute consistently year after year.
Together, they communicate Berkshire’s values more effectively than any mission statement could.
The takeaway for your next event: every production choice – length, format, staging, Q&A structure, broadcast design – is saying something about your company. The best production partners don’t just execute the format. They help you understand what each decision communicates and design every element to reinforce the narrative you’re trying to build.
Most shareholder meetings are a mosaic of competing elements: opening remarks, CEO presentation, CFO deep-dive, divisional updates, video segments, panel discussions, Q&A, networking reception. Each element is designed to serve a different audience need. The result is often a meeting that does many things adequately and nothing exceptionally.
Berkshire picked one element – the open-question Q&A – and made it the entire event.
That single format decision is what built the brand. It’s also what makes the event irreplaceable. There’s nowhere else an institutional investor can ask a man like Warren Buffett – the CEO of a $1.1 trillion company – any question they want and get a real answer in real time for five hours straight.
The takeaway for your next event: identify the one element of your investor communication that genuinely differentiates you. Then ask whether the rest of your format is supporting that element or diluting it. If you’re doing ten things to satisfy different stakeholders, none of them will be memorable.
There’s a reason Berkshire’s meeting generates so much media coverage. Reporters and analysts know Buffett will answer hard questions directly – about the economy, about politics, about specific holdings, about mistakes. The format makes it impossible to dodge.
Compare that to a typical investor meeting where executives deliver scripted remarks, take three pre-screened questions, and exit stage right. The contrast isn’t subtle. Shareholders notice. So do journalists. So do short sellers.
Transparency at Berkshire isn’t a corporate value statement. It’s a production structure. The format doesn’t allow evasion. And because the format doesn’t allow evasion, the company has spent 60 years demonstrating that it has nothing to evade.
The takeaway for your next event: if your IR format includes a lot of guardrails (pre-screened questions, heavily rehearsed executives, short Q&A windows, vague forward-looking statements), those guardrails are communicating something. Ask whether what they’re communicating is actually what you want.
The “Woodstock for Capitalists” works because its core identity is unmistakable.
Same weekend. Same city. Same open Q&A format.
But here’s what’s easy to miss. Berkshire has evolved the event substantially over 60 years. The webcast expanded from audio-only to video to global streaming in English and Mandarin. The question-submission system evolved. The microphone lottery was introduced. The broadcast production grew from bare-bones to one of the most sophisticated corporate livestreams in the world.
What Buffett and his team understood is the difference between identity and execution. The identity stayed fixed. The execution kept getting better.
That’s the formula most IR teams get wrong in one direction or the other. Some reinvent everything every year and lose the continuity that builds audience loyalty. Others freeze their format and miss opportunities to elevate the experience as technology, audience expectations, and stakes evolve.
The best IR events have clear identity and constant elevation. What stays the same is the core: the narrative, the intentionality, the relationship. What gets better every year is the craft that delivers it.
The takeaway for your next event: separate what should never change from what should always improve. Your core identity, your voice, your relationship with investors – protect those fiercely. Your production value, your storytelling sophistication, your technical execution – those should get better every single year. Great production partners help you identify which is which and elevate the execution without disturbing the foundation.
Every IR team is making the same tradeoff, whether they realize it or not.
Most shareholder meetings are designed around this year’s messaging. The best ones invest in a relationship with investors that lasts decades.
There’s a reason Berkshire’s annual meeting has outlasted recessions, leadership questions, and Warren Buffett’s own transition off the stage. The event was built to carry weight, not to win a single news cycle. The format, the venue, the open Q&A, the trust between the company and its shareholders – none of it happened by accident. It was built, refined, and protected, year after year, by a team that understood what the event needed to carry. The result is the most respected, most attended, most covered corporate event in the world.
That’s not boring. That’s the highest form of the craft.
Great IR communication isn’t built one event at a time. It’s built as a long-term asset – a production foundation strong enough to hold the company’s relationship with investors through every cycle, every leadership change, every market condition.
That’s the work worth investing in. And that’s the quality Cardboard Spaceship delivers.
Rethinking your next investor event?
The best IR communication isn’t about stripping production down or piling it on. It’s about making sure that every production decision, visible or invisible, earns its place. That’s the discipline we bring to every high-stakes investor event we build.