
Microsoft Build 2026 and the Arrival of the Agentic Audience

The afternoon before Satya Nadella walked onto a stage at Fort Mason, Microsoft Copilot went dark. More than 2,600 enterprise users reported outages on Downdetector before noon on June 1, 2026. The failure locked people out of workflows their companies had rebuilt around the assistant. Twenty-four hours later, Nadella opened Microsoft Build 2026 by promising autonomous agents that would run those workflows on their own.

That gap – between the failure and the promise – is where this story starts. But it is not what the story is about. For three years, the question facing every public company with an AI story was simple: do you have it? Build 2026 is where that question quietly died. Nobody in the room doubted Microsoft has AI.
The sharper shift wasn’t the question. It was who is now asking it.

The first reader of your earnings call is increasingly not a person but a machine. It is an AI agent that summarizes and scores you for the analyst before a human looks. At Build, Microsoft spent two days building exactly that kind of system. This piece is about the audience it creates: what we call the agentic audience. And what every team working in investor relations in the agentic era must do to earn an accurate read.

Here is the uncomfortable part. An investor moment built only for the people attending (live or virtually) is now half-built. The other half of your audience never shows up, never applauds, never forgets. And it is already grading you. The investor moment used to be a pitch. It is becoming an audit, one you volunteer for the moment you publish.

Microsoft Build 2026 main stage during the agentic AI keynote

What Is the Agentic Audience?

The agentic audience is the growing layer of AI systems that read, summarize, and score a company’s communications before any human does. It is the model an analyst runs across your earnings transcript to extract guidance. The tool that sentiment-scores management’s tone on the call. The agent that reconciles your slide deck against your filings and flags what does not match. By the time a human analyst forms a view, a machine has often already shaped it.

Microsoft is building the reader

Build 2026 turned that abstraction into something you can point at, because Microsoft spent the keynote building the very class of system that does this work. It unveiled seven in-house MAI models, among them the reasoning model MAI-Thinking-1 and MAI-Transcribe-1.5, which handles 43 languages. They are summarizers, sentiment-scorers, and reasoning engines. In other words, they are the exact tools that now sit between a company and its investors. The company that demonstrated the agentic audience most vividly is also one of the companies building it. Microsoft did not invent this reader, and it is not building it alone. Google, OpenAI, and Anthropic are racing on the same class of system. What Build offered was the clearest view of it: the whole stack on one stage, in public – models, agents, web-grounding, governance.

Web IQ is the Bing-built service Microsoft unveiled to pull live web information into AI answers. And it already feeds both Microsoft Copilot and ChatGPT. Read that twice. The machine forming a view of your company is, increasingly, a handful of systems owned by the same companies whose clouds you publish on, whose models you depend on, and whose platforms you compete with. The agentic audience is not a neutral crowd assembling on its own. Someone is building it, and the supplier list is short.

Nothing announced at Build governs how an outside model reads or summarizes you. Microsoft’s governance story was real but pointed elsewhere: Agent 365 controls what your agents can touch and spend. The party being summarized has no equivalent protection. The reader is consolidating, and no one is accountable for the read.

Key Considerations:

You can’t opt out. A short supplier list means there’s no avoiding it. The reader is infrastructure now, not a vendor you choose.
You don’t control the lens. The same companies own the systems that summarize you – your cloud providers, your model suppliers, your competitors. Whatever their intent, you get no say in how they read you and no recourse when they get it wrong.
It’s already decided, not coming. Web IQ feeding both Copilot and ChatGPT is the proof that the consolidation has happened, at the infrastructure layer, mostly out of view.

Satya Nadella on stage at Microsoft Build 2026, Fort Mason.

Why the Agentic Audience Changes Investor Relations

A machine does not read the way a room does, and that single fact rewrites the job.
It cannot be charmed. Not by a confident founder, not by a clean slide, not by the conviction that carries a room. It does not attend a presentation; it parses one. It extracts the claims, checks them against the rest of your disclosures, and scores what reconciles. Charisma was always part of the IR toolkit. The agentic audience is immune to it. What it rewards instead is verifiability – a claim it can reconstruct, not one it has to trust.

Evidence is the new challenge

This is why “do you have AI?” gave way to a harder set of questions. Can you govern it? Can you reproduce the results you claim? Can you earn back what you spend? Each one is a question a machine can actually check.

Microsoft built its keynote to answer them in evidence rather than adjectives. It owned its reasoning model outright, trained without distillation. It unveiled an agent platform whose first Autopilot, Scout, runs on the open-source OpenClaw wrapped in enterprise governance. And Microsoft Foundry, a consumption-billing model, turned agent work into metered, countable usage.

Capability was the last decade’s contest. The next one is quieter and harder to fake: it rewards the company that AI can verify. A version of it has already played out in publishing. There, AI summaries answer for the source before readers ever click through. The businesses on the other end feel it, in their traffic and their revenue.

You cannot charm this reader. You can only hand it something to confirm. Numbers that reconcile across every document. Claims it can check on its own. Proof it can rebuild without taking your word for any of it. That is what reading you well now requires. The reader is here, and it is forming a view. The only question left is whether you have given it a story true enough to support.

Satya Nadella presenting Microsoft Agent 365 governance and security at Build 2026

Market Proof

You could watch this happen in real time at Build. Microsoft shares had run well off their late-March low going into the week, but the stock slipped during the event itself, sliding roughly 3 to 4% across June 2 and 3 to around $430.80. The pattern of that sell-off is the agentic audience caught in the act.

The market discounted the spectacle – a next-generation Majorana 2 quantum chip it could not yet price – and bore down on the one figure it could verify and model: performance per watt, where Microsoft claimed a 1.4x gain from running its own MAI models on its own Maia 200 chips. It rewarded what reconciled and marked down what it could not check. That is not how a crowd responds to a great show. It is how a system reads. The agentic audience was no forecast at Build; it was already in the tape, grading the most consequential AI company on earth in real time.

The bull case ran on the same currency. Morgan Stanley’s Keith Weiss argued Wall Street is underestimating the revenue Microsoft’s AI data-center spending will generate – by as much as 91% – and set a $650 price target against a roughly $557 “Strong Buy” consensus. Wedbush’s Daniel Ives, calling the recent OpenAI restructure a net positive, raised his to $575 and noted Microsoft will now collect roughly $6 billion from OpenAI in 2026. Notice what those targets rest on: not vision, but modeled, reconcilable numbers – revenue per megawatt, dollars per token, performance per watt. Even the human analysts are reading like the machines now.

Developers and analysts in the audience at Microsoft Build 2026

How to Build an Investor Narrative for Humans and Machines

None of this retires production. An investor moment is still a human performance. You are still building conviction, sequencing a story, earning a room’s trust in the span of a morning. The agentic audience does not replace that work. It adds a second, colder reading on top of it. A machine that was never in the room now parses everything you build for the people in it. The discipline is producing one event that lands both ways.

What each audience rewards

The two audiences reward different things, which is harder than it sounds. A few of the considerations that change how we build:

The artifact has to carry what the room felt.

The agentic audience reads the transcript, the deck, and the filing. It does not read the lighting, the pause before the big number, or the certainty in a CEO’s voice. A point that landed only on its delivery will not survive the compression into text. The meaning has to live in the record, not only in the performance.

Consistency becomes a headline feature.

The agentic audience scores the reconciliation gaps a human eye forgives. Every figure has to tie across every document a system cross-references – deck, press release, 10-Q, transcript, call. The mismatch a person skims past is the exact thing the model flags. Consistency also runs across time. The reader holds your last several quarters and compares them in seconds. A quietly reclassified segment or redefined metric reads as drift. Flag the change before the model catches it.

The reader scores your tone.

It can’t hear conviction, but it reads the words that carry it. Hedging, heavy qualifiers, and a defensive Q&A answer register as uncertainty. One shaky line can color an otherwise strong quarter. Confident language is not only better for the room. It also scores better on the page.
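To make that concrete, here is a minimal sketch of the crudest possible tone check: counting hedge terms per 100 words. The `HEDGE_TERMS` list and the `hedge_density` function are hypothetical, invented for illustration. Real scoring tools use their own models and lexicons, and nothing here describes how any particular vendor scores a transcript.

```python
import re

# Hypothetical hedge lexicon, for illustration only.
# Real tone-scoring tools use their own (usually much richer) models.
HEDGE_TERMS = (
    "may", "might", "could", "possibly", "somewhat",
    "approximately", "we believe", "we hope", "to some extent",
)


def hedge_density(text: str) -> float:
    """Return hedge-term hits per 100 words (0.0 for empty text).

    Limitation: matching is case-insensitive, so the month "May"
    would be counted as a hedge. A real tool would need to handle that.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    normalized = " ".join(words)
    hits = sum(
        len(re.findall(rf"\b{re.escape(term)}\b", normalized))
        for term in HEDGE_TERMS
    )
    return 100.0 * hits / len(words)


# Two made-up Q&A answers about the same quarter.
confident = "Revenue grew twelve percent and we expect margins to expand next quarter."
hedged = "We believe revenue might possibly grow, and margins could somewhat improve."

print(round(hedge_density(confident), 1))
print(round(hedge_density(hedged), 1))
```

Running a draft script or Q&A prep document through even a rough check like this shows which answers lean hardest on qualifiers before a model reads them.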

Claims have to survive out of context.

A machine cannot infer the setup. The claims you most want understood have to be the easiest ones to extract and still be true. Say a number needs three slides of context to mean what you intend. The summary will keep the number and drop the context.

Proof beats assertion.

A machine can rebuild proof, so proof is what it rewards. Microsoft modeled this on stage. Its strongest demonstrations were the ones the audience could go run that same afternoon. The new coding model reached developers the day it launched, and the models rolled out live rather than “coming soon.” Our five Demo Modes for reading live AI demonstrations need a sixth: Verified Reproducible. Its credibility comes not from how it looked, but from how easily the audience can rebuild it. The IR translation is exact. An analyst – or the analyst’s AI – can rebuild a number from your filings. That number beats the one you ask anyone to take on faith.

Production craft is not exempt from this. If anything, it matters more. This year, Microsoft shrank Build into the workshop-scaled Fort Mason Center. It was a smaller, more exposed room than the stadium productions of years past. The choice traded spectacle for intimacy to rebuild developer trust. A smaller room is harder to produce, not easier – there is nowhere for thin material to hide. But the same instinct serves the AI analyst. A room built on substance rather than spectacle produces a clean record: proof-dense, claim-forward, easy to reconstruct. The production decisions that earn a skeptical room’s trust are, increasingly, the same ones that survive an agentic read.

The essential materials and considerations

A few overlooked materials become the load-bearing part of an investor moment. Each decides whether the story you told the room is the one that travels beyond it:

A clean, accurate transcript – timestamped, speaker-attributed, and corrected against the actual remarks. It is the first document an analyst’s agent opens, and everything downstream inherits its errors.
Data that reconciles to the filings – every figure in the deck, the release, and the script tied to a single source of truth. A reader that cross-checks will trust the numbers that agree and discount the one that drifts.
Consistent naming – one label for each segment, metric, and product, used the same way everywhere. The entity drift a person skims past is glaring to a model tracking you across documents.
Reproducible proof points – for the claims that matter most, show the math, the method, or the source. A figure an analyst can rebuild outperforms one they have to take on faith.
A plain-language version of the story – the equity narrative stated cleanly enough to quote correctly. Supply the summary, or the summarizer will write its own.
Captions and on-screen text that match the words – charts and key slides rendered as text, not pixels. What it cannot read, it cannot quote.
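The reconciliation item on that list can be rehearsed before anything is published. The sketch below assumes you have already pulled each headline figure out of each document by hand. The function name `find_mismatches`, the metric labels, and every number are hypothetical, chosen only to show the shape of the check.

```python
from collections import defaultdict


def find_mismatches(figures, tolerance=0.0):
    """Group (document, metric, value) rows by metric and return
    {metric: {document: value}} for metrics whose values differ
    by more than `tolerance` across documents."""
    by_metric = defaultdict(dict)
    for document, metric, value in figures:
        by_metric[metric][document] = value

    mismatches = {}
    for metric, values in by_metric.items():
        spread = max(values.values()) - min(values.values())
        if spread > tolerance:
            mismatches[metric] = dict(values)
    return mismatches


# Hypothetical figures, entered by hand from each artifact.
figures = [
    ("deck", "cloud_revenue_musd", 1250.0),
    ("press_release", "cloud_revenue_musd", 1250.0),
    ("10-Q", "cloud_revenue_musd", 1248.0),  # the one that drifts
    ("deck", "headcount", 5400),
    ("10-Q", "headcount", 5400),
]

for metric, values in find_mismatches(figures).items():
    print(metric, values)
```

Using one metric label per figure also enforces the consistent-naming item: if the deck and the filing use different names for the same number, the figures never line up under a single key, and that gap is worth catching on its own.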

Read yourself back

The last move is the one most teams skip.
After the event, query the major models and agents the way a covering analyst would. Read how they summarize you. Say they miss the thesis, mangle the guidance, or quote a number out of context. That is not a model problem. It maps exactly where your materials fell short, and you now know what to fix before the next one. The agentic read is no longer something to anticipate. It is something you can test, the same way you would rehearse a run-of-show. That is the difference between a team that talks about the agentic audience and one that produces for it.
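One way to make that read-back repeatable is to keep a short list of the claims you most need a summary to carry, then check each model's summary against it. The sketch below assumes you paste the summary text in yourself. It calls no model API, and the claim names, phrasings, and sample summary are all hypothetical.

```python
def missing_claims(summary: str, expected: dict[str, list[str]]) -> list[str]:
    """Return the names of claims for which none of the accepted
    phrasings appear in the summary (case-insensitive substring match)."""
    haystack = " ".join(summary.lower().split())
    return [
        name
        for name, phrasings in expected.items()
        if not any(phrase.lower() in haystack for phrase in phrasings)
    ]


# Hypothetical claims the team wants every summary to carry.
expected = {
    "growth_guidance": ["12% revenue growth", "revenue growth of 12%"],
    "margin_target": ["30% operating margin", "operating margin of 30%"],
}

# Hypothetical summary text, pasted in from a model's output.
summary = (
    "Management guided to 12% revenue growth and highlighted "
    "new enterprise customers."
)

print(missing_claims(summary, expected))
```

Substring matching is deliberately blunt. A claim that shows up as missing may simply be paraphrased, so treat the output as a list of places to read closely rather than a verdict.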

None of these are particularly glamorous. They are what decides whether the story you told the room is the story that sticks.

This is what we consider when we approach any investor moment, live event, or public production. The room still gets the cinematic, human experience it came for: the narrative arc, the presence, the conviction that moves people. But we make sure that every asset that leaves the building will survive the second, thorough machine reading.

What the Agentic Audience Means for Your Next Investor Moment

It echoes something we explored in Coinbase’s recent work: proving real human craft in a synthetic age. Build 2026 is the other side of the same coin. Coinbase had to prove provenance: that real people made the work. The agentic era asks for accountability: that the claim holds up to scrutiny. Both answer the one question that now defines high-stakes communication. When everyone assumes capability, and software can fake or automate almost anything, how do you generate trust?

The companies that own the agentic era will take the new reader seriously now. They will decide how the world understands them before a model decides for them. The ones that wait will not get a warning. They will simply find their story reaching the analyst pre-summarized. A model they never met will have framed it, in words they did not choose.

Your next investor isn’t only human. The audience now includes a machine that reads first, forgets nothing, and cannot be won over. At Build 2026, Microsoft showed everyone what it looks like by building it. You do not win that room with a better show. You win it with a story built from planning to playback. Both the people deciding and the machine briefing them have to read it the same way. That is the work, and it is worth starting before the reader starts deciding for you.

Frequently Asked Questions

What did Microsoft announce at Build 2026?

Microsoft Build 2026 centered on agentic AI. The headline announcements included seven in-house MAI models, led by the reasoning model MAI-Thinking-1; a new category of autonomous agents called Autopilots, the first of which, Scout, runs on the open-source framework OpenClaw; the Maia 200 AI chip; the Project Solara agent platform; and the Majorana 2 quantum chip.

How is Microsoft Build 2026 relevant to investor relations?

Build 2026 marked the moment the investor question shifted from “do you have AI?” to whether a company can govern, reproduce, and monetize it. For IR teams, it signals that capability is now assumed and proof – of control and return – is the new mandate.

What is the "Agentic Audience"?

The Agentic Audience is the idea that the first reader of your earnings transcript, investor day deck, or keynote is increasingly an AI system that summarizes and sentiment-scores you for the analyst before a human reads a word. Companies now produce investor communication for both humans and the models that brief them.

How did the market react to Microsoft Build 2026?

Microsoft stock fell roughly 3–4% across June 2–3, 2026, to around $430.80. Investors treated the flashiest reveal, the Majorana 2 quantum chip, as a long-term catalyst, and kept pressing on AI capital spending, margins, an expanded FTC probe, and broader macro headwinds.

What else did Microsoft announce at Build 2026?

Beyond its core AI platform, Microsoft Build 2026 introduced Azure HorizonDB, a managed PostgreSQL service for agentic apps; a GPU-accelerated Fabric Data Warehouse for faster analytics; Project Rayfin, a managed backend-as-a-service on Microsoft Fabric that closes the prototype-to-production gap; the MAI-Voice-2 and MAI-Transcribe-1.5 models, the latter covering 43 languages; and Web IQ, a model-agnostic, MCP-native grounding layer that already powers both Microsoft Copilot and ChatGPT.

Thinking about your next investor day?

The agentic era is rewriting what proof looks like in front of investors, and the companies that adapt first will set the standard. If you are planning a high-stakes moment and want to build it to convince both the room and the models reading it, let’s start a conversation.

More Resources

This article has been updated to include insight following the Google I/O 2026 event.

The Production Problem No One Is Talking About

Live AI is the highest-risk recurring moment in any major company’s annual calendar today. And almost no one outside a handful of production teams has a clear framework for how to think about it.

Most of the conversation about AI demos focuses on what the AI does. The harder question is how the production around the demo reinforces its value. The old playbook was designed for deterministic software – scripted, rehearsed, locked down. And, simply put, agentic AI doesn’t work that way.

The product is unpredictable. The demo runs once – with the share price moving in real time. The production team can’t fully constrain the output. And the audience is watching with their finger on the share button. Get it right and you reset the equity narrative. Get it wrong and you become the case study in someone else’s analysis.

Google has been the case study before. In February 2023, a single promotional Bard demo answered one question incorrectly about the James Webb Space Telescope. By the next day’s close, Alphabet had lost roughly $100 billion in market capitalization.

In a few days, Sundar Pichai will walk back on stage at Shoreline Amphitheatre for Google I/O 2026. (The keynote begins at 10:00 AM PT on Tuesday, May 19.) Reports point to a major Gemini model update, a new agentic AI assistant called Gemini Spark, Android 17, and the unveiling of Aluminium OS. Some of these will demo well in a controlled environment. Others will require live, multi-step, agentic AI performance in front of a global audience. Fortunately, the production discipline that distinguishes a credibility-building demo from a market-moving disaster has been developed largely in public, by Google itself, over the past nine years.

And to interpret Google I/O 2026, you have to understand the framework Google has been building since 2018.

Why Live AI Demos Are Structurally Different – and Why That Changes Everything

For decades, corporate live demos followed a simple production logic. You wrote the script, ran the rehearsals, and planned for what could go wrong. A Salesforce dashboard demo. An iPhone software walkthrough. An Adobe Creative Cloud feature reveal. The product behaved predictably because the demo was, in essence, a high-fidelity rehearsal performed live.

Live AI demos break that model in four specific ways.

The model decides its own steps.

Agentic AI works differently. You give the system a goal, and it decides how to get there. The production team can guide the prompt but not the path. What the model does between input and output isn’t fully predictable. That’s both the promise of agentic AI – and the production risk.

The output isn't the same every time.

Even with the same prompt, the model can produce a different response. Sometimes the difference is small. Sometimes it’s significant. Sometimes the answer is just wrong. Until the demo actually runs in front of the audience, the production team can’t know exactly what the model will say.

Agentic demos take time.

A multi-step AI agent might take 45 to 90 seconds to finish a task. On a live stage, 45 seconds of an AI “thinking” is a production crisis. The audience disengages. The camera has nothing to cut to. The speaker has to fill the silence. A traditional software demo can choreograph every second. An AI demo can’t.

Failures are public, fast, and expensive.

A traditional software demo failure is embarrassing. A live AI demo failure is a stock-moving event. The Bard incident wasn’t a fluke – it was the first clear signal of a new category of risk. Since 2023, more than one AI company has watched its share price move on the strength of a single live demo.

For us, these aren’t quirks to manage. They’re a structurally new production category. And they require a fresh approach.

Google chief executive Sundar Pichai speaks during the tech titan’s annual I/O developers conference on May 14, 2024, in Mountain View, California. Google on Tuesday said it would introduce AI-generated answers to online queries made by users in the United States, in one of the biggest updates to its search engine in 25 years. (Photo by Glenn CHAPMAN / AFP)

How Google Built the Framework: A Nine-Year Evolution

No company has confronted the live AI demo problem longer or more publicly than Google. The framework isn’t written down anywhere, but it is present in the production decisions Google has made over the past nine years – including the ones it learned the hard way.

Six moments define that evolution.

2018: The Duplex Phone Call.

At I/O 2018, Google played pre-recorded phone calls in which its Duplex AI assistant booked a hair appointment and called a restaurant. Sundar Pichai introduced the recordings as “the Google Assistant actually calling a real salon.” The AI sounded so human, complete with “um” and “uh” verbal tics, that the demo went viral within hours.

Then came the questions, first raised by Axios: why didn’t the businesses identify themselves when they picked up? Why was there no ambient noise? Were these genuine real-world interactions or carefully staged recordings? Google declined to provide the names of the businesses or confirm whether the calls had been edited. The credibility question also exposed a separate ethics issue: should an AI identify itself as AI when calling a human?

The lesson: When an audience can’t verify whether what they’re seeing is real, the demo creates more doubt than it resolves. Specificity, transparency, and verifiable detail are essential to earning trust and demo credibility.

February 2023: The Bard $100 Billion Day.

Google held a small launch event for Bard featuring a promotional clip in which the AI gave a factually incorrect answer about the James Webb Space Telescope. Astronomers noticed within hours. Alphabet shares dropped about 7.7% the following day, erasing roughly $100 billion in market value.

The lesson: Every public AI output is a public statement. Fact-checking demo content isn’t a marketing task – it’s a risk management one.

December 2023: The Gemini Hands-On Video.

Google released a six-minute video billed as “Hands-on with Gemini: Interacting with multimodal AI.” It appeared to show the model engaging with images, drawings, and a continuous spoken conversation in real time.

Within 24 hours, Bloomberg’s Parmy Olson reported that the demo had been constructed differently than the video implied: Gemini wasn’t responding to spoken voice or live video at all. Google’s team had fed the model still image frames and text prompts, then added the voice narration afterward in post.

Google’s disclaimer in the video description noted only that “latency has been reduced and outputs have been shortened.” The narrative shifted from “Gemini is remarkable” to “Google misrepresented Gemini.”

The lesson: The line between live demo and marketing video has to be transparent. The credibility hit when an edit gets exposed is worse than the lift from a polished demo.

May 2024: Project Astra and the Framework Reset at I/O.

At I/O 2024, Google unveiled Project Astra – its real-time multimodal assistant – with a deliberate production move that broke from prior patterns. The demo aired as two continuous takes, one on a Pixel phone and one on a prototype pair of smart glasses.

The signal to the audience was clear: Google hadn’t cherry-picked the responses. The model was handling a stream of inputs in real time, rough edges and all. Around that demo, Google also began explicitly labeling other AI segments as recorded or aspirational rather than implying everything was live. The change was subtle in execution but marked a significant milestone. Google had stopped trying to make everything feel live and had started telling the audience exactly what they were watching.

The lesson: Labeling the kind of demo you’re showing is the first rule of demo credibility. The audience will forgive almost any production choice if they know what choice you made.

May 2025: Project Astra Ships.

At I/O 2025, Project Astra moved from research demo to shipping product, powering new experiences in Search Live, the Gemini app, and third-party developer tools. The production decision here was as significant as the engineering one.

Having spent 2024 carefully framing Astra as a live, unscripted experience, Google could now invite the audience to use the same capability themselves. The demo and the product had become indistinguishable, which is the highest form of credibility a live AI demo can earn.

The lesson: When the AI demo eventually becomes a product launch, the production discipline that surrounds it becomes the foundation for long-term trust.

May 2026: The Agentic Era Goes Live.

At I/O 2026, Google staged the most agentic-AI-heavy keynote in the event’s history. Gemini Spark – a 24/7 personal AI agent designed to act autonomously across apps, emails, and calendars – was the headline product. Antigravity 2.0 was demoed by showing an operating system that the AI had built from scratch over 12 hours, then demonstrating it running Doom live on stage.

In the Samsung XR glasses demos, presenters used the eyewear in real time to ask Gemini where to meet a friend, order coffee with a tip, and capture photos – rough edges visible, no edits. The doctrine held. No demo failures on the scale of Bard 2023. The framing across the keynote was unusually clean: live demos clearly labeled as live, aspirational segments clearly labeled as future-state.

The lesson: When a company spends nearly a decade building the framework, the production discipline starts to feel native rather than imposed. But clean execution alone isn’t enough anymore – the audience now expects production to also resolve the strategic questions they walked in with.

That’s nine years of drafting the playbook, paid for in public embarrassment, market cap, and corrective communication. It distills into a single principle: the audience doesn’t need the demo to be perfect – but they need to know exactly what kind of demo they’re watching.

Which brings us to the framework.

The 5 Demo Modes of Live AI

Most companies treat a live AI demo as a binary – either it’s live or it isn’t. The actual production reality is a spectrum, and failure in almost every public AI demo controversy comes down to misalignment between which category the audience thought they were watching and which one was actually being staged.

There are five distinct ways to stage a live AI demo. We call them the Demo Modes – a five-category framework for live AI production.

Mode 1: Verified Live

The AI runs in real time during the event. No pre-staging. No predetermined output. The speaker delivers an input, and the audience watches the response unfold in real time. Highest credibility, highest production risk. The two continuous-take Project Astra demos at I/O 2024 were the clearest recent example of Mode 1 done well.

Mode 2: Constrained Live

The AI is running in real time, but inside a controlled environment. The prompts are curated, the use cases are scoped. The model is genuinely working, but the production team has narrowed what it might be asked to do. Mid-high credibility when the framing is transparent, mid risk. Most enterprise software AI demos today are Mode 2 whether the company says so or not.

Mode 3: Pre-flight Live

The AI completed the task minutes or hours before the event. The audience watches the playback of an actual run – including any imperfections – with the speaker explicitly framing it as such: “We ran this just before walking on stage. Here’s what it produced.” Mid credibility when disclosed, low risk. This mode is dramatically under-used. Done well, it captures most of the trust of a live demo while significantly lowering the risk of failing on stage.

Mode 4: Pre-Recorded

A polished video of the AI performing a task, clearly labeled as recorded. Low credibility for capability claims but high credibility for visual production quality. The Gemini Hands-On video would have qualified as Mode 4 if Google had labeled it that way. The controversy emerged because it wasn’t.

Mode 5: Aspirational

Explicitly framed as “what’s possible,” “what we’re building toward,” or “where this is headed.” It’s a preview of where the product is going – not proof of what it can do today. Lowest credibility for capability claims, but useful for setting vision. Google should have framed the Duplex demo at I/O 2018 this way. Instead it was framed ambiguously enough to read as Mode 1.

The Demo Modes aren’t a ranking. They’re a set of choices. A keynote can deliberately mix categories – Verified Live for the headline demonstration, Constrained Live for the enterprise capability, Pre-flight Live for the agentic workflow, Pre-Recorded for the partner integration, Aspirational for the long-term roadmap.

The discipline isn’t picking the “best” mode. The discipline is making sure the audience knows which one they’re watching.
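That discipline can be written down as a simple run-of-show check. The sketch below is one possible encoding, not a standard: the `Segment` fields and the `misaligned` helper are hypothetical names, and the two sample rows restate examples already described above (the Astra takes staged and presented as live, the Gemini video recorded but not labeled as such).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DemoMode(Enum):
    """The five Demo Modes, numbered as in the framework."""
    VERIFIED_LIVE = 1
    CONSTRAINED_LIVE = 2
    PREFLIGHT_LIVE = 3
    PRE_RECORDED = 4
    ASPIRATIONAL = 5


@dataclass
class Segment:
    title: str
    staged_as: DemoMode                # how production actually ran it
    disclosed_as: Optional[DemoMode]   # what the audience was told (None = nothing)


def misaligned(segments):
    """Return titles of segments whose disclosed mode does not match
    the mode they were actually staged in."""
    return [s.title for s in segments if s.disclosed_as is not s.staged_as]


rundown = [
    Segment("Project Astra, I/O 2024", DemoMode.VERIFIED_LIVE, DemoMode.VERIFIED_LIVE),
    Segment("Gemini Hands-On video", DemoMode.PRE_RECORDED, None),
]

print(misaligned(rundown))
```

The check says nothing about which mode is best. It only flags segments where what was staged and what was disclosed disagree, which is the misalignment behind the controversies described earlier.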

What to Watch For at Google I/O 2026

The framework becomes most useful as a real-time reading tool. Here’s how it applies on Tuesday, May 19.

Watch the next-generation Gemini reveal.

Reports point to a major Gemini model update at the keynote – whether labeled Gemini 4 or a 3.x successor. The production question is which mode Google chooses for the headline demonstration. Verified Live (Mode 1) would be the most confident move – signaling that Google trusts the new model to perform outside of a controlled environment. Constrained Live (Mode 2) would be the more cautious choice. If Google frames the demo as anything other than Mode 1 or 2, that’s a signal worth noting.

Watch for Gemini Spark.

Leaks point to a new agentic AI assistant called Gemini Spark – designed to work autonomously across apps, emails, calendars, and websites. Booking flights. Managing email. Filling out forms. This is the highest-risk kind of demo a company can stage today, because every action depends on the one before it, and a single failure cascades through every step that follows. The production decision is whether to demo Spark’s full workflow live (Mode 1 – high risk, high reward), to scope it tightly (Mode 2 – safer, less impressive), or to compress the experience via Pre-flight Live (Mode 3 – the team runs it just before the event and acknowledges it openly). Watch for the speaker’s framing language at the moment of the reveal. If they say “we ran this just before walking out,” that’s Mode 3 done well. If the demo cuts cleanly between steps without acknowledgment, the production team has chosen polish over transparency.

Watch how Google handles failure moments.

Every live AI demo at I/O 2026 will have some friction. Latency. A response that lands awkwardly. A model output that’s correct but visually unimpressive. The production decision is whether to absorb that friction visibly (the Astra 2024 approach) or to edit it out (the pre-2024 approach). The Astra approach is the more mature move. Watch for it.

Watch the segmentation between live and recorded.

I/O 2024 introduced explicit labeling. I/O 2025 refined it. If I/O 2026 makes the live vs. recorded distinction even cleaner – graphics, lower-thirds, verbal framing – that’s Google institutionalizing its own framework. If the line blurs again, that’s a regression worth flagging.

Watch the Cloud and enterprise demos especially.

The most consequential audience at I/O 2026 isn’t the developers at Shoreline. It’s the institutional investors evaluating Google Cloud’s AI revenue trajectory. Pichai disclosed at Cloud Next 2026 that just over half of 2026 ML compute investment will go to the Cloud business. The Cloud demos at I/O have to translate that capex into a credible product story. Watch how those demos are categorized. Constrained Live with enterprise customer logos as visible validation carries weight. Aspirational framing doesn’t.

Anyone who watches I/O 2026 with the 5 Demo Modes framework in hand will likely walk away from the keynote with a deeper understanding of these products and features than a viewer who watches for product news alone.

The Production Decisions That Make or Break a Live AI Demo

The 5 Demo Modes framework names the demo categories, but the execution lives in the production decisions that distinguish one category from another. Four of them carry disproportionate weight.

Speaker preparation has to match the demo mode.

Mode 1 (Verified Live) requires real-time agility. The speaker has to be ready to narrate whatever the model produces – including responses they’ve never seen before. That’s a different kind of prep than walking through a rehearsed click sequence. Pichai’s comfort with live AI moments is a production advantage Google has built over years. Most CEOs aren’t there yet.

The contingency plan is the production.

Every live AI demo needs a written set of fallbacks: if the model produces a problematic response, what does the speaker say next? If latency drags on, where does the camera cut? If the demo fails entirely, how does the show move on without acknowledging it? The audience never sees the contingency. They only see the recovery. The Bard launch failure wasn’t a demo failure – it was a contingency-planning failure. The factual error was visible in promotional materials before the event. Better fact-checking should have caught it.

The camera direction shapes credibility.

When a live AI demo is processing, the camera has to go somewhere. A cut to the speaker carries one signal. A cut to crowd reaction carries another. A cut to a product graphic carries a third. Each choice tells the home audience something different about whether to trust what’s happening. At I/O scale, this requires a director, multiple operators, and pre-planned camera blocking for every demo segment.

Pre-flight runs require real preparation.

Mode 3 demos don’t happen by accident. They require the production team to actually run the demo backstage, capture the output, and have it ready to play back within minutes of the live moment. That’s a second production happening at the same time as the live event. Most companies don’t budget for it. The ones that do have a tool the others don’t.

These decisions aren’t decorative. They’re the difference between a demo that builds credibility and one that costs market cap.

Where AI Demos Are Heading

Google I/O 2026 isn’t an isolated event. It’s the first in a three-week window that includes Microsoft Build (June 2–3) and Apple WWDC (June 8–12). All three companies will stage live AI demos. All three will face the same production decisions. And by the end of June, the industry will have its first complete data set for how the leading public AI companies are navigating the new production risk.

And others are watching them closely. Salesforce Dreamforce in September will stage Agentforce demos. Workday, ServiceNow, Adobe, and every major enterprise software company will demo agentic capabilities at investor moments over the next year. The companies that have a clear live AI demo framework will appear more credible than those with technically superior AI but worse production discipline.

That’s the broader implication. When every public company is staging live AI, the production discipline around the demo becomes part of the equity story itself. Not just for the AI labs – for any company whose narrative depends on showing product capability and evolution.

The 5 Demo Modes aren’t a prescription. Different companies, different audiences, different products will call for different combinations. What every company needs is the vocabulary to make those choices on purpose – not by accident.

Google has been learning that vocabulary in public for eight years. The lesson the rest of the industry has yet to fully absorb is that the question isn’t whether to demo live. It’s whether the production team is ready to handle what happens when you do.

The high-wire act is permanent. The model for walking it is still taking shape. Google I/O 2026 is the next big stage.

That’s the work worth investing in. It’s also the work Cardboard Spaceship builds for clients navigating the moments that matter.

What Google I/O 2026 Actually Staged

Update: This section was added after Google I/O 2026 wrapped to validate our framework against the actual two-day event.

Google I/O 2026 ran from May 19 to May 20. Sundar Pichai walked off the Shoreline Amphitheatre stage having delivered the most agentic-AI-heavy keynote in the event’s history – followed by a Developer Keynote that quietly proposed an architectural overhaul of how the web itself works. Here’s how it tracked against the framework, and what the broader industry should take from it.

Antigravity 2.0 was the boldest production move of the keynote – and the clearest Mode 3 in Google’s history.

Varun Mohan, head of Google’s Antigravity platform, demoed agentic coding by showing how Antigravity and Gemini 3.5 Flash together built a functioning operating system from scratch in 12 hours, using less than $1,000 of tokens. The OS was then demonstrated running Doom live on stage.

This was a textbook Mode 3 (Pre-flight Live): the AI did the actual work autonomously in the hours before the event, and the audience saw the genuine output. The catch: Google didn’t visually communicate the Pre-flight Live nature of the demo as clearly as the framework would prescribe. The 12-hour reality was disclosed verbally but compressed into a moment that read closer to Mode 1 in the audience’s mind. The most impressive demo of the keynote and the most under-framed production move – at the same time.

Gemini Spark was demoed in Mode 2 (Constrained Live).

Josh Woodward took the stage to show Spark planning a block party – coordinating schedules, permits, and calendar integrations through tightly scoped prompts on an iPhone. The model worked in real time, the prompts were curated, the use case was defined. This was the right production decision for a brand-new product with broad cross-app permissions.

Spark is genuinely high-risk to demo because every action chains to the next. Constrained Live limits that chain to a deliberate set of steps without sacrificing the live energy.

The Samsung XR glasses demos went Mode 1 (Verified Live).

Presenters on stage used the glasses to ask Gemini where to meet a friend, to order coffee, and to capture photos – with rough edges left in. This was the production choice closest to the Astra 2024 approach. The friction wasn’t hidden. The audience saw the model working in real time, sometimes imperfectly, and trusted what they saw more for it.

Hassabis closed the keynote in Mode 5 (Aspirational).

Demis Hassabis’s “AGI is now on the horizon” framing was explicitly labeled as future-state – not current product. This is exactly how Mode 5 should work. The audience knows they’re being shown a vision, not a capability. No credibility cost. No expectation mismatch.

The bigger story arrived in the Developer Keynote.

Day 1 afternoon brought the announcements with the longest-tail production implications: WebMCP, an open web standard for AI agents; Chrome DevTools for agents as a stable 1.0 release; HTML-in-Canvas; Modern Web Guidance; Android CLI; Android Bench. The framing in Google’s own keynote recap: “We’ve transitioned from AI that simply assists you, to agents that can independently navigate complex tasks across your entire workflow.” This is the bet that recasts every live AI demo from this point forward. Every demo is now also a demo of the agentic web thesis – and the production stakes have just compounded.

The doctrine held. The market read it anyway.

No demo failures on the scale of Bard 2023, no edited-video controversies, no credibility leaks. The framing across the keynote was unusually clean: live demos clearly labeled as live, aspirational segments clearly labeled as future-state. And yet Alphabet’s stock slid during the keynote. The next morning, BofA reaffirmed Alphabet at a $430 price target, Wells Fargo raised its target to $435, and Morgan Stanley called out the “agentic offerings across commerce, travel and daily life.”

So the picture is nuanced: the demos themselves didn’t fail, but the production didn’t sufficiently answer the question Wall Street walked in with – how AI Mode in Search will be monetized when 93% of those searches already end without an external click. The lesson is sharper than “live demos move markets.” It’s that production decisions are now responsible for resolving the audience’s open questions, not just demonstrating the product. The Bard-era risk was that a live demo could break the equity story. The new risk is that even a clean live demo isn’t enough.

For the broader industry, the next test cases arrive in two weeks.

Microsoft Build (June 2–3) and Apple WWDC (June 8–12) will stage their own live AI demos – and their own answers to the agentic web thesis Google just planted. Both companies have learned from Google’s eight-year public arc. By the end of June, the industry will have its first complete data set for how the leading public AI companies are handling not just the production risk of live demos, but the production responsibility of resolving institutional questions in real time.

Watch which Modes they choose. The framework still applies. The stakes just got higher.

Planning a live AI demo at your next high-stakes event?

The most consequential moments in modern corporate communications now run on live AI. Whether you’re preparing for an Investor Day, a product launch, a developer event, or an investor moment that includes an agentic demonstration, the production decisions you make now will define how the market reads your capability when the moment arrives.

Resources

CNN Business: Google shares lose $100 billion after AI chatbot makes error during demo — February 2023
NPR: Google’s AI chatbot, Bard, sparks a $100 billion loss in Alphabet shares — February 2023
TechCrunch: Google’s best Gemini demo was faked — December 2023
Engadget: Google admits that a Gemini AI demo video was staged — December 2023
TechCrunch: Duplex shows Google failing at ethical and creative AI design — May 2018
Axios: What Google isn’t telling us about its AI demo — May 2018
TechCrunch: Google’s Gemini updates: How Project Astra is powering some of I/O’s big reveals — May 2024
TechCrunch: Project Astra comes to Google Search, Gemini, and developers — May 2025
Google Blog: Alphabet Q1 2026 earnings remarks — April 2026
Google Blog: Sundar Pichai shares news from Google Cloud Next 2026 — April 2026
Google I/O 2026 Official Site