The Deepdive
Join Allan and Ida as they dive deep into the world of tech, unpacking the latest trends, innovations, and disruptions in an engaging, thought-provoking conversation. Whether you’re a tech enthusiast or just curious about how technology shapes our world, The Deepdive is your go-to podcast for insightful analysis and passionate discussion.
Tune in for fresh perspectives, dynamic debates, and the tech talk you didn’t know you needed!
Read the companion article on https://medium.com/@allanandida
Google I/O 26: Welcome to the Age of Glorified AI Interns
Google just spent an hour telling us AI agents will run our lives, and somehow it still sounds like a very overqualified intern with push notifications.
At Google I/O 26, Gemini 3.5, Omni and Spark were pitched as the start of an “agentic era” where background bots quietly schedule your day, rewrite your inbox and shop on your behalf while you do more “important” things. In this episode, we unpack what Google actually shipped – from Daily Brief, Universal Cart and voice‑driven Gmail to Android XR glasses – and ask whether these agents are true coworkers or just glorified task rabbits wrapped in trillion‑token branding. Along the way, we look at what this means for users, developers and the open AI ecosystem when one company decides its interns now live inside your calendar, browser and credit card.
Leave your thoughts in the comments and subscribe for more tech updates and reviews.
A Number That Breaks Your Brain
IdaI want you to just um picture something for a second. You're standing there, maybe you've got your morning coffee in hand, and someone casually drops this number on you. 3.2 quadrillion.
AllanI mean, it genuinely sounds like a number a little kid invents, right? Like when they're losing a playground argument. Well, I have 3.2 quadrillion invisible force fields.
IdaYes. I thought the exact same thing, but this isn't a playground argument. This is, you know, Google CEO Sundar Pichai standing under the bright lights at the 2026 I/O keynote. Oh yeah. And that number, 3.2 quadrillion, is the number of data tokens that Google's AI models are processing every single month. And a token is essentially the fundamental building block of a problem being solved.
AllanWe really have to put that in historical context to understand the sheer violence of that scale. Just 12 months prior, that number was 480 trillion.
IdaWhich we thought was huge.
AllanExactly. We thought 480 trillion was this incomprehensible deluge of computation. But we are looking at a seven-fold jump in a single year. It represents a physical infrastructure shift that is, frankly, difficult to even map in your head. The server farms, the cooling, the silicon required to churn through quadrillions of tokens asynchronously. It's a completely new paradigm of compute.
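As a rough check on that jump, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative, and the two figures are simply the ones quoted in the conversation, not independently verified.

```python
# Monthly token counts as quoted in the conversation (not independently verified).
tokens_last_year = 480 * 10**12      # 480 trillion
tokens_this_year = 3_200 * 10**12    # 3.2 quadrillion

# Year-over-year growth factor.
growth = tokens_this_year / tokens_last_year
print(f"Year-over-year growth: {growth:.2f}x")
```

The ratio works out to a little under 6.7, so "seven-fold" is a fair rounding.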
IdaSo we are drowning in computation. And the natural question you have to ask is, well, what are we actually doing with it?
AllanRight. Are we solving the mysteries of the universe?
IdaI mean, during the keynote, they briefly touched on predicting category five hurricanes and, you know, modeling complex proteins. But when you look at the actual consumer demos for this new agentic Gemini era, it's completely different.
AllanIt really is.
IdaWe are primarily using this god-like frontier intelligence to email dog kennels and organize neighborhood block parties.
AllanThe juxtaposition is wild. And that's exactly our mission for you on this deep dive today. We are exploring the architecture and the implications of this newly inaugurated agentic era.
IdaRight.
Why Agentic AI Changes Everything
AllanThe core theme that emerged from every single demo is that Google has essentially decided human decision making, even the tiny micro decisions, is just too exhausting.
IdaYeah, we're just done making choices.
AllanExactly. They are building an infrastructure designed to let you completely outsource your cognitive load to the cloud. We are shifting from asking AI to generate text to letting AI execute complex multi-step actions on our behalf.
Gemini Spark Lives In The Cloud
IdaOkay, let's unpack this because if we are burning through quadrillions of tokens, the architecture here has to be fundamentally different than just like a chatbot on my phone.
AllanCompletely different, yeah.
IdaAnd that brings us to the star of their show, Gemini Spark. They are billing it as a 24-7 personal AI agent. But what actually makes it agentic? I mean, why couldn't my phone's local processor just handle a to-do list?
AllanWell, your phone's processor is brilliant for immediate localized tasks. But Spark doesn't live on your phone. It runs on dedicated virtual machines in Google Cloud.
IdaOkay, so it's entirely off-device.
AllanRight. And think of a virtual machine not as physical hardware, but as a simulated, self-contained computer running within Google's massive server farms. By decoupling the agent from your local hardware, it gains persistent statefulness.
IdaMeaning it doesn't sleep just because my phone battery died?
AllanPrecisely. You can assign Spark a complex multi-day task, completely close your laptop, and just walk away. The virtual machine keeps the agent active in the background. That's wild. It's constantly monitoring data streams, waiting for conditions to be met, and executing actions asynchronously.
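The "waiting for conditions to be met" idea can be illustrated with a small sketch. This is not how Spark is implemented; `CloudAgent`, `WatchTask`, and the snapshot fields are hypothetical names invented for the example, and the data stream is simulated. The point is only that the loop runs on the server side and never consults the user's device.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WatchTask:
    """A long-running task: run `action` once `condition` becomes true."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], str]
    done: bool = False


@dataclass
class CloudAgent:
    """Hypothetical stand-in for an agent hosted on a server-side VM."""
    tasks: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def tick(self, snapshot: dict) -> None:
        # Called for each new data snapshot, whether or not a client is connected.
        for task in self.tasks:
            if not task.done and task.condition(snapshot):
                self.log.append(task.action(snapshot))
                task.done = True  # each task fires at most once


agent = CloudAgent()
agent.tasks.append(WatchTask(
    name="remind-neighbors",
    condition=lambda s: s["days_to_party"] <= 3 and s["pending_rsvps"] > 0,
    action=lambda s: f"sent {s['pending_rsvps']} reminder(s)",
))

# Simulated data stream; the user's laptop plays no part in this loop.
for snapshot in [
    {"days_to_party": 7, "pending_rsvps": 4},
    {"days_to_party": 5, "pending_rsvps": 2},
    {"days_to_party": 3, "pending_rsvps": 2},
    {"days_to_party": 2, "pending_rsvps": 2},
]:
    agent.tick(snapshot)

print(agent.log)
```

The task fires on the third snapshot and stays quiet afterwards, which is the "assign it and walk away" behavior described in the conversation.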
The Block Party That Runs Itself
IdaWhich they demonstrated with perhaps the most painfully relatable modern chore in existence, planning a neighborhood block party.
AllanOh, that demo, yes.
IdaIn the live demo, the presenter essentially brain-dumped a vague mandate onto Spark. They didn't code a workflow, they just told it to handle the party. And the AI autonomously spun up a live Google Sheet to track incoming RSVPs.
AllanBut it didn't just passively read the sheet. It used a mechanism called retrieval augmented generation to monitor incoming Gmail replies.
IdaRight. It actually read the emails.
AllanYes. And it extracted the unstructured intent of the neighbor's email, whether they were a yes, a no, or a maybe if I could find a babysitter, and then it updated the structured rows and columns of the spreadsheet dynamically.
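The step from free-form replies to spreadsheet rows can be sketched as follows. This is a toy: `classify_rsvp` is a hypothetical keyword matcher standing in for the language model, the email addresses and bodies are invented, and no retrieval or Gmail or Sheets API is involved.

```python
import re


def classify_rsvp(body: str) -> str:
    """Toy keyword classifier; a real agent would use a language model here."""
    text = body.lower()
    # Check hedged replies first so "maybe, if I can..." is not read as a yes.
    if re.search(r"\b(maybe|if i can|not sure|might)\b", text):
        return "maybe"
    if re.search(r"\b(can't|cannot|won't|sorry|no)\b", text):
        return "no"
    if re.search(r"\b(yes|count me in|we'll be there|see you)\b", text):
        return "yes"
    return "unknown"


# Invented sample replies (unstructured text).
emails = [
    ("pat@example.com", "Yes! We'll be there with the kids."),
    ("sam@example.com", "Sorry, we can't make it this time."),
    ("lee@example.com", "Maybe, if I can find a babysitter."),
]

# Structured rows, one per guest, ready to be written to a sheet.
sheet = [{"guest": sender, "status": classify_rsvp(body)} for sender, body in emails]

for row in sheet:
    print(row)
```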
IdaIt then drafted polite reminder emails to the neighbors who hadn't responded yet. And it even generated a slide deck to hype up a bounce house for the kids.
AllanThe bounce house?
IdaYes. But the detail that absolutely broke me, the one that really highlights the reasoning capability, was the HOA rule.
AllanThe agent's situational awareness there was incredible.
IdaUnbelievable. The AI autonomously dug into the user's Google Drive, searched through a massive folder of boring documents, found the homeowners association bylaws, and read them.
AllanAnd then stopped them from breaking the rule.
IdaExactly. It actively alerted the user that they were legally prohibited from setting up the bounce house in the cul-de-sac before Friday afternoon.
AllanWhat does this say about us as a society? We used to worry AI would steal our jobs, now we're begging it to navigate the passive aggressive politics of our homeowners association.
IdaIt's the ultimate fantasy of avoiding social friction. You are literally outsourcing the anxiety of neighborhood politics to a virtual machine.
AllanIt really is.
Voice Control Meets Messy Desktops
IdaAnd Google is weaving this lack of friction deep into the operating system level, which we saw with the new Gemini app for Mac. The voice command demo for the Mac integration completely changes how we interact with local files. The presenter is just looking at their messy desktop.
AllanLike most of our desktops.
IdaRight, exactly. They use their mouse to highlight a bunch of random unstructured files, a couple of PDFs, some images of vet invoices. They hold down a function key and just start talking naturally.
AllanThey asked Gemini to draft an introductory email to a dog kennel for their two dogs. One was named Hank.
IdaAnd the other was named Lou Cinnamon, which is objectively a top-tier name for a dog.
AllanTruly the best name.
IdaBut beyond the names, think about the mechanics of what the AI just did. It took static image files and messy PDFs, ran a localized vision model to extract the text, and used semantic reasoning to understand what was a vaccine date versus what was a random phone number.
AllanAnd then it just built that table.
IdaYeah, it structured all of Hank and Lou Cinnamon's allergies into a perfectly formatted HTML table inside an outgoing email draft.
AllanAll in seconds. That process is completely eradicated.
IdaIt's just gone.
AllanBut to accomplish these seamless behind-the-scenes tasks on a larger scale, the AI has to do more than just read files. It is fundamentally changing the fabric of the internet itself.
IdaOh, for sure.
Search Stops Linking And Starts Building
AllanWe are moving away from finding information to demanding the web recompile itself to serve our immediate needs.
IdaThis was the moment in the keynote where I realized the internet as we know it, you know, the classic 10 blue links on a search page is dead.
AllanCompletely dead.
IdaThe traditional search box is being replaced by an intelligent search box. You aren't just typing text anymore, you are dropping in files, videos, or even open Chrome tabs as direct inputs.
AllanGoogle is calling the underlying engine for this Antigravity 2.0. It's a developer tool powered by their Gemini 3.5 Flash model, and it introduces agentic coding directly into the consumer search experience.
IdaSo it's coding for you.
AllanExactly. When you ask a complex question now, the search engine doesn't go look for a webpage that holds the answer. It acts like a senior software engineer and literally builds a custom application, complete with a user interface right there in the results.
IdaThe astrophysics example they showed perfectly illustrates this. A user asks how binary black holes create gravitational waves.
AllanA pretty heavy question.
IdaYeah. In the old internet, you get a Wikipedia link. In the Antigravity 2.0 internet, search dynamically codes and deploys a fully interactive, custom 3D simulation of black holes spiraling into each other. You have actual sliders to adjust the mass and the orbital separation.
AllanLet's just break down the technical marvel of what is happening under the hood there. The AI isn't pulling a pre-made widget from a library.
IdaRight. It's building it from scratch.
AllanThe Gemini model is interpreting your prompt, planning a software architecture, writing the underlying physics logic, coding the front-end graphical interface, compiling it, and deploying it inside a secure sandbox container on your browser.
IdaIn milliseconds.
AllanIt is writing bespoke software for your highly specific question in milliseconds. It's unbelievable.
IdaAnd it does this for deeply personal queries, too. Another demo showed someone planning a weekend trip for their family. Search utilized something called personal intelligence to securely parse data from the user's Gmail and calendar.
AllanAnd then it coded a persistent mini app for the weekend.
IdaRight. It wasn't a static itinerary, it was a custom dashboard tracking live restaurant reservations. It even integrated chess tutorials and activities because it deduced from earlier emails that the oldest kid was learning to play chess.
AllanBuilding a stateful personalized application on the fly is impressive enough. But to truly prove the raw, unadulterated power of Antigravity 2.0 and the Gemini 3.5 Flash model, the Google engineering team did something completely unhinged during the developer portion of the keynote.
IdaOh, the absolute most absurd tech flex I have ever witnessed.
AllanJust so over the top.
Ninety-Three Agents Code An OS
IdaThey wanted to prove how good these multi-agent systems are at coding. So they unleashed 93 autonomous AI subagents and gave them a single prompt. Build a fully functional computer operating system entirely from scratch.
AllanWe need to be clear about the difficulty of that prompt. Building an operating system is an incredibly complex orchestration of memory management, file systems, kernel architecture, and hardware abstraction.
IdaIt's not a weekend project.
AllanNo, it usually takes a team of human engineers months, if not years.
IdaAnd these 93 agents worked in parallel for 12 hours. The architecture of this is mind-blowing. Think of it like a highly specialized automated construction crew.
AllanOh, that's a good way to put it.
IdaYou have an architect agent designing the kernel, a plumber agent handling the memory allocation, and a supervisor agent checking for merge conflicts in the code. They are constantly talking to each other, writing code, testing it, failing, and iterating.
AllanAnd the numbers are staggering.
IdaThey processed 2.6 billion tokens in that 12-hour window. They wrote every single line of code, and the total compute cost was less than $1,000 in API credits.
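Those figures imply a throughput and a price ceiling that are easy to work out. The sketch below uses only the numbers quoted in the conversation; because the cost was given as "less than $1,000", the per-token price is an upper bound, and the even split across agents is an assumption.

```python
total_tokens = 2_600_000_000   # 2.6 billion, as quoted
agents = 93
hours = 12
max_cost_usd = 1_000           # "less than $1,000", so treat as an upper bound

# Average throughput, assuming the work was spread evenly across agents.
tokens_per_agent_hour = total_tokens / (agents * hours)

# Upper bound on the blended price per million tokens.
max_usd_per_million = max_cost_usd / (total_tokens / 1_000_000)

print(f"{tokens_per_agent_hour:,.0f} tokens per agent per hour")
print(f"under ${max_usd_per_million:.3f} per million tokens")
```

That is roughly 2.3 million tokens per agent per hour, at a blended price below about 38 cents per million tokens.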
AllanAnd the ultimate payoff for this monumental feat of multi-agent software engineering.
IdaThey booted up this bespoke AI-generated operating system live on stage just to play the classic 1993 video game Doom. This is simultaneously impressive and completely ridiculous. It's like asking a librarian to help you find a book, and instead, they build a custom printing press in front of you.
AllanIt is the ultimate flex of raw capability.
IdaIt proves that you essentially drop items into this persistent cart from wherever you happen to be browsing.
A Universal Cart With Real Reasoning
IdaOnce an item is in there, the cart becomes this tireless background bargain hunter. It never sleeps. Exactly. It continuously queries merchant APIs to track price histories, monitor for restocks, and hunt for flash deals.
AllanBut the critical evolution here is that the cart acts as a highly analytical financial chaperone. It possesses semantic reasoning about the physical properties of the products themselves.
IdaThe PC building example they used to demonstrate this was brilliant. Let's say you are building a custom gaming rig. You add a high-end processor to your universal cart from a tech blog. Okay. But three days ago, you had added a specific motherboard from a YouTube review. The AI cart proactively flags a warning. It recognizes that the processor requires a different physical socket type than the motherboard you already selected.
AllanThat is so helpful.
IdaIt stops you from buying incompatible parts and dynamically suggests alternatives that fit your exact bill.
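The socket check described here reduces to comparing one attribute across the items in the cart. The sketch below is a minimal illustration under that assumption; `Part`, `socket_conflicts`, and the product names are hypothetical, and a real cart would need to extract the socket type from product listings first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    kind: str      # "cpu" or "motherboard"
    socket: str    # e.g. "AM5" or "LGA1700"


def socket_conflicts(cart: list[Part]) -> list[str]:
    """Return a warning for every CPU/motherboard pair with mismatched sockets."""
    cpus = [p for p in cart if p.kind == "cpu"]
    boards = [p for p in cart if p.kind == "motherboard"]
    return [
        f"{cpu.name} ({cpu.socket}) does not fit {board.name} ({board.socket})"
        for cpu in cpus
        for board in boards
        if cpu.socket != board.socket
    ]


# Hypothetical cart: the board was added days ago, the CPU just now.
cart = [
    Part("Example Board X", "motherboard", "LGA1700"),
    Part("Example CPU Y", "cpu", "AM5"),
]

warnings = socket_conflicts(cart)
for w in warnings:
    print("Warning:", w)
```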
AllanIt understands the spatial and physical compatibility constraints of consumer goods. And to cross the final hurdle, making the actual purchasing completely seamless, Google introduced the Agent Payments Protocol, or AP2.
IdaThe AP2 protocol is where we cross firmly into sci-fi territory. This protocol allows your AI agent to securely execute financial transactions on your behalf without you ever clicking a checkout button.
AllanWhich sounds terrifying.
IdaMy immediate thought was how do I know it won't just empty my bank account on random gadgets?
AllanBecause AP2 isn't just handing your credit card number to a chatbot. It utilizes cryptographic tokenized parameters. You set strict, immutable boundaries. For instance, you tell the agent you only want a specific brand of monitor and your hard spend limit is $400.
IdaSo it's locked in.
AllanRight. The agent generates a single-use token bound by those exact smart contract-like constraints. When it finds a deal that matches, it initiates a handshake with the merchant processor. And if they try to overcharge, if the merchant tries to charge $401 or swap the brand, the transaction mathematically fails. It creates a tamper-proof digital paper trail that links your constraints, the merchant, and the payment processor so there are no disputes.
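The idea of a single-use token bound to a brand and a spend cap can be sketched with a signed payload. This is not the AP2 wire format or its actual cryptography; it is a simplified illustration using a shared-secret HMAC from Python's standard library, and `issue_token`, `authorize`, the key, and the brand names are all hypothetical.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # hypothetical; a real system would use managed keys


def _sign(terms: dict) -> str:
    payload = json.dumps(terms, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_token(brand: str, max_cents: int, nonce: str) -> dict:
    """Bind the user's constraints into a signed, single-use token."""
    terms = {"brand": brand, "max_cents": max_cents, "nonce": nonce}
    return {"terms": terms, "sig": _sign(terms)}


_used_nonces: set[str] = set()


def authorize(token: dict, brand: str, amount_cents: int) -> bool:
    """Approve a charge only if it satisfies the signed constraints."""
    if not hmac.compare_digest(_sign(token["terms"]), token["sig"]):
        return False  # terms were altered after signing
    terms = token["terms"]
    if terms["nonce"] in _used_nonces:
        return False  # token already spent
    if brand != terms["brand"] or amount_cents > terms["max_cents"]:
        return False  # wrong brand or over the cap
    _used_nonces.add(terms["nonce"])
    return True


token = issue_token("Acme", 40_000, "n-001")  # $400.00 cap
tampered = {"terms": {**token["terms"], "max_cents": 99_999}, "sig": token["sig"]}

results = [
    authorize(tampered, "Acme", 45_000),  # raised cap, stale signature
    authorize(token, "Acme", 40_100),     # $401.00, over the cap
    authorize(token, "Other", 39_900),    # swapped brand
    authorize(token, "Acme", 39_900),     # within the constraints
    authorize(token, "Acme", 39_900),     # second use of the same token
]
print(results)
```

Only the fourth attempt succeeds: the charge for $401, the swapped brand, the altered cap, and the replay are all rejected.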
IdaOkay, but here's the thing. The AI is smart enough to know you're buying the wrong PC parts, but presumably polite enough not to mention you don't actually need another gaming rig. How long until my AI is just negotiating with your AI to buy things neither of us wants?
AllanIt's a highly valid concern regarding induced demand. We are handing these systems an unprecedented level of agency over our wallets.
IdaWe really are.
AllanBut Google's core bet is that the sheer convenience, the complete removal of the physical exhaustion of cross-referencing motherboard compatibility sheets and hunting for discount codes, will easily override our skepticism.
Smart Glasses That Translate The World
IdaSo far, this automated, frictionless existence has been trapped behind glass. We are looking at laptops, phones, and desktop monitors. But Google wants to take this environmental intelligence and integrate it directly onto our faces, which brings us to the hardware: the new Android XR smart glasses.
AllanHistorically, smart glasses have faced a massive uphill battle regarding public perception.
IdaBecause nobody wants to walk down the street looking like a rogue Borg drone with a glowing camera strapped to their temple. The cyborg aesthetic is deeply off-putting.
AllanGoogle is hyper-aware of that stigma, which is why their hardware strategy this time is entirely dependent on partnerships with top-tier fashion eyewear brands. They handed the external chassis design over to Warby Parker and Gentle Monster.
IdaSo they actually look like high-end stylish fashion accessories. You wouldn't know they were tech devices at a glance. Not at all. And to further reduce the friction of adoption, the initial rollout this fall only features audio glasses. There is no visual display, no augmented reality holograms projecting into your retinas.
AllanThey feature high-resolution onboard cameras so the multimodal AI can process your visual environment, but the output is delivered via directional spatial audio.
IdaIt's just sound.
AllanIt is literally Gemini whispering directly into your ear, providing constant hands-free context about the world around you without ever demanding you look down at a screen.
IdaThe live, hands-on test of the spatial audio was staggering. A journalist used the XR glasses to facilitate a highly complex three-way conversation in a crowded room. You had a Spanish speaker, a Serbian speaker, and an English speaker.
AllanAll talking at once.
IdaYeah. The glasses seamlessly captured the audio, processed the translation via the cloud, and whispered the real-time English translation into the journalist's ear.
AllanThe technical hurdle there isn't just translation, it's acoustic isolation. The AI demonstrated intense situational awareness. By utilizing vocal footprint isolation and directional mics, the agent adeptly ignored the background chatter of other English speakers in the room.
IdaThat's the crazy part to me.
AllanIt knew exactly which voices were part of the targeted conversation and dynamically filtered out the ambient noise.
IdaIt's essentially the Babel fish from The Hitchhiker's Guide to the Galaxy, but styled by a high-end Korean fashion house. And the glasses aren't just for passive observation. In another onstage demo, they showed a woman using the glasses to navigate the physical world and trigger digital actions simultaneously. She asks Gemini for walking directions to a local coffee shop.
AllanAnd here's where the API integrations really shine. While she's walking, the glasses, which are wirelessly tethered to her phone, autonomously utilize deep links to open the DoorDash app sitting dormant in her pocket.
IdaThe AI navigates the hidden architecture of the phone app she isn't even looking at and places her usual order for a nitro cold brew so that the transaction is completed and the coffee is waiting on the counter the second she arrives.
AllanIt is the ultimate manifestation of their goal: removing the friction of interacting with both digital interfaces and physical environments. But of course, because this is a Silicon Valley tech demo, they couldn't resist throwing in something profoundly bizarre.
IdaOh, the blimp. I almost forgot about the blimp. So she gets her nitro cold brew, looks out at the live audience in the auditorium, and asks Gemini to use the Nano Banana image model.
AllanNaturally.
IdaThe onboard cameras on the glasses take a photo of the crowd. The AI processes the image, hallucinates a massive cartoon blimp floating in the sky above the audience, and instantly sends that newly generated image to her paired smartwatch. Wait, it gets better. We finally have high-end Korean fashion glasses that can instantly translate Serbian, and we use them to hallucinate cartoon blimps. I love that this exists, but also why?
AllanBecause they possess the compute power to do it. And they want developers to know the image models can run with near zero latency based on live camera feeds.
IdaI guess that makes sense.
When Friction Disappears, What’s Left
AllanBut your question of why actually perfectly highlights the core tension of this entire deep dive. If we zoom out and synthesize the sheer scale of everything we've unpacked today, we have to circle back to that breathtaking number from the intro. 3.2 quadrillion tokens.
Ida3.2 quadrillion units of thought, of logical deduction, of multi-agent coordination, churning away in the cloud every single month.
AllanWe are harnessing an unprecedented scale of computational physics to completely remove the friction from the most mundane, deeply human tasks. We are stepping into a paradigm where software engineers don't write code, the agents do.
IdaRight.
AllanThe shopping cart catches your physical compatibility mistakes. Your glasses translate the world and buy your coffee before your brain even fully registers the desire.
IdaIt's an invisible, hyper-competent infrastructure. We have unlocked the power to simulate the gravitational waves of binary black holes, and we are aggressively deploying it to figure out if our neighbor's kid's allergy allows for peanut butter at the weekend block party. It is gloriously, wonderfully absurd.
AllanIt is absurd, but it is undeniably beautiful in its utility. We are stepping away from the keyboard entirely and letting the machines negotiate with the machines.
IdaWhich leaves us with a lingering provocative thought for you to mull over as you go about your day. If the AI handles our scheduling, our shopping, our HOA bylaws, and our coffee runs, what happens to the social friction that actually forces us to connect?
AllanThat is the real question.
IdaThink about it. The messy friction of navigating someone else's schedule, deciphering a confusing menu in a foreign language, or accidentally buying the wrong PC part and having to ask a friend for help? Those tiny points of friction are often where serendipity and human empathy happen.
AllanYeah, that's where life happens.
IdaIf machines are doing all the talking, all the organizing, and all the apologizing for us, do we risk letting our own capacity for grace and human-to-human relationship building atrophy? When the friction is gone, do we lose the spark that makes the interactions meaningful in the first place?
AllanIt is a profound question. If the agents are perfect, we lose the beautiful vulnerability of making mistakes together.
IdaIt's definitely a lot to think about. But for now, we'll let you get back to your frictionless reality. Hopefully, without any cartoon blimps blocking your view. Thanks for joining us on this deep dive tailored just for you. Until next time.