The Deepdive

Apple's AI Fumble: The Siri Crisis

Allen & Ida Season 2 Episode 5

The battle for AI supremacy has a fascinating outlier - Apple Intelligence. While competitors race to deploy increasingly powerful models, Apple's deliberate approach prioritizes privacy, security, and polish over raw speed. This strategic tension sits at the heart of today's AI landscape.

Apple's sophisticated machine learning architecture includes on-device Foundation Models running directly on your iPhone and iPad alongside server-based counterparts handling more complex tasks through innovative "Private Cloud Compute" infrastructure. What makes this approach revolutionary is Apple's unwavering commitment to privacy: "We do not use our users' private personal data or user interactions when training our foundation models." This philosophy creates both challenges and opportunities in AI development.

The technical implementation reveals impressive engineering solutions like LoRA adapters - tiny specialized modules that allow the main AI to instantly adapt to hundreds of different tasks without consuming significant storage space. Their quantization techniques compress models to less than four bits per weight on average while maintaining performance through specialized recovery mechanisms. This balancing act between computational constraints and AI capabilities represents Apple's distinctive approach to bringing intelligence to personal devices.

Yet a fascinating disconnect emerges between Apple's internal benchmarks (which show their models outperforming competitors in certain tasks) and public perception, particularly regarding Siri. Online discussions reveal both defenders of Apple's methodical approach and critics frustrated by perceived delays and unfulfilled promises. This tension raises profound questions about AI development philosophy: Is perfection worth waiting for, or does being first to market with evolving capabilities matter more? As this AI revolution continues unfolding, Apple's cautious strategy will either prove visionary or costly in a landscape where user expectations evolve as rapidly as the technology itself.

Subscribe to dive deeper into how technology shapes our world and discover whether Apple's distinctive AI approach will ultimately triumph in this rapidly evolving landscape.

Leave your thoughts in the comments and subscribe for more tech updates and reviews.

Speaker 1:

Hey there, curious minds, welcome back to the Deep Dive. We take that big stack of info and hopefully give you those aha moments. Today we're diving into something sparking a lot of chat, maybe even some strong opinions: Apple Intelligence. We've got stuff from dense tech papers all the way to, you know, pretty lively debates online, like on Reddit. It's all about Apple's big new personal intelligence system. So our mission: cut through the buzz, the hype, maybe even some of the frustration, to get you the real story on what Apple's building, how they're doing it and, well, is their careful approach the right play in this crazy fast world of AI?

Speaker 2:

That's right. We're going to unpack the models Apple's built, look at their really rigorous methods for training and safety, and then we'll connect that techie stuff back to, you know, what it actually means for users. And, yeah, that intense debate around Siri getting its big upgrade, or maybe a long overdue upgrade. What's really interesting is seeing Apple, famous for polished launches, navigate this super fast, sometimes messy AI frontier. Speed versus perfection, right? Okay, let's get into Apple Intelligence. Huge announcement, talking AI tools fine-tuned for everyday stuff: making your writing better, handling notifications smartly, summarizing long emails or articles, even creating fun images and, this is key, actually doing things within your apps to simplify tasks. The sources mention a few key models. There's AFM on-device. That's a smaller, efficient one running right on your iPhone or iPad for privacy and speed. Then there's AFM server, a bigger model for more complex stuff handled securely in the cloud, and they've got specialized ones too, like for coding in Xcode, and a diffusion model for, you know, images in Messages and stuff. It's meant to be everywhere.

Speaker 1:

So for us users, what's the deal? Apple always bangs the drum on values like privacy. It's kind of their thing. How does that square with AI? Because AI usually means tons of data, right? Is that privacy focus almost a disadvantage for them in this AI race?

Speaker 2:

That's a really important point, a critical one, actually. The documents are super clear. Responsible AI principles guide everything from design to training to testing. And, yeah, privacy is central. They state very clearly: we do not use our users' private personal data or user interactions when training our foundation models. That's a huge differentiator, a deliberate choice. They pull it off with strong on-device processing, so the AI thinking happens on your phone mostly, and for bigger tasks they use this thing called Private Cloud Compute. It's a groundbreaking infrastructure, they say. Basically, it lets them use their servers for heavy lifting, but in a way that keeps your data cryptographically private. So they're trying to make privacy a strength, not a weakness.

Speaker 1:

Okay, that sounds impressive, like the dream, right? Privacy and smart AI. But the big question is still how? How do they build powerful AI, which usually needs mountains of data, without using our data? Give us a peek under the hood, but keep it, you know, understandable.

Speaker 2:

Absolutely. So the foundation is this careful three-stage pre-training process, then post-training. Apple's research interestingly found that data quality is way more important than just raw quantity for getting good performance. So they're not just scraping everything. They use their own web crawler, Applebot, for publicly available info, but they're really careful. They filter out profanity and personal info, PII. They even deduplicate data and, crucially, they decontaminate against benchmarks.

Speaker 1:

Decontaminate? Like making sure it hasn't seen the test answers beforehand?

Speaker 2:

Exactly. Like ensuring a student hasn't just memorized old exam questions. They want the models to genuinely learn and reason, not just recognize patterns in the test data. It's a very curated, almost meticulous approach to data. Very Apple, perhaps, compared to the move-fast approach elsewhere.
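
For listeners who like to see ideas in code, here's a minimal sketch of that decontamination step: drop any training document that shares a long word n-gram with a benchmark test item. This is an illustration of the concept only, not Apple's actual pipeline, and the sample data and n-gram length are made up.

```python
# Minimal sketch of benchmark decontamination: drop any training document that
# shares a long word n-gram with a benchmark test item. Illustrative only; real
# pipelines do this at web scale with hashing and fuzzier matching.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Return the set of word-level n-grams in a piece of text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def decontaminate(training_docs: list[str], benchmark_items: list[str], n: int = 8) -> list[str]:
    """Keep only training documents with no n-gram overlap with benchmark items."""
    contaminated = set()
    for item in benchmark_items:
        contaminated |= ngrams(item, n)
    return [doc for doc in training_docs if not (ngrams(doc, n) & contaminated)]

if __name__ == "__main__":
    docs = [
        "A long blog post about cooking pasta at home with simple ingredients.",
        "Question: what is the capital of France? Answer: Paris. (copied test item)",
    ]
    benchmark = ["Question: what is the capital of France? Answer: Paris."]
    print(decontaminate(docs, benchmark, n=5))  # keeps only the cooking post
```

The core check is the same at any scale: if a document overlaps a test item, it never makes it into training.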

Speaker 1:

OK, so that's the data. Then comes the actual teaching, right? Getting the AI to understand instructions, talk like a person. That seems like the really hard part.

Speaker 2:

That's right. That's post-training. It involves things like supervised fine-tuning, SFT, and reinforcement learning from human feedback, RLHF. Heard those acronyms a lot lately. This is where the models learn to follow instructions properly, understand context, have a decent conversation. And what's really interesting here, a key innovation they talk about, is their hybrid data strategy. They use human-labeled data, yes, where experts guide the AI, but they also use a lot of synthetic data.

Speaker 1:

Synthetic data? Like AI-generated data?

Speaker 2:

Exactly. For math problems, for instance, they don't just find problems online. They take some seed problems and use their own AI models to evolve them into a huge, diverse set. Then they use other AI models as judges to check if the solutions are correct.

Speaker 1:

Wow, okay, so the AI is kind of helping teach itself in a way.

Speaker 2:

In a very structured way. Yes, it's clever. They do similar things for learning how to use software tools or write code. It helps them scale up high-quality training data without needing endless human hours for everything, while still controlling the quality.
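
To make that "evolve, then judge" loop concrete, here's a toy sketch. The generator and judge below are deliberately trivial stand-ins for what would really be large language models; the seed problem and arithmetic are invented for illustration, not drawn from Apple's pipeline.

```python
# Toy sketch of a "seed -> evolve -> judge" synthetic-data loop.
# generator() and judge() are trivial stand-ins; in a real pipeline both would be
# calls to large language models, and the judge would grade free-form reasoning.

import random

def generator(seed_problem: str, variant: int) -> dict:
    """Stand-in for an LLM that rewrites a seed problem into a new variant."""
    a, b = random.randint(2, 9), random.randint(2, 9)
    question = f"{seed_problem} Variant {variant}: what is {a} * {b}?"
    return {"question": question, "proposed_answer": a * b}

def judge(example: dict) -> bool:
    """Stand-in for an LLM judge that checks whether the proposed answer is correct."""
    expr = example["question"].split("what is ")[-1].rstrip("?")
    a, b = (int(x) for x in expr.split(" * "))
    return a * b == example["proposed_answer"]

def build_synthetic_set(seeds: list[str], variants_per_seed: int = 3) -> list[dict]:
    """Evolve each seed into several variants and keep only judge-approved ones."""
    dataset = []
    for seed in seeds:
        for v in range(variants_per_seed):
            example = generator(seed, v)
            if judge(example):          # filter out examples the judge rejects
                dataset.append(example)
    return dataset

if __name__ == "__main__":
    for ex in build_synthetic_set(["A bakery sells cookies."]):
        print(ex)
```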

Speaker 1:

So it's not just about giant models. It's about making them smart and efficient, especially for our phones, right, which don't have unlimited power or battery. That sounds tricky.

Speaker 2:

Definitely a balancing act. And that brings us to their optimization tricks, adapters and quantization. Instead of having one massive model for every single little task, which would kill your phone's storage, they use LoRA adapters.

Speaker 1:

LoRA adapters?

Speaker 2:

Think of them like small specialized plug-in modules for the main AI brain. The main model stays the same, but these little adapters let it specialize instantly for dozens, maybe hundreds of different tasks. And they're tiny. An adapter for the on-device model might only be tens of megabytes. Super efficient.
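
Here's a minimal sketch of the LoRA idea in code: one big frozen weight matrix plus a tiny trainable low-rank update. The layer size and rank below are assumptions chosen for illustration, not Apple's actual configuration, but they show why a per-task adapter can be a tiny fraction of the base model's size.

```python
# Minimal sketch of LoRA: keep the big weight matrix W frozen and learn a small
# low-rank update B @ A on top of it. Shapes and rank are illustrative only.

import numpy as np

d_out, d_in, rank = 4096, 4096, 16          # example layer size and adapter rank

W = np.random.randn(d_out, d_in)            # frozen base weight (shared by all tasks)
A = np.random.randn(rank, d_in) * 0.01      # trainable adapter factor A
B = np.zeros((d_out, rank))                 # trainable adapter factor B (starts at zero)

def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass through the frozen layer plus its low-rank adapter."""
    return W @ x + B @ (A @ x)

x = np.random.randn(d_in)
y = adapted_forward(x)

base_params = W.size
adapter_params = A.size + B.size
print(f"base layer params: {base_params:,}")     # 16,777,216
print(f"adapter params:    {adapter_params:,}")  # 131,072, under 1% of the layer
```

Stack adapters like this across every layer of a few-billion-parameter on-device model and you land in the tens-of-megabytes range mentioned above, which is why swapping tasks can feel instant.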

Speaker 1:

Okay, that's clever. And quantization, that sounds like shrinking things.

Speaker 2:

It is. It's compressing the models drastically, down to less than four bits per weight on average, which is incredibly small. This lets these powerful models actually fit and run smoothly on your iPhone or iPad's limited memory. But here's the really smart bit: they use special accuracy recovery adapters to make sure that even after shrinking them down so much, the models still perform well. They don't lose their smarts. It's some serious engineering to get that power onto the device without making your phone grind to a halt. A big win for users, potentially.
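
Two quick sketches of what that means in practice: the storage arithmetic for sub-4-bit weights, and a toy symmetric 4-bit quantizer that shows the rounding error accuracy-recovery adapters exist to claw back. The 3-billion-parameter figure is an illustrative assumption, not a number from the episode.

```python
# Back-of-the-envelope memory math for sub-4-bit weights, plus a toy 4-bit quantizer.
# The parameter count is an assumption used only for illustration.

import numpy as np

params = 3_000_000_000
for bits in (16, 4, 3.7):
    gigabytes = params * bits / 8 / 1e9
    print(f"{bits:>4} bits/weight -> ~{gigabytes:.1f} GB of weights")

def quantize_4bit(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7] plus a scale."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, scale = quantize_4bit(w)
print("original:     ", np.round(w, 3))
print("reconstructed:", np.round(dequantize(q, scale), 3))
# The reconstruction error above is the kind of loss that recovery adapters
# are trained to compensate for.
```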

Speaker 1:

All right, we've geeked out on the tech, the privacy focus, the efficiency tricks. All sounds quite impressive, but let's get down to it. The real question: how good is Apple Intelligence in the wild? Because this is where things get, well, really interesting and maybe a bit messy. There's a clear split in how people see it. Apple's own tests sound great, but the public chat, especially about Siri, tells a very different story. Lots of frustration out there.

Speaker 2:

You've nailed the central conflict. Apple's internal reports, their technical papers, yeah, they show strong results. They use standard benchmarks like MMLU for language understanding, GSM8K for math, things academics use. But importantly, they say they rely heavily on human evaluations, trying to capture that real-world user experience. You know, how helpful does it actually feel? For example, their AFM on-device model, with the right adapter, apparently beats competitors like Phi-3 mini, Llama 3 8B and Gemma 7B on summarizing emails or messages, according to their human testers. And the bigger AFM server model also shows top results in following instructions, writing quality, math and especially using tools within apps. They claim it outperforms even GPT-4 and Gemini 1.5 Pro in some areas. So on paper, based on their tests, it looks very, very capable.

Speaker 1:

Okay, very capable on paper. Benchmarks look great. Human testers internally are happy. But then you go online, you look at Reddit, you talk to people, especially about Siri, and the vibe is just completely different. You see things like "Siri is useless," flat out, or "Apple completely failed on delivering what they initially wanted." What is happening here? Why is there such a huge gap between Apple's shiny internal reports and this wave of user disappointment?

Speaker 2:

It's a fascinating disconnect, isn't it? Perception versus metrics, and maybe the sheer speed of AI progress setting expectations. That Reddit source really highlights the tension. You have users like 5pilla, the original poster there, who actually defends Apple. They argue Apple is right to wait on AI and Siri. Their thinking is it's better to be late and solid than early and messy.

Speaker 1:

The classic Apple approach, maybe. Polish over speed.

Speaker 2:

Kind of. They appreciate Apple's focus on a smooth experience, reliability and definitely the privacy angle. These users acknowledge that, yeah, AI on phones is still early. Lots of cool demos out there, but maybe not polished or useful day to day yet. So for them, Apple's patience is a virtue. It's strategic.

Speaker 1:

But, man, you can feel the impatience bubbling up from others, strongly. Some folks are throwing around "bait and switch," saying Apple overpromised and underdelivered, especially marketing new iPhones on AI features that now seem pushed back to, what, spring 2026 for some things? This isn't just about fancy benchmarks anymore. It's about whether the assistant on your phone actually works well today. What does this tell us about where AI assistants are right now? Is Apple's perfectionism becoming a problem?

Speaker 2:

Exactly. You have other users, like Cephas Sierra in that thread, who are just fed up. They say Apple completely failed on the initial vision and that Siri still sucks. Harsh words. They look at Google Assistant, they look at Gemini, and they argue those are light years ahead in just basic daily usefulness. And the thread shows this isn't new frustration. Siri's been around since 2011, but many feel it's barely changed or maybe even gotten worse at simple things.

Speaker 1:

Yeah, trying to get Siri to do something basic can sometimes feel like pulling teeth.

Speaker 2:

Right. Meanwhile, people point out Alexa is great for smart home stuff, connects to everything. Google Assistant is praised for being smooth on Android, giving useful info proactively. So the fight isn't just about whose AI model is technically best on a benchmark. It's about the whole ecosystem: how responsive it is right now, what features are actually available today. There's this real tension: Apple's careful, private approach versus the public wanting cutting-edge AI now. The market isn't exactly waiting patiently.

Speaker 1:

Okay, this leads us straight to another critical point: trust and safety. With AI getting deeper into our lives, this stuff matters, a lot. Apple says responsible AI principles inform all steps. Sounds good, but what does that actually mean in practice, especially with risks like AI making things up, hallucinations, or bias, or even just being misused? How do they stop it going wrong?

Speaker 2:

They seem to take it very seriously. Beyond the no private user data for training rule, which is a big one ethically, they talk about multiple layers of guardrails. This includes deliberately training the models on adversarial data, basically feeding it tricky inputs designed to fool it or make it misbehave so it learns to resist. They also do a ton of red teaming. That's where teams of humans and even other AIs actively try to provoke harmful or biased responses.

Speaker 1:

Like professional hackers, but for AI safety.

Speaker 2:

Kind of, yeah, trying to find the weaknesses before bad actors do. And for things like code generation, say, in Xcode, any code that AI writes is always run in a totally locked down, isolated sandbox. They use tech like Firecracker.
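
Firecracker gives real microVM isolation; as a much weaker, purely illustrative sketch of the same principle, "never run generated code in your own process," here's what a minimal version can look like in Python: a separate process, a scratch directory and a hard timeout. This is not Firecracker and not a real sandbox.

```python
# Toy illustration of isolating generated code: run it in a child process, in a
# throwaway directory, with a hard timeout. Real isolation needs a VM or container
# boundary; this only demonstrates the principle.

import subprocess, sys, tempfile
from pathlib import Path

def run_untrusted(code: str, timeout_s: float = 2.0) -> str:
    """Execute generated code in a child process and return its captured output."""
    with tempfile.TemporaryDirectory() as scratch:
        script = Path(scratch) / "generated.py"
        script.write_text(code)
        try:
            result = subprocess.run(
                [sys.executable, "-I", str(script)],   # -I: isolated mode, ignores user env
                cwd=scratch,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
            return result.stdout if result.returncode == 0 else f"error: {result.stderr}"
        except subprocess.TimeoutExpired:
            return "error: timed out"

if __name__ == "__main__":
    print(run_untrusted("print(sum(range(10)))"))   # harmless code runs fine
    print(run_untrusted("while True: pass"))        # runaway code gets killed
```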

Speaker 1:

They treat AI-generated code as inherently untrustworthy by default, which is probably wise. And we've seen why that caution is needed, right? That Air Canada chatbot case was a real wake-up call. The chatbot gave wrong info about bereavement fares, a customer relied on it and sued, and Air Canada was held liable for what its AI said. That shows these aren't just abstract risks. How does Apple plan to handle that kind of real-world liability when AI is baked into everything?

Speaker 2:

That Air Canada situation is the perfect example of the stakes. It's huge. Apple's strategy seems to be about carefully balancing being helpful with being harmless and, importantly, tailoring safety rules for each specific feature. It's not one size fits all. They do incredibly detailed human reviews focused on harmful content, sensitive topics like health or finance, and they aim for much lower rates of problematic responses compared to other models. Their internal tests showed their responses were seen as safer and more helpful side by side.

Speaker 2:

Ultimately, it's about building user confidence. Especially if the AI is helping with important matters, you have to trust it. We're also seeing this trend elsewhere, right? Specialized AIs like Hippocratic AI for healthcare using doctor input, or LAQ for helping seniors. These examples show the industry moving towards tailored, safer AI for critical uses. Apple's careful approach fits that mold. So, wrapping this up, we've got Apple's vision: deeply integrated AI, huge focus on privacy, meticulously engineered for efficiency and safety. They're clearly proud of the tech, the benchmarks, the responsible path they're taking. But we've also got that loud chorus of users feeling they're behind, that Siri just isn't there yet, that waiting for features is getting really old, especially when competitors have something usable right now.

Speaker 1:

Indeed, it boils down to a big trade-off, doesn't it, for Apple and for you as a user? Do you value that potentially perfect, super private experience enough to wait for it, even if it means feeling behind the curve today? Or do the immediate capabilities of competitors, even if maybe less polished or private, win out because they're useful now? This deep dive really shows Apple trying to walk that line between innovation and their core values, but it's definitely a tricky walk in today's AI race. It leaves us with a pretty provocative thought, doesn't it? In this mad dash for AI supremacy, where speed and having something often seem to count most, is Apple's traditional late-and-solid strategy, aiming for perfection and privacy, still the winning hand in the long run? Or is the market's hunger for the latest thing right now turning that famous Apple perfectionism into, maybe, a liability? It's a really fascinating question and one we'll definitely keep watching closely as all this unfolds.
