The difference between the real world and the AI world is a flight of a dozen stairs.
In my basement office, I listened to what Google presented at Google I/O. I heard executives talk about search, Android, and Project Astra: a future multimodal “universal agent” that can understand audio and video, built around versions of Gemini both large and small. One million tokens is a big deal, right?
I guess. But to whom?
What I heard at Google I/O was a future that…is still in the future, for a price that most people can’t afford, and with features that were developed by Google employees for Google employees. It’s a Google that, more and more, seems to have lost the plot.
Everything’s not here yet
For the most part, what Google presented at Google I/O was science fiction. Instead of showing us what the future is, now, Google showed us what it might bring, tomorrow. It’s a future hidden behind jargon, inside developer previews and experiments within Google Labs, and even then, those experiments aren’t scheduled to actually begin until sometime in the future. Remember, this is a company with a well-established penchant for development ADD, where products go to die when their developers lose interest. Do I think that Google doesn’t believe in AI? Of course not. But I’m not convinced that anything Google showed Tuesday will make its way into the real world, either.
(Remember Google’s ultra-realistic videoconferencing tool, Project Starline? It debuted at Google I/O 2021. Google and HP said this week that it will be commercialized next year.)
Downstairs in my office, everything is amazing. Take Project Astra, Google’s new vision for AI assistants. Someone wanders around the office using AI to ask questions about what the AI is seeing. What band name would the AI give for a golden retriever and his stuffed chew toy? Explain this code to me, and how this database diagram could be improved.
That’s cool, no doubt. But to whom? Google employees, that’s who. Does Google expect that I’m going to point my smartphone camera at a head of broccoli and ask what to do with it? I hope not.
I might see myself wandering about a woodshop, asking Google what a miter saw is good for, for example. But I certainly wouldn’t trust YouTube to teach me how to use it responsibly. My colleague Michael Crider found some middle ground: using video as a search input, then asking Google for context. That’s a bit smarter.
I understand what Google is going for with Astra — improved visual search — and it is going to debut in the Gemini app for Android this fall, via a feature called Live. I’m just not sure how many people will want to use it. Or when it will actually arrive.
Why do the useful things cost so much?
But it doesn’t feel real. It doesn’t feel accessible. Is it useful? I’m not sure. That’s why the most meaningful announcement from Google I/O 2024 feels so refreshing: the integration of Gmail and Google’s Gemini AI, to allow you to question (for example) exactly what went on in an email thread. This is what Google was built upon: making search (and later email) accessible, easy, and simple to use. It makes sense!
Even then, though, there was very little that signaled to me that Google understood that people use its products: people who can’t sit down to a $200 sushi lunch at the drop of a hat.
The Gmail integration, as cool as it sounds, is locked behind a $20/month AI Premium subscription for Google Workspace. A chunk of the Gemini app presentation was devoted to a trip planner. Google Search showed how restaurants could be organized by patio seating and live music. For that matter, Search is now largely organized around “summaries” that tend to ignore the source of the actual information, i.e., writers like me. This all benefits techies with six-figure salaries and stock options.
Most of AI feels…similar. There are LLM chatbots, like Copilot, Gemini, and ChatGPT. There’s AI art, like Veo, Google’s most advanced video generation model, and Imagen 3, which the company calls its “highest quality text-to-image model yet.” (Those are coming in the future, too, after Google completes its collaborations with people like Donald Glover.)
But you know what the best thing I saw this week was? Something that felt fresh? Even welcoming? The synthesized AI voices that OpenAI showed off in ChatGPT. I know they’re fake. I know they’re designed to play upon your emotions, to make you think that you’re talking to a person. But it works! It feels human. And it’s a return to the early days of AI, when chatbots from Microsoft and others at least seemed real.
At lunch, I walked upstairs to say hello to my wife, who was working. I asked if she would mind paying $20 per month just to make sense of her email. She just snorted. Last night, I showed my 11-year-old son the OpenAI demos, and he wanted to watch every one.
I remember when Google made my life easier. Now I think it’s forgotten how.