ChatGPT's voice operation shows that voice assistants will never … – XDA Developers
ChatGPT has introduced voice dictation, but there’s a reason why it’ll never truly take off
Amidst turmoil at OpenAI, the company announced that ChatGPT would soon be able to interact with users via their voices on Android and iOS. Not only can a user speak to ChatGPT, but they'll now receive an audible response, too. While that's cool on the surface, there's a reason voice dictation, a technology that's been mostly mature for many years now, hasn't really taken off. Sure, almost every major ecosystem has its own version, from Amazon Alexa to Siri, but the tech has so many flaws that not even ChatGPT can make it interesting.
One of my biggest annoyances with voice assistants comes from dealing with the preamble of both initiating the conversation and getting the answer. I can often look it up quicker myself, and in times when my hands are full, the best use I find for these kinds of assistants is for setting timers, not responding to messages or googling questions. OpenAI recently shared an example of a conversation you could have with ChatGPT.
While technically impressive, the demonstration is a bit ridiculous. First off, the question, about how many 16-inch pizzas to order, is absurd. I understand that it's there to demonstrate ChatGPT's ability to deal with complex conversations, but both the answer and its delivery are needlessly complex. If I'm asking an AI a mathematical question, I just want the answer. Tell me the number first, and then explain it. If I don't care about the explanation, I can just cancel the playback.
Switching that up isn't enough, though, because that's something AI can already do. Maybe the contextual nature of the number of slices of pizza and the number of people requires the AI to "research," but at some point, I'm sure features like that will come to all other AI voice assistants, too. Once they do, we're back to square one, where even the best Amazon Echo devices can do what OpenAI has been moving towards at a breakneck pace.
If I'm using my smartphone, it's easy for me to quickly type and search for something. I can do that anywhere, without being heard, and I can then read through the answers at my leisure. If I ask a voice assistant to find something for me, chances are I'll search for it myself afterward to see what other options there are. Voice assistants are too wordy, and they always will be.
What is the end goal of voice assistants? They're never going to replace smartphones (as much as companies like Humane want them to) for several key reasons, the most important being privacy. Logging into services, sending private messages, or even googling those silly, dumb questions you use incognito mode for isn't really something you can do privately with a voice-based device.
As a result, outside very niche, private-use contexts, voice assistants can never replace a smartphone or privately-used device, and I don't see that ever changing. Without a fundamental shift in how people view their own privacy and what they're willing to say out loud, it's hard to convince people that they want to use their voice to operate their devices all the time.
Imagine a world where, instead of everybody using their phones on a packed subway, they use a voice-powered device. Imagine how hectic that would get, not to mention loud. Your own devices would have trouble discerning voices, and a packed subway would theoretically be a cacophony of noise. The subway is bad enough. It doesn't need the same news report being read out in 15 different places or one person repeatedly asking about how many 16-inch pizzas they need for 778 people.
It's hard enough as it is to convince people that their devices aren't listening to them 24/7, and many are already antsy about having always-listening microphones near them. With devices that can only be voice-operated, it will be hard not to feel listened to at all times.
I'm a technology enthusiast, but I think it's for the best that devices aren't going to be exclusively voice-operated for a long time. It's nigh-on impossible for that to be the case for the reasons outlined here. While companies like Humane are pushing the envelope, they'll ultimately fail to capture any reasonable market with a device that relies on voice as the main way to operate it.
Voice assistants will forever be a helpful addition to devices that we use daily, but the technology to understand us has been good enough for a long time now.
I’m Adam Conway, an Irish technology fanatic with a BSc in Computer Science and I’m XDA’s Lead Technical Editor. My Bachelor’s thesis was conducted on the viability of benchmarking the non-functional elements of Android apps and smartphones such as performance, and I’ve been working in the tech industry in some way or another since 2017.
In my spare time, you’ll probably find me playing Counter-Strike or VALORANT, and you can reach out to me at adam@xda-developers.com, on Twitter as @AdamConwayIE, on Instagram as adamc.99, or u/AdamConwayIE on Reddit.