Digital voice assistants vs. privacy in 2026

I think it’s fair to say that digital voice assistants got off to a shaky start. Those of us who are inclined to be suspicious of the tech giants worried that voice-activated products listened more than they ought. Vendors were never particularly transparent about what they did with the voice data they collected.
I can’t deny the sophistication and utility of contemporary voice assistants: in my youth, devices like this would have been science-fiction. Impressive as they are, though, mass-market voice assistants relied, and continue to rely, on a gargantuan IT infrastructure, hosted by companies that don’t necessarily have their customers’ best interests at heart.
To my recollection, Microsoft’s Kinect product was the earliest target of a concerted, headlining objection to voice recording. Microsoft originally planned for Kinect to be an integral part of their Xbox One gaming console, released in 2013. They claimed that Kinect – essentially a collection of sensors and microphones – would revolutionise computer gaming: rather than fiddling with a joystick or keyboard, the gamer would stand in front of Kinect’s sensors and control the game using movement and voice cues.
The problem with Kinect was that nobody knew when it was listening, or what Microsoft was doing with the information it collected. Concerns escalated when researchers demonstrated that Kinect was sensitive enough to measure the user’s heart and respiration rates, although there’s no evidence that Microsoft ever collected this kind of data. In any case, Microsoft responded to their customers’ objections for once, and demoted Kinect from an essential component to a plug-in accessory. Three years later, Kinect was discontinued. Gamers just didn’t benefit enough from it to be willing to accept the privacy risks. Frankly, game developers didn’t warm to it either, further decreasing its usefulness.
Since then, we’ve become more tolerant of voice assistants: not because the privacy concerns have gone away, but because their utility has increased to the point where it outweighs our qualms. There’s been a proliferation of “smart speaker” products, like Amazon’s Echo range and audio appliances from Bose and Sonos. Even my wireless headphones can call out to a voice assistant. At least they could, had I not disabled this feature. In addition to these stand-alone, consumer appliances, a wealth of voice assistant software runs on a computer or, these days, a smartphone. There’s Amazon’s Alexa, Apple’s Siri, Google Assistant, Microsoft Copilot, and Samsung Bixby, among others.
Another reason we’re prepared to tolerate these new voice assistants, when we didn’t welcome Microsoft Kinect, is that contemporary products need specific activation: we think, or at least hope, that they aren’t listening all the time, but only when we want them to be.
Modern voice assistants are usually activated by some specific phrase, like “Hey Google” or “Hey Siri”. Their suppliers tell us that they don’t record or collect audio, other than in response to this “wake phrase”. Once the software is awake, you’d have to consult the privacy policy to see what the vendor will do with your voice recordings. At least, you’ll see what the vendor wants you to think they do with them. It’s no secret that the voice analysis gets done on the vendor’s servers, not on your device. Consequently, the handset has to send the voice data to the vendor. Clearly privacy is at issue here.
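To make that division of labour concrete, here’s a toy sketch of the flow in Python. The phrases, function names and behaviour are all invented for illustration – this is nobody’s real code or API – but it shows the essential split: wake-phrase detection happens on the device, and whatever follows a detection is uploaded.
    # A toy simulation of the flow described above: snippets of "audio" (here just
    # text) are checked locally for the wake phrase, and only what follows a
    # detection is handed to the vendor. Everything here is invented for illustration.

    WAKE_PHRASE = "hey google"

    def detect_wake_phrase(snippet: str) -> bool:
        # Stands in for the small on-device matcher; no network involved.
        return WAKE_PHRASE in snippet.lower()

    def send_to_vendor(utterance: str) -> None:
        # Stands in for the network call a real assistant makes once it is awake.
        print(f"[uploaded to vendor] {utterance!r}")

    def assistant_loop(audio_stream):
        awake = False
        for snippet in audio_stream:
            if awake:
                send_to_vendor(snippet)        # this is where your voice leaves the device
                awake = False
            elif detect_wake_phrase(snippet):  # checked locally; nothing sent yet
                awake = True

    assistant_loop([
        "nice weather today",            # never leaves the device
        "hey google",                    # wake phrase, matched locally
        "what time is it in Tokyo?",     # uploaded to the vendor's servers
    ])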
Here’s what Google’s privacy policy says they do with your voice recordings:
“Assistant uses your queries and info from your linked devices and services to understand and respond to you, including to personalise your experience”.
We know that when companies like Google talk about “personalising your experience”, what they actually mean is targeting ads. The process of “real-time bidding”, which underlies most contemporary online advertising, makes it very difficult for ad brokers – and not only Google – to constrain your data to well-defined recipients.
Google’s privacy policy goes on to say:
“Your data is also used to develop and improve Google products and services […] To help assess quality and improve Assistant, human reviewers (which include third parties) read, annotate and process the text of your Assistant queries and related info. We take steps to protect your privacy as part of this process. This includes disassociating your queries from your Google Account before reviewers see or annotate them.”
So Google is supposed to anonymize the voice data it sends to third parties, and maybe it does. That doesn’t stop Google itself using and storing your data.
I think we all accept that, if we’re using a voice assistant, we’re sending some personal data to the vendor’s servers. This is no different, in principle, from sending an email using Google Mail, or using Google Docs to write a letter. We know that Google sees the text, and we either trust Google with this data or we don’t. But at least we know what we’re sharing with the service provider.
The real problem with voice assistants isn’t what happens when we deliberately activate them: it’s what happens when we don’t. As with Kinect, privacy-minded individuals worry that they might be listening when we don’t expect them to.
If you know you’re talking to Google, or Apple, or whomever, you’re likely to limit your speech to a particular question or command. You’re unlikely to ask “Hey, Google, is Amanda still f#@king her tennis coach?” – that’s the kind of thing you’d only say to another person in the same room.
For many years, though, users of voice assistants have reported seeing advertising that can only be based on information scraped from their personal conversations. I’ve often been told stories like: “Yesterday at dinner, my husband and I were discussing our holiday in Greece. Today my web browser is full of advertisements for swimwear”.
The problem with tales like this is that they’re very hard to evaluate. Many people see a lot of online advertising. If your browsing history – and Google knows your browsing history – shows you’ve been shopping for holidays, the chances of seeing an ad for swimwear are pretty high, whether you discussed swimwear out loud or not. There’s probably enough information in the profiles that advertisers keep of you to direct advertising that way, without their needing to go to the hassle of processing your speech.
And yet these stories persist. In my experience, they tend to be third-hand accounts, like: “Bill heard from Alice that Marek was talking to Sergei about wine (or football, or cars), and…” You can guess how the story goes. Because I go to some lengths to block advertising, and to avoid being tracked, I have no personal experience to support, or refute, stories like this. On the whole, though, I’m somewhat sceptical. Verifiable first-hand accounts seem pretty scarce.
On the other hand, vendors of digital voice assistants do admit that sometimes they wake up accidentally. Google’s support site says:
“After Google Assistant detects an activation, it exits standby mode and sends your request to Google servers. This can also happen if there is a noise that sounds like ‘Hey, Google’ or an unintended manual activation.”
I guess by “unintended manual activation” they mean you butt-dialled the voice assistant; some smartphones have a specific button to wake it. The wake phrase itself is processed locally, not on a remote server, so the matching probably isn’t as thorough as it would be if it were done on a server.
Perhaps “I gargle” sounds enough like “Hey, Google” to wake the assistant up. But even if the voice match were very picky, it’s unlikely to interpret context. So if I were to say “Horses eat hay; Google said so”, that’s very likely going to wake Google Assistant up.
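Here’s a toy illustration of why context doesn’t save you. Wake-phrase spotting amounts to scoring a short window of audio against the wake phrase and triggering above some threshold. Everything in the sketch below is invented for illustration – a real detector uses a trained acoustic model rather than text matching, and the threshold value is made up – but the crucial property is the same: the window is scored in isolation, with no grasp of the sentence around it.
    # A toy stand-in for wake-phrase spotting: slide a wake-phrase-sized window
    # across the input, score each window against the wake phrase, and trigger
    # when the best score crosses a threshold. SequenceMatcher is a crude proxy
    # for acoustic similarity; the threshold value is invented.
    from difflib import SequenceMatcher

    WAKE_PHRASE = "hey google"
    THRESHOLD = 0.75  # illustrative only

    def similarity(window: str) -> float:
        return SequenceMatcher(None, window.lower(), WAKE_PHRASE).ratio()

    def wakes(sentence: str) -> bool:
        width = len(WAKE_PHRASE)
        windows = [sentence[i:i + width] for i in range(max(1, len(sentence) - width + 1))]
        # Note what is missing: no look at the words around the best-scoring window.
        return max(similarity(w) for w in windows) >= THRESHOLD

    print(wakes("Hey Google, what's the weather?"))  # True: a genuine activation
    print(wakes("Horses eat hay; Google said so"))   # True: an accidental one
    print(wakes("Nice weather today"))               # False: nothing resembles the phrase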
In short, vendors of voice assistants don’t deny that inadvertent waking happens, or that they process the voice data that’s captured when it does. In fact, Google says that one of the reasons it collects this data is to improve the accuracy of the wake-phrase detection. This seems a laudable goal, if Google – and its partners – work on robustly anonymized data. But do they?
While stories of voice data being abused have been circulating for over a decade, I think the first high-profile scare, involving substantiated reports, concerned Apple in 2019. Apple’s behaviour came to public attention when somebody called Thomas Le Bonniec revealed he was in possession of large amounts of voice recordings from Siri, containing a good deal of personal information. The recordings apparently included conversations between doctors and patients, and discussions of sexual encounters.
Le Bonniec was not an Apple employee: he worked for an independent company that Apple had contracted to review voice recordings, for quality control purposes. What the quality control program should have been doing, according to Apple, was examining voice recordings that Siri had failed to interpret, with a view to improving the voice recognition. However, it seems very unlikely that the kinds of conversation that worried Le Bonniec would have been aimed deliberately at Siri. This raises two crucial questions.
First, were these recordings just collected unintentionally, after an accidental waking of the voice assistant, or was Siri listening more than Apple admitted? Perhaps even all the time?
Second, if Apple was sending voice data to contractors for quality control, could it have been doing the same for other purposes? Or, instead, could the contractors have been misusing the data, for their own ends?
We don’t know the answers to either of these questions because, at the beginning of January 2025, Apple settled a class-action lawsuit out of court. The plaintiffs had alleged that Apple had been selling their voice recordings, or transcripts of them, to advertisers.
Those of us who tend towards cynicism might see this out-of-court settlement as an admission of guilt. Apple said it settled to “avoid further litigation.” It also said in its statement:
“Siri data has never been used to build marketing profiles and it has never been sold to anyone for any purpose.”
This wasn’t the end of the matter, though. In October of 2025, the French human rights organization, the Ligue des Droits de l’Homme (LDH), encouraged the French government to open an investigation into Apple’s behaviour. The LDH claim wasn’t based on any specific consumer complaints; rather, the organization argued that Apple’s behaviour was broadly in violation of EU data protection legislation. Apple, they claimed, hadn’t properly disclosed what happened to the voice data it collected, nor sought explicit consent from Siri users.
It’s important to understand that EU legislation doesn’t specifically forbid Apple, or anybody else, from selling voice data to advertisers, if the customer explicitly gives consent; and it may be impossible to use the product without consent. But to store and process personal data without consent is unlawful.
Apple now claims it improved Siri’s security, and it no longer shares voice data with advertisers – which does give the impression that it did so at one time. However, we probably aren’t going to get more information than this. Even though Apple remains adamant that it isn’t selling voice transcripts to advertisers, it’s hard to be sure that the same applies to any subcontractors Apple engages. If those subcontractors misbehaved, that wouldn’t necessarily absolve Apple from liability under EU law. Right now, Apple hasn’t responded in detail to the French enquiry, and most likely never will.
Apple isn’t the only business to be in legal hot water over voice recordings, of course. As you might expect whenever there are concerns about personal privacy, Google is in the frame, too. At the end of January 2026, Google, like Apple, offered to settle a class-action lawsuit out of court in California. This action also alleged that Google had been processing speech data and selling the results to advertisers. So far as I know, the complainants never alleged that Google Assistant was listening all the time – only that it was waking up in error, and that Google processed and monetized speech data opportunistically.
Like Apple, Google has a privacy policy that states that it doesn’t sell voice data outside the organization. Google’s reach, though, is so all-encompassing that this term doesn’t restrict its behaviour all that much. We don’t know whether Google treats voice data from unexpected wake-ups differently from speech directed specifically at its voice assistant. Probably it does, for the same reason that Apple does: to research ways to reduce the likelihood of unexpected wake-ups.
Once Google Assistant has woken up, deliberately or by accident, everything you say is fair game. We know this is the case, because Google documents a way to erase from your records whatever you last said, if you noticed Google Assistant listening unexpectedly. Since there’s a way to do this, Google Assistant must be storing your utterances if it thinks you were speaking to it, even if it turns out that you weren’t.
This is all somewhat speculative and, since Apple and Google have both settled out of court, we’re denied the opportunity to hear them explain their actions. While this gives an impression that they have something to hide, large companies have learned from the “McLibel” case that it’s better not to go to court against private individuals if possible, even if you’re sure of your ground.
The McLibel case began in 1990, and lasted an astonishing ten years. Environmental campaigners Helen Steel and Dave Morris had been distributing leaflets warning people about the alleged dangers of McDonald’s fast food and its shady business practices. So McDonald’s started legal action against the campaigners for defamation. In the end, McDonald’s convinced the court on some counts, but not on all. The court awarded derisory damages – derisory to McDonald’s, at least – which the company has so far not recovered.
Over the ten years of the McLibel hearing, many of McDonald’s dubious business practices were exposed in open court, and are now part of the public record. The legal action was, in short, a public relations disaster for the company, even though they (partially) won their case.
We’ve learned from McLibel that, if you’re a huge corporation, you’ve little to gain by going up in court against a bunch of malcontents who have, frankly, little to lose. At best, it’s bad PR. At worst, you lose the case.
So the fact that neither Google nor Apple wanted to fight a privacy case in open court doesn’t surprise me. The settlements they’ve offered to the class-action claimants are minuscule in proportion to their earnings – certainly less than they’d have to pay their lawyers if they went to a full hearing. Their actions make perfect sense, from a business perspective.
At the same time, settling without a hearing leaves a bad smell, and the lingering impression of a cover-up.
So where does this leave us?
In the EU, all vendors of voice assistant services now have full privacy policies, as legislation requires. It seems highly unlikely to me that these corporations are recording our voices secretly, all the time. It’s certain that voice assistants activate unexpectedly – their vendors make no secret of this. When this happens, your voice is processed and the transcribed speech becomes part of the profile that the vendor already holds on you. That’s unless you’re fortunate enough to notice the accidental waking, and take steps to have the information deleted.
Once your voice has been processed – whether you spoke intentionally to the voice assistant or not – I think it’s plausible that what you say could end up being used for targeted advertising.
It therefore seems possible to me that you could have a conversation about shoes, and end up seeing advertisements for shoes – not because your voice assistant is listening all the time, but because its wake-up detection is imperfect.
While I admit this is all possible, it doesn’t seem particularly likely. Not because Google, et al., have scruples, but because they have better ways to target advertising. The whole world-wide web has become a vast advertising platform; there’s little need for companies to use convoluted and unreliable ways to get information from your speech, when they already have robust, straightforward ways like browser fingerprinting.
Nevertheless, I don’t use voice assistants. The reality is that they don’t do enough for me to justify even a small risk to my privacy. If they were more useful, perhaps I’d be willing to take that risk. But perhaps not: if I really wanted a voice assistant, I’d investigate the many open-source, self-hosted voice assistant frameworks that are starting to appear. None is as immediately usable as Google Assistant, but being able to do all the speech processing locally is a huge step up in privacy, and worth some extra set-up. For now, I just don’t need this kind of technology enough to go to the trouble.
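For the curious, here’s a minimal sketch of what “doing the speech processing locally” can look like, using the open-source Vosk speech-recognition library for the speech-to-text piece; a full assistant framework adds wake-word detection and intent handling on top, but the point is that nothing needs to leave your machine. The model directory and WAV file name below are placeholders: you’d download a model from the Vosk project and record a mono, 16-bit PCM file yourself.
    # A minimal sketch of fully local speech-to-text using the open-source Vosk
    # library (pip install vosk). "vosk-model-small-en-us" and "utterance.wav"
    # are placeholder names: download a model from the Vosk project and supply
    # your own mono, 16-bit PCM WAV recording.
    import json
    import wave

    from vosk import KaldiRecognizer, Model

    model = Model("vosk-model-small-en-us")      # loaded from local disk
    audio = wave.open("utterance.wav", "rb")
    recogniser = KaldiRecognizer(model, audio.getframerate())

    while True:
        chunk = audio.readframes(4000)
        if not chunk:
            break
        recogniser.AcceptWaveform(chunk)         # nothing is sent over the network

    print(json.loads(recogniser.FinalResult())["text"])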
However, if you’re using an Apple smartphone, or any device that runs stock, vendor-supplied Android, voice assistant software is built in, and it’s probably enabled by default. On Android devices, you can turn off Google Assistant completely, but it’s a convoluted process, and some people have reported that it turns itself back on. I believe that other platforms provide ways to turn off the voice assistant but, again, with some difficulty and an irritating lack of permanence.
I’m running a custom firmware on my Android handsets that doesn’t have any voice assistant services built in. I can therefore be reasonably sure that my voice will never be recorded and sent somewhere without my knowing. Installing custom firmware is too big a step for most people and, on some cellphones, simply impossible. My view, for what it’s worth, is that voice assistants don’t represent a significant additional threat to privacy when considered alongside all the known privacy hazards of the Internet. I wouldn’t be hugely worried if I couldn’t completely disable the voice assistant.
But I’d disable it if I could.

