Apple's Siri Voice Clip Storage Policy Raises Privacy Concerns

By Wayne Rash  |  Posted 2013-04-21

Apple’s revelation to Wired that it stores Siri users’ voice clips for two years at least clarifies some details about the automated assistant that Apple hadn’t previously revealed.

While Apple has made it clear that Siri actually works by digitizing your voice when you ask Siri a question and then sending that voice clip to Apple’s servers for analysis, it’s never really said what it does with the clips after you have completed your inquiry.

Siri works like this. When you press the home button on your Siri-enabled device, all the Siri app does is digitize your question and send it to Apple. There, Apple’s servers use voice recognition software to parse what you said and then check whether the answer is available in their databases. If you ask Siri, “What’s the best restaurant within five miles,” Siri will send your location along with the question.

Once the question gets to the Apple servers, they look for all of the restaurants within five miles of your location, along with their ratings, using an online database, usually Yelp. Once they have that information, they send you a list of the restaurants within that distance, sorted by star rating.

While Siri knows about restaurants, bars and sports data, there’s a lot it doesn’t know about. So when I ask for the current score in the Washington Nationals game, Siri knows the answer. When I ask Siri how long Apple keeps my voice clips, it asks if I want it to search the Web. If I say I do, then Siri goes off to wherever questions go and never comes back with an answer.

What happens then is that Apple keeps the voice clip and analyzes it to see how to make Siri more likely to find the answer. In testing I’ve found that Siri rarely knows any answer outside of the specific areas pre-programmed by Apple. This means that except for knowing the score for the Nationals on a given night, most of what I’ve accomplished is to give Apple fodder for improving the search process. I guess that’s nice, but not particularly helpful.

For example, I asked Siri how long Apple stores my voice clips. Siri didn’t know, and asked if I wanted to search the Web. I told Siri to search, but the assistant never revealed the answer.

Then I tried one of the example scenarios that have appeared in the technology press—questions that you shouldn’t ask Siri because someone might find out. “Siri,” I asked, “Where can I go to get drunk?” This, it seems, is one question that has a pre-programmed response. “I hope you’re not driving, Wayne,” it said. Then Siri presented me with a button labeled “Call a taxi.”

I have no idea whether that query was saved to Apple’s servers, or if it was, whether anyone will be able to tell if Siri called a taxi. But if it was saved and if someone were to find that voice clip, all they would really find out is that I asked the question.

But suppose someone were to ask a question of a more personal or private nature, such as “Where can I find a good divorce lawyer?” Is that the sort of thing you want stored on Apple’s servers, even if the chance of anyone finding out it’s you is remote?

Other scenarios, such as asking questions that could reveal business secrets, I find unlikely. For one thing, Siri isn’t equipped to answer complex questions. For another, Siri is best used as a recreational service, since its grasp of useful information is frequently very tenuous. If you’re looking for an answer to an important question, Siri is not your gal.

In fact, after the Nats lost in the playoffs in 2012, I stopped bothering to use Siri for anything. Worse, considering the way the season has started, I don’t ask Siri for scores because I’m afraid I’ll find out the answer.

But still, the privacy advocates who worry about those voice clips have a point. Those voice clips are still on Apple’s servers, and while Apple promises to delete them after two years, how will we know when that happens? Meanwhile, those clips are identified by a random number that’s tied to a specific device. Although Apple says that the number isn’t connected to your Apple ID, is it still possible to identify the device and, from that, the owner of the device? Apple isn’t saying.

And while voiceprint technology still isn’t really ready for primetime, it’s getting better, and it’s possible to identify someone’s voice from a recording if you have a reference sample. This means it may be possible for a law enforcement agency to compare Siri voice clips against a clip from an individual. If you say the wrong thing, might the police come after you someday based on what you said in a Siri query? Again, we don’t really know.

Worse, Apple isn’t sharing these details with the iOS user community, even if you look up the privacy policy on your device (it’s not revealed on the Website). While storing your queries may make sense, Apple needs to be upfront about its use of the voice files and give people a way to opt out, even if it means their Siri queries will be even more useless than usual.
