Google is reviewing its privacy policies after it emerged that contractors were given sensitive audio recordings, including private conversations and personal information that could identify users, according to a report by TechCrunch.
The company said in a blog post that it works with language experts around the globe who review and transcribe a “small set of queries” to help it understand different languages.
“As part of our work to develop speech technology for more languages, we partner with language experts around the world who understand the nuances and accents of a specific language,” the company said. “These language experts review and transcribe a small set of queries to help us better understand those languages. This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant.”
Google said that only about 0.2 percent of audio snippets were reviewed, and that the snippets aren’t associated with Google accounts. Background noises or other conversations, it said, aren’t meant to be transcribed.
The person who leaked the information to a Belgian broadcaster said that they listened to upwards of 1,000 recordings and found that 153 were recorded accidentally and not meant to be heard.
Google has chosen to go after the person who leaked the information.
“We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action,” it said. “We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.”
The report comes as the Department of Justice has the company under a microscope and is preparing a potential antitrust probe.
Google said that people do have a way to opt out of having their data stored.
“You can turn off storing audio data to your Google account completely, or choose to auto-delete data after every 3 months or 18 months. We’re always working to improve how we explain our settings and privacy practices to people, and will be reviewing opportunities to further clarify how data is used to improve speech technology,” the company said.