On Dec 17, 2013, TiE Silicon Valley (SV) hosted a panel titled “When Does Personalization Overreach into Privacy in the Era of Big Data?” The thin line between targeted content (obtained from on-line data collection/synthesis) and the intrusion on user privacy was explored by moderator Karthik Kannan, Co-founder of Cetas and Product/GTM Leader at Pivotal, along with three panelists:
- Dhruba Borthakur, Senior Architect/Engineer-Database Engineering at Facebook
- Sanjay Sawhney, Senior Director and Head of Research at Symantec Research Labs
- Mahesh Kumar, Founder and CEO of Tiger Analytics
The panel attempted to address how leading technology companies are working to set the precedent for the right amount of personalization without compromising users’ privacy and security. While some excellent points were made, they were not tied together or summarized into main themes. Lots of great questions were raised, but few answers given. Nonetheless, we’ve tried to capture the key issues and main discussion points in this article.
Here are some of the questions posed in the abstract (provided by TiE-SV):
- With NSA & large web companies (e.g. Google, Facebook, Yahoo, etc) snooping, where does one draw the line between personalization and privacy?
- Is it location awareness that’s scaring people?
- Or is it the entity gathering this information that’s scary?
- What are their guiding principles?
- What are the types of technologies helping adhere to these principles?
- Most importantly, where is the opportunity for new startups/technology?
Highlights of the Panel Discussion:
1. Moderator Kannan’s Opening Remarks
Privacy and personalization are two sides of the same coin, implying that there’s a tradeoff between them.
Use cases of personalization/tracking include:
- End user web experiences
Several questions arise from personalization/tracking:
- What does this mean for users and technology providers?
- What are our responsibilities and opportunities?
- Is our privacy now being (significantly) compromised?
“The panel will address these issues,” Mr. Kannan said.
2. There are many legal questions about data encryption. Emails stored on a server might be encrypted, but stored cookies are not. Homomorphic encryption (~2009) makes it possible to run some queries and computations on encrypted data without decrypting it. (There was no follow-up on the use or role of encryption for security or for protecting user data from theft.)
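The panel did not show how computing on encrypted data works, so here is a minimal sketch of the idea. It uses a toy version of the Paillier cryptosystem, which is only additively homomorphic and is not the fully homomorphic scheme from around 2009 that the panel alluded to. The function names and the tiny primes are assumptions chosen for illustration; real deployments need large random primes and a vetted library.

```python
import math
import random

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                  # common simplifying choice of generator
    mu = pow(lam, -1, n)       # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m: int) -> int:
    """Encrypt an integer m with 0 <= m < n using fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

pub, priv = keygen(61, 53)     # toy primes, far too small for real use
c_total = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
print(decrypt(pub, priv, c_total))
```

In this sketch, whoever holds only the public key and the two ciphertexts can produce an encrypted sum without ever seeing the values 20 and 22; only the private-key holder can read the result.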
3. There are lots of techniques “invisible burglars” use to steal user data: subjective, individual, and generational. (No elaboration of who those burglars are or the techniques they use to steal user data)
4. There’s a potential flow of users’ personal information that might be used for identity theft, targeted advertisements, or other purposes. As an example of the latter, retail stores might obtain data from WiFi-enabled smartphones while the owners are in the store shopping and then use that information to advertise to those customers. Again, many questions arise:
- How is the data obtained from users (with or without their permission)?
- Are users aware of their privacy choices and end user agreements/opt-out possibilities?
- How is customer data used, stored, and encrypted (or not)?
- Are vendors transparent?
- What is the gap between expected behavior and actual behavior of apps/services?
- Do users exercise good judgement and/or take steps to protect their privacy?
(No answers were provided)
5. Questions and issues posed by Mahesh Kumar:
- How can “data science (analytics?)” be used for personalization?
- What can be done for privacy is an evolving area for future study.
- Personalization is a reality today because it sustains the “free on-line world” that includes email, streaming video & music, social media, etc.
- Technologies and policies need to evolve so that personalization and privacy can co-exist.
- Personalization is helping the advertising industry and social media companies (e.g. Facebook).
- The key question is: when does personalization overreach so as to compromise user privacy?
4. Moderator: Today, anyone can correlate and “dig-up” information about users. Is user privacy protected in that situation, and is it a technology problem or not?
- Privacy is a mix of technology and policies.
- In most cases, groups of customers are targeted rather than individual customers (Amazon and Netflix are certainly exceptions to that statement).
- Privacy protection has never been successful from a commercial standpoint (How could it be successful, in a world of “free” Internet/web services sponsored by targeted advertisers?)
- Technology is an enabler of privacy and personalization.
- What level of privacy should be guaranteed? Examples include TSA screening, health care, on-line banking, etc.
- There are several options users have to safeguard their privacy: block cookies, private browsing on Safari, do not track, Ghostery.com, Google Dashboard, etc.
- Privacy-enhancing technologies are generally not known to individuals. Most people lack basic knowledge of how to protect their information.
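The “do not track” option listed above works by having the browser send a `DNT: 1` HTTP request header; honoring it is voluntary on the site’s part. As a rough sketch of what honoring it could look like on the server side, the hypothetical helper below (the function name and header dictionaries are assumptions, not something shown at the panel) decides whether a request may be tracked.

```python
def tracking_allowed(headers: dict) -> bool:
    """Return False when the request carries a Do Not Track signal (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): str(value).strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"

# Example requests: one browser opted out of tracking, one sent no preference.
opted_out = {"User-Agent": "ExampleBrowser", "DNT": "1"}
no_preference = {"User-Agent": "ExampleBrowser"}

print(tracking_allowed(opted_out))
print(tracking_allowed(no_preference))
```

A site following this sketch would skip setting advertising cookies whenever `tracking_allowed` returns `False`.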
5. Key use cases of personalization are driven by technology. For example:
- Turn by turn driving directions
- Health care management & on-line health records
- On-line ad targeting for potential customers
- Recommendation engines (based on previous user preferences) from Amazon and Netflix
- Personalized medicine including customized medications based on an individual’s health profile
6. What are some of the Big Data Snooping technologies?
- Infrastructure to store petabytes of data
- Open source software that does statistical analysis/data mining; many are cloud based
- Software built for computational advertising is now moving to other domains (those “domains” were not identified)
- Use of natural language processing to synthesize and analyze data
7. “Industry should use (privacy/security) tools for what they are; not as a replacement for domain knowledge.” (No further explanation or elaboration given).
8. What have Europe and the UK done differently to protect privacy and user records?
- Cameras everywhere in London
- Much more stringent privacy controls and regulations in Europe
- Concerns for “who owns your data” are being addressed
9. From the session abstract: “Most importantly, where is the opportunity for new startups/technology?” (No discussion or mention by moderator or panelists)
10. Q & A Session:
This author asked the panel to comment on threats beyond advertisers collecting user data to personalize ads/commercials: specifically, identity theft, NSA and other government over-reach/snooping, banking and credit card theft, and theft of individuals’ medical records or health insurance data. The moderator acknowledged those were all legitimate concerns, but said the focus of this panel was on personalization of ads versus individual privacy, not on NSA/government snooping or identity theft.
The author also noted a California State Assembly hearing on Privacy and Technology in the Internet Age (Dec 12th at SCU) which addressed many of the same issues the panel discussed. In many cases, web companies have no clue who they are tracking or how they use the information obtained. In other instances, they sell the information to advertisers (see Addendum below).
Addendum/Postscript: Tech firms rip NSA but use same data-mining tactics
Google, Facebook, Yahoo and other Internet companies that have expressed outrage over the National Security Agency (NSA) intercepting their users’ data pioneered mining information about customers, sometimes without their knowledge. Some privacy experts argue that these same companies have not been transparent with their customers about how user data are tracked and used.
Google’s Android smartphones/tablets and search engine, Apple’s iPhones/iPads, Yahoo’s Internet search engine and Facebook’s social media website all make money by capturing and analyzing information about users’ Internet habits, locations and social media postings. That’s some of the very information the NSA gathers to snoop on individuals (most of whom have no connection to terrorists). The companies profit by selling the information to advertisers, who then send people information about products they’re most likely to buy. This practice is otherwise known as “targeted advertising” or “personalization.”
Each piece of data a person shares online has measurable value to the companies that collect it and marketing firms that use it, helping to build an industry that generated $156 billion in revenue in 2012, according to the Direct Marketing Association. That’s more than twice the size of the budget for U.S. intelligence agencies.
The NSA has tapped fiber-optic cables abroad to siphon data from Google and Yahoo, circumvented or cracked encryption, and covertly introduced weaknesses and back doors into coding, according to reports in the Washington Post, the New York Times and the Guardian (UK) newspapers.
The tech companies’ response to those reports revolves around a business calculation: Information they receive for free from users has a tangible value, because targeted advertising sells more products. They risk receiving less information if users move elsewhere. A presidential review panel agreed this month with the companies’ assertions that the secrecy may cost them business if users think their communications aren’t secure.
“Data is one of the most important assets that these companies have, and companies protect their assets zealously,” said Jim Brock, a former Yahoo executive and co-creator of PrivacyFix, a program that monitors Internet tracking. “It’s very healthy for people to understand that their data has value and this value exchange is a two-way street,” Brock added.