Big Data

Framing Big Data Debates

If finding the proper balance between privacy risks and Big Data rewards is the big public policy challenge of the day, we can start by having a serious discussion about what that policy debate should look like. In advance of my organization’s workshop on “Big Data and Privacy,” we received a number of paper submissions that attempted to frame the debate between Big Data and privacy. Is Big Data “new”?  What threats exist?  And what conceptual tools exist to address any concerns?

As part of my attempt to digest the material, I wanted to look at how several scholars attempted to think about this debate.

This question is especially timely in light of FTC Chairwoman Edith Ramirez’s recent remarks on the privacy challenge of Big Data at the Aspen Forum this week. Chairwoman Ramirez argued that “the fact that ‘big data’ may be transformative does not mean that the challenges it poses are, as some claim, novel or beyond the ability of our legal institutions to respond.” Indeed, a number of privacy scholars have suggested that Big Data does not so much present new challenges as make old concerns ever more pressing.


From Cyberspace to Big Data Podcast

In the run-up to the Future of Privacy Forum’s “Big Data and Privacy” workshop with the Stanford Center for Internet & Society, I’ve taken to podcasting again, speaking with scholars who couldn’t attend the conference.  First up was Professor Bill McGeveran, who prepared an essay looking over lessons from the 2000 Stanford symposium on “Cyberspace and Privacy: A New Legal Paradigm?”

Of course, now the buzzword has moved from cyberspace to big data.  McGeveran suggests big data is really seeing a replay of the same debates cyberspace saw a decade ago.  Among the parallels he highlights are (1) the propertization of data, (2) technological solutions like P3P, (3) First Amendment questions, and (4) the challenges posed by the privacy myopia.

The Toobin Principle as a Corollary to the Snowden Effect

Jay Rosen has a fantastic piece today on PressThink on what he calls the “Toobin principle.”  In effect, Jeffrey Toobin and a number of media figures have criticized Edward Snowden as a criminal or, worse, a traitor even as they admit that his revelations have led to a worthwhile and, more importantly, a newsworthy debate. For his part, Rosen asks whether there can “even be an informed public and consent-of-the-governed for decisions about electronic surveillance.”

I would add only the following observations. First, an informed public may well be the only real mechanism for preserving individual privacy over the long term. As we’ve seen, the NSA has gone to great lengths to explain that it was acting under appropriate legal authority, and the President himself stressed that all three branches of government approved of these programs. But that hasn’t stopped abuses, as identified in currently classified FISC opinions, nor, and I think this is key, has it stopped government entities from expanding these programs.

This also raises the bigger, looming question of what all of this “Big Data” means. One of the big challenges surrounding Big Data today is that companies aren’t doing a very good job communicating with consumers about what they’re doing with all this data.  Innovation becomes a buzzword to disguise a better way to market things to us. Like “innovation,” national security has long been used as a way to legitimize many projects. However, with headlines like “The NSA is giving your phone records to the DEA. And the DEA is covering it up,” I believe it is safe to say that the government now faces the same communications dilemma as private industry.

In a recent speech at Fordham Law School, FTC Commissioner Julie Brill cautioned that Big Data will require industry to “engage in an honest discussion about its collection and use practices in order to instill consumer trust in the online and mobile marketplace.”  That’s good advice — and the government ought to take it.

Buying and Selling Privacy Paper

Judge Alex Kozinski has offered to pay $2,400 a year to protect his privacy. Meanwhile, Federico Zannier started a Kickstarter to “data mine” himself and ended up making $2,700. One’s rich and can pay to protect his privacy; the other’s not and is selling every bit of his info. I’ve posted my paper on this subject to SSRN.

Parsing the Purpose Limitation Principle

Last month, the European Union’s Article 29 Working Party (WP29) released an opinion analyzing the data protection principle of purpose limitation. That principle, which aims to protect data subjects by setting limits on how data controllers use their data, conflicts with potential Big Data applications. In the wake of efforts by a number of parties to tweak the “legitimate interests” ground for processing data, this opinion demonstrates how Big Data fundamentally challenges European privacy law.  The opinion itself seems geared toward addressing Big Data; the WP29 specifically notes that current business trends are of particular relevance to its opinion, which it put forward as a way to balance the risks and rewards of data processing in our increasingly data-intensive society.

Under Article 6(1)(b) of Directive 95/46/EC, the purpose limitation principle consists of two fundamental building blocks:

  1. that personal data must be collected for “specified, explicit and legitimate” purposes (purpose specification);

  2. that personal data not be “further processed in a way incompatible” with those purposes (compatible use).

The challenge posed by Big Data is that much of the new value of information comes not from any original, identified purpose but rather from secondary or derivative uses. As a result, both building blocks of the purpose limitation principle are in tension with how Big Data works, presenting a challenge for pursuing innovative data uses in Europe.

First, the WP29’s understanding of purpose specification requires that, before data collection, purposes “must be precisely and fully identified.”  Many of the secondary ways in which data can provide value, whether to security, safety, or health, may not be easily identifiable in advance. Nor can this problem be cured by providing a broader purpose specification: the Working Party is critical of “vague or general” purposes such as “improving users’ experience,” “marketing purposes,” “IT-security purposes,” and “future research,” deeming them generally inadequate to meet this test.
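To make this concrete, here is a minimal sketch, in Python, of how a controller might lint its own purpose statements against the examples the opinion rejects. Only the quoted phrases come from the opinion; the function name and exact-match logic are my own invention and a deliberate oversimplification of what is really a qualitative judgment.

    # Hypothetical controller-side check; only the quoted examples are
    # drawn from the WP29 opinion.
    VAGUE_PURPOSES = {
        "improving users' experience",
        "marketing purposes",
        "it-security purposes",
        "future research",
    }

    def purpose_is_specific_enough(purpose: str) -> bool:
        # Flags purpose statements of the kind the opinion calls too
        # "vague or general" to satisfy purpose specification. A real
        # assessment is qualitative, not a string comparison.
        return purpose.strip().lower() not in VAGUE_PURPOSES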

Limited in this way, the benefits of Big Data are effectively cabined by whether they satisfy the compatible use test.  The onus falls on data processors in Europe to determine whether a secondary use is compatible with how the data was originally collected. The WP29 opinion recognizes that actually applying the compatibility test is problematic within the context of Big Data, and suggests developing a “rigorous but balanced and flexible application of the compatibility test” to new data applications.

The compatibility test does provide some flexibility to data processors.  For one, because the test itself prohibits incompatibility rather than requiring compatibility, the lack of any affirmative requirement that a data processor show further processing is compatible appears to provide some wiggle room.  Compatibility still must be assessed on a case-by-case basis; the following criteria are put forward as particularly relevant to any compatibility assessment (see the sketch after this list):

    • the relationship between the purposes for which data has been collected and the purposes of further processing;
    • the context in which data has been collected and the reasonable expectations of the data subjects as to its further use;
    • the nature of the personal data and the impact of the further processing on the data subjects;
    • the administrative and technical safeguards adopted to ensure fair processing and prevent any undue impact on individuals.
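Here is a minimal sketch of how those four criteria might combine in an internal assessment, assuming, purely for illustration, that each criterion reduces to a yes-or-no judgment. The class and decision rule are mine; the opinion calls for a holistic, case-by-case weighing, not a mechanical test.

    from dataclasses import dataclass

    @dataclass
    class CompatibilityAssessment:
        # One flag per WP29 criterion; treating each as a boolean is a
        # deliberate simplification of what is really a judgment call.
        purposes_closely_related: bool  # relationship between the purposes
        within_expectations: bool       # context and reasonable expectations
        impact_acceptable: bool         # nature of the data, impact of further processing
        safeguards_in_place: bool       # administrative and technical safeguards

        def further_use_not_incompatible(self) -> bool:
            # The test prohibits incompatibility rather than requiring a
            # positive showing of compatibility, but here every criterion
            # must still weigh in favor before a secondary use proceeds.
            return all((self.purposes_closely_related,
                        self.within_expectations,
                        self.impact_acceptable,
                        self.safeguards_in_place))

On this toy model, a use that defeats the data subjects’ reasonable expectations fails the assessment regardless of safeguards; whether the WP29 would weigh the criteria that strictly is exactly the kind of case-by-case question the opinion leaves open.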

These are important criteria to consider, but the WP29 specifically discusses the implementation of safeguards as being important to Big Data processing.  It distinguishes between two different “Big Data” scenarios: one where organizations seek to uncover “trends and correlations in the information” and another where they “specifically want to analyze or predict the personal preferences, behavior and attitudes of individual customers” in order to “inform ‘measures or decisions’ that are taken with regard to those customers.”

As described, the second scenario has the greater privacy implications for individuals. The WP29 explains that “free, specific, informed and unambiguous ‘opt-in’ consent” would be required, which may be easier said than done.  The division of the Big Data world into projects that seek out mere correlations in disparate information and those that directly target individuals is simple and easy to grasp, but it does not necessarily reflect how data is actually processed today. In a paper released in February, the Centre for Information Policy Leadership (CIPL) discussed some of the concerns surrounding Big Data, and one of the paper’s key takeaways is that Big Data is largely an iterative process. If many of the benefits we’re deriving from data come from secondary uses of that information, these insights appear across applications that cannot be as easily divided as the WP29 suggests.
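A rough sketch of that two-scenario division, with the consent gate the WP29 describes; the enum names and gating function are my own hypothetical framing, not anything the opinion prescribes.

    from enum import Enum, auto

    class BigDataUse(Enum):
        TRENDS_AND_CORRELATIONS = auto()  # aggregate insight only
        INDIVIDUAL_PREDICTION = auto()    # informs "measures or decisions" about customers

    def may_process(use: BigDataUse, has_opt_in_consent: bool,
                    passes_compatibility_test: bool) -> bool:
        # Per the opinion, individual-level analytics require "free,
        # specific, informed and unambiguous 'opt-in' consent"; aggregate
        # trend analysis is gated by the compatibility test instead.
        if use is BigDataUse.INDIVIDUAL_PREDICTION:
            return has_opt_in_consent
        return passes_compatibility_test

The trouble, as the CIPL paper suggests, is that real projects iterate across both branches, so the clean fork above rarely exists in practice.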

More problematically, insisting on consent for any potential Big Data application that could impact individuals may not be tenable.  As CIPL noted, data analytics relies on “increasingly large data sets obtained from such diverse sources that obtaining consent may not be practicable.”  The WP29 seems to recognize both the limitations on consent and that insisting on consent could eliminate some legitimate benefits.  In a footnote, it admits that some exceptions may exist “in some cases, based on an informed debate of the societal benefits of some uses of big data.” (While the WP29 remains wedded to our current notice-and-consent framework, some of its proposed safeguards are exactly what is needed to alleviate Big Data fears.  The opinion encourages the disclosure of decisional criteria and providing consumers with insight into how their data impact decision-making algorithms.  In many ways, the opinion comes close to encouraging some of the same mechanisms for getting users engaged with their data that I recently discussed.)

Fortunately, consent is one of six lawful bases for processing data in Europe. Article 7 of the Directive permits personal data to be processed where it is necessary for the “legitimate interests” pursued by the data controller, except “where such interests are overridden by the interests or fundamental rights and freedoms of the data subject.”  Arguably, as notice-and-consent requirements have become ever more legalistic and procedural, the legitimate interest ground increasingly becomes the default posture of data processors.

Indeed, as Europe debates its new data protection law, the legitimate interest ground has seen considerable discussion.  The Civil Liberties, Justice and Home Affairs (LIBE) Committee Report issued in December proposes that the legitimate interest provision could only be relied upon in “exceptional circumstances.”  The more industry-friendly Commerce Committee suggests that the European Data Protection Board should “set out comprehensive guidelines on what can be defined as ‘legitimate interest.’”  All of this activity suggests once again how challenging Big Data applications may be for European privacy law; tweaking how we understand principles such as purpose limitation does not by itself reconcile that law with the benefits and business realities of Big Data.

What’s Scary About Big Data, and How to Confront It

Today, the data deluge that Big Data presents encourages passivity and misguided efforts to get off the grid.  With an “Internet of Things” ranging from our cars to our appliances, even to our carpets, retreating to our homes and turning off our phones will do little to stem the datafication tide. Transparency for transparency’s sake is meaningless; we need mechanisms to achieve transparency’s benefits. More on the Future of Privacy Forum Blog.
