Framing Big Data Debates

If finding the proper balance between privacy risks and Big Data rewards is the big public policy challenge of the day, we can start by having a serious discussion about what that policy debate should look like. In advance of my organization’s workshop on “Big Data and Privacy,” we received a number of paper submissions that attempted to frame the debate between Big Data and privacy. Is Big Data “new”?  What threats exist?  And what conceptual tools exist to address any concerns?

As part of my attempt to digest the material, I wanted to look at how several scholars have approached this debate.

This question is especially timely in light of FTC Chairwoman Edith Ramirez’s remarks on the privacy challenge of Big Data at the Aspen Forum this week. Chairwoman Ramirez argued that “the fact that ‘big data’ may be transformative does not mean that the challenges it poses are, as some claim, novel or beyond the ability of our legal institutions to respond.” Indeed, a number of privacy scholars have suggested that Big Data does not so much present new challenges as make old concerns ever more pressing.

For example, Professor Bill McGeveran’s essay looks back thirteen years, an eternity in the world of technology, to highlight the 2000 Stanford symposium on “Cyberspace and Privacy” as an excellent source for much thinking that can inform today’s Big Data debates. Issues like “propertization, technological measures, speech protection, and privacy myopia” take on “new salience because of Big Data,” McGeveran explains. “But they are not fundamentally different from the brilliant deliberations at the 2000 Symposium. To see how they apply today one must substitute the names of some companies and update some technological assumptions. But these cosmetic changes don’t compromise their theoretical core.”

Professor Neil Richards and Jonathan King also discuss how today’s Big Data is similar to yesterday’s cyberspace.  “The utopian ideal of cyberspace needed to yield to human reality,” the pair write. “[A] less wild-eyed and more pragmatic discussion about Big Data would be more helpful. It isn’t too much to ask for data-based decisions about data-based decision making.”

Part of the challenge is that the very term “Big Data” lends itself to hyperbole—it is “big” after all.  Thus, it becomes easy to portray Big Data both as the cure to everything that ails society and as a privacy-destroying Leviathan. When it comes to Big Data rhetoric, what exactly are we concerned about?

Richards and King start by identifying three rhetorical “paradoxes” that exist in Big Data discussions. Two of their paradoxes get at the bigger, more hypothetical challenges surrounding data analytics: first, its capacity to identify and classify everything, and second, the resulting power this will grant privileged government and corporate actors. These are big concerns, but they are not as grounded as the third “transparency paradox.”

The transparency paradox gets at one of the most overlooked yet obvious concerns surrounding Big Data. Basically, data analytics promises to use data to make the world a more transparent place. Data will tell us when a New York City manhole cover may explode or reveal how the human mind learns to speak. However, all of this data collection “is invisible, and [Big Data’s] tools and technology are opaque, shrouded by layers of physical, legal, and technical privacy by design,” they explain.

Regulators will likely be quick to agree with this point. Returning to Chairwoman Ramirez’s recent speech, she deemed transparency to be an “essential part” in reconciling Big Data and privacy.  “The time has come for businesses to move their data collection and use practices out of the shadows and into the sunlight,” she stated.

But transparency alone may fail to address the other paradoxes posed by Big Data. The problem, according to Cynthia Dwork and Professor Deirdre Mulligan, is that Big Data brings with it ubiquitous classification, at a level far beyond human attempts to categorize information. Privacy protections and transparency alone “offer a mixed bag with respect to discrimination,” the two explain, and do nothing to address the harms that rampant classification could inflict on society at large.

They suggest that we approach Big Data as one big “socio-technical system.”  “[B]ig data debates are ultimately about values first, and only second about math and machines,” they write.  In other words, we are the proverbial man behind the curtain, and we ought to recognize this fact moving forward.  This echoes the position taken by Jules Polonetsky and Omer Tene in their recent article that “the focus on the machine is a distraction from the debate surrounding data-driven ethical dilemmas.” “[I]t is the humans who must be held accountable,” they write.

By all accounts, we ought to be held accountable for not addressing the legitimate worries around Big Data—and for not working more diligently to achieve its benefits. The FTC has made clear it views its role with regard to Big Data as akin to a lifeguard at the beach, surveying the incoming tide. If I might extend the analogy, companies have been eager to surf the Big Data wave, but as far as society and the body politic are concerned, the jury is still out as to whether Big Data is a shark at sea or a life-preserver.  It’s time for all stakeholders to have a serious public policy discussion.
