Elettra Bietti, "Locked-In Data Production: User-Dignity and Capture in the Platform Economy"

‘For the Record’ is a feature where our Fellows-in-Residence and Graduate Fellows have a chance to present their research ideas informally, reflect on their experience at the Center, or report on Center events. The views, thoughts, and opinions expressed in the text belong solely to the author, and not necessarily to the Edmond J. Safra Center for Ethics or other group or individual.

This month's For the Record comes from Eugene P. Beard Graduate Fellow in Ethics, Elettra Bietti, SJD student at Harvard Law School and an affiliate at the Berkman Klein Center for Internet and Society. Her thesis focuses on information gatekeepers such as Facebook and Google. She is currently exploring these companies' moral and legal obligations towards individuals through a methodology drawn from political theory and public law, considering them as sites of contestation in which new interests and forms of social organization demand a reconfiguration of individual rights, entitlements, and obligations.


Locked-in Data Production: User-Dignity and Capture in the Platform Economy

Elettra Bietti

“Data” can be many different things: information about a person (e.g., date of birth, name of pet, current or past location), content (e.g., this blog post), metadata or information about information (e.g., where data is stored, who stored it and how, or the number of amendments that this blog post underwent), information that might indirectly relate to a person (e.g., the number of people who visited the gym yesterday, the most in-demand body product in a neighborhood), or information that is not about people at all (e.g., the chemical composition of the ozone layer). With the advent of the Internet, of large technology platforms, and of the data economy, questions emerge as to how flows of data, or subsets of it, should be governed and constrained, particularly as they relate to persons, and how data might affect their lives and well-being.

Consider three scenarios:

Ali has a passion for playing classical music and a YouTube channel. In his free time he records and posts videos of his piano performances. He has a few videos with more than 100K views. Should he be compensated by YouTube for the traffic that his videos generate, and if so how?

Eli likes good food. At Whole Foods, the cashier always asks her if she has an Amazon Prime account. When she links her Prime account to her purchase she normally obtains a five-dollar discount. Has she been fairly compensated for her data through that discount? Does she have additional rights to understand whether that compensation was fair?

Oli is a PhD student and spends a lot of time on Facebook. Should Oli’s time on Facebook be compensated on the ground that it generates useful data for Facebook? Should Oli have other rights against Facebook on the basis of his participatory activities?

These three cases strike us as very distinct, yet a common thread underlies them: What rights do individuals have to be compensated by or to share profits with platforms for the data and content they contribute? Is there a simple and general way to clarify, correct, and potentially address the existing power asymmetries between individuals and platforms in these contexts? My intuition is that the answer to these questions must depend on the specific facts of each case, because different kinds of information are produced in different ways, raise separate problems, and demand separate solutions. Looking for a one-size-fits-all solution that applies to all personal information including content and data, on all platforms and for all users, would appear myopic and wrong. My aim in what follows is to show that any normative view on which a market mechanism is capable of achieving adequate compensation, and of addressing all claims that might exist between Ali, Eli, and Oli and platforms such as YouTube, Amazon, and Facebook, respectively, is wrong and leads to confusion. To do this I will first briefly provide some background on the data economy and the harms it presents, and then will articulate my view by showing why three means of achieving adequate compensation fall short of addressing those harms.


Data Optimists and Data Pessimists

Scholars who have acknowledged and shed light on the important role of data in the digital economy can be divided into two groups: optimists, who focus on the economic promises of the use of data as a raw material or form of capital, and pessimists, who have highlighted the exploitative effects of the data economy on social life. Amongst optimists are many computer scientists and economists, who benefit from large datasets and celebrate the benefits of data for markets and innovation. Hal Varian, Chief Economist at Google, shows that amongst other benefits, data reduces information and transaction costs for firms, enabling the production of quality or highly personalized products at lower cost, and their distribution at larger scale. His paradigmatic example is that of car rental companies. Accident rates are a function of compliance with the speed limit, and monitoring that compliance can reduce accident costs for car rental companies and consequently also rental costs for consumers. Varian’s assumption is that everyone would prefer to pay less for car rental by driving within the speed limit, and thus would have reason to accept a computer transmitter in the trunk of the car that subsidizes compliant behavior. Pessimists, on the other hand, do not think there is necessarily good reason to accept being surveilled in such a way simply for the sake of a cheaper service. Julie Cohen and Shoshana Zuboff, for instance, articulate critiques of the data economy, denouncing the process of data appropriation and commoditization of every aspect of human life by private businesses, an economic logic based on reckless and indifferent privatized surveillance so pervasive that it is difficult to diagnose. They each suggest reasons why car drivers, in Varian’s example, ought to object to a computer transmitter in the car trunk by default.

There are two primary reasons why we ought to resist the surveillance economy. First, surveillance entails a level of unreasonable intrusion into our lives. It is unreasonable not only because it is in most cases unwanted but because it is opaque and does not provide us with opportunities to understand how it may affect us and to conduct ourselves accordingly. Something is not surveillance unless it is being done to us (or our data) either without our knowing or in ways that we can only partially apprehend or understand. There is a sense in which this intrusion, without an ability or opportunity to know and defend oneself from it, affects our interest in being treated with the respect that is owed to a person. The second reason for resisting the surveillance economy is that it leads to objectionable commodification of information about us. Like the sale of a body part, it could be argued that selling one’s data, especially if it becomes compulsory, is an act that goes against our sense of self-worth as humans, against the attitude of respect that we are owed as persons. Further, commodification creates a disconnect between an individual and information about them, subjecting information to the self-serving laws of a market, and leading to consequences that are unpredictable, often difficult to understand, and possibly harmful. A third, related reason to resist a surveillance economy is that commodification and surveillance reinforce each other: the more one subjects data to market processes, the higher the risks of unreasonable surveillance.

While many scholars acknowledge these preoccupations, the more optimistic among them believe that all of the surveillance economy’s harms can be compensated through wages or price. For example, in response to Varian’s view, Eric Posner and Glen Weyl have suggested that a new market should be created over data: a market in which individual data ‘producers’ such as Ali, Eli, and Oli provide data to platforms or data acquirers such as YouTube, Amazon, and Facebook in exchange for a wage or other form of compensation. Underlying this view, often referred to as Data as Labor (or DaL), is a faith that market forces alone, if properly put in motion, are capable of fairly redistributing resources. In parallel to DaL, startups are creating new ways of monetizing data through blockchain technologies, individuals are collectivizing around data cooperatives, and initiatives such as Tim Berners-Lee’s Solid or the Microsoft Project Bali are seeking to give individuals greater control over their information. What each of these proposals normatively entails is at present far from clear or univocal.

I now explore three paradigmatic conceptions of market-based data governance, showing that none can be exhaustive.


Data as Ownership

The first conception of market-based data governance is based on the belief that a private property rights regime over data would be capable of achieving the fairest possible distribution of rewards in the data economy. Robert Nozick famously argued that a morally sound allocation of property and resources can only be achieved through a minimalist state that relies on what he calls principles of justice in holdings, the main ones being justice in acquisition and justice in transfer. Applied to the data economy, however, such an entitlements view is normatively weak.

Ownership claims over data are problematic in at least five ways. First, most online data that an individual provides, voluntarily or involuntarily, is entered through a browser into someone else’s proprietary domain system, is processed by the domain owners through human or machine interventions, and both the entered data and any output that results from processing are stored on someone else’s servers located who knows where. Even if an individual has proprietary claims, it is also clear that the data could simultaneously be the subject of proprietary claims by the website owners and the server owners amongst others, each of which could arguably pre-empt the individual’s claims. When OKCupid employs user data to generate a match, the resulting match can’t be said to belong to either of the users who are being matched, in spite of the fact that the information relates to them in a very personal way and they may be entitled to claims over how it is used. Second, personal data about different people is frequently mixed up, so that giving ownership rights to one person can curtail others’ rights. Third, if one of ownership’s core characteristics is the ability to exclude, it is unclear that proprietary claims over data can have this feature at all. If a painter paints a portrait of me, I do not in fact automatically acquire property rights over the portrait because it contains information about me, nor can a compelling moral case be made for my having property rights in this case. Unless the artist decides to give me the portrait as a gift, or to sell it to me, I have no right to exclude others from it, or to alienate or sell it. Fourth, ownership creates a market over data as a commodity, and as such entails a specific kind of harm: that of severing the self from personal data as an object, allowing monetization and tradability of such object, and obscuring the losses in human integrity that result from such alienation. Fifth, an ownership regime makes two unwarranted assumptions: that it is acceptable for individuals to decide through exercises of rational volition how their data should be priced, and also that it is possible for individuals to possess sufficient information about what happens to their data to make an informed choice as to price. The reality and relative opacity of digital life render such assumptions implausible, and it is likely that an ownership regime would benefit the most informed and educated of data producers to the detriment of the helpless and misinformed, who could easily be tricked into selling their data at lower than market value. The moral case for allocating the benefits of the digital economy through a private ownership regime is thus very weak.


Data as Compensation

The second conception of market-based data governance holds that the most compelling form of distribution of resources in the data economy demands that individuals be offered some form of compensation for the data they produce and the time they spend online. Compensation could consist in the payment of wages as fixed-term or short-term employees with associated labor protections, or as contractors without the associated labor protections, or even in simple discounts or vouchers.

This view possesses at least three flaws. First, it has been shown that if work is perceived as leisure, people are more likely to do it for free, while if something is perceived as labor, people will be more reluctant to do it even if it is compensated. This is what Glen Weyl and Eric Posner call the ‘Tom Sawyer’ problem, which incentivizes platforms not to compensate their users and to keep generating participation and addiction through gamified incentives. Second, as Tommie Shelby has pointed out, a sense of self-worth or dignity is not conditional upon our having employment or receiving compensation. Employment or compensation can serve individuals’ sense of self-respect and dignity only if it surpasses some threshold of decency, and it is not clear that digital labor as conceived by Posner and Weyl would meet that threshold. Third, even if a more than minimal decency threshold were met, and if labor protections were present, the key problem remains that considering data as a commodity to be exchanged on a labor market licenses and further entrenches misuse of data, commodifying real persons by embedding them into a production system that they have neither chosen to be part of nor can decide to remove themselves from. We need not just regulated market mechanisms for trading data but further restraints on what and how much data can be generated and collected. Even the best moral understanding of Data as Compensation does not give individuals the power to determine how the data economy is managed and how much data will be produced and commodified.


Data as Share of Profits

The third conception of market-based data governance focuses on tracing outputs rather than inputs and compensates individuals proportionately to the profits made by technology platforms. There are two versions of this conception. The first is the view that individuals ought to have a right to get a share of the profits a company made through data about them which is equivalent to and based on the value or amount of data that they contributed. The second, and more ambitious, view demands a radical form of co-ownership over the business as a whole: a form of communal shareholding or some other sui generis rights of users to use and enjoy the fruits of the business as a joint venture and to participate in its governance.

As regards the first option, English law provides remedies such as the ability to trace and follow assets that were misappropriated by someone who acted wrongfully and in breach of trust. Such remedies have historically required the courts’ involvement, and it would seem burdensome and highly inefficient to demand that courts get involved every time an individual needs to claim compensation from technology platforms for unlawful misappropriation. This view is also parasitic on an understanding of data as an asset that can be owned and traced. Nonetheless, efforts in line with this view are currently developing in two directions. The first is to reconfigure data governance as a question of management of data flows. Accordingly, the relevant asset is not the particular bit of data but rather the flow that all these bits jointly form. The second is to imagine the implications of imposing fiduciary or trustee obligations on data handlers for the benefit of individuals, as suggested by Jack Balkin.

The other option, joint or common ownership, departs from the other understandings of data governance that I presented by significantly expanding its embrace. Such a system would allow individuals to negotiate organically with platforms the terms of their engagement on an ongoing basis. While market-based mechanisms are a good way of handling data valuation questions and transfers of title, they provide no answer on how individuals and platforms are to negotiate any post-transfer uses or trades of data and data flows. A system of common governance of algorithmic processes, data capture, and use not founded on pure private property-based transactions instead enables individual empowerment at this later and long-lasting stage in the relationship between a technology infrastructure provider and individuals. This framework is in tension, however, not only with traditional understandings of private property or market-based mechanism design, but also with many of our currently entrenched assumptions about the generation, collection, and use of data.


It seems that market-based conceptions of data governance cannot alone remedy our discontent with the platform economy, let alone act as a panacea for data governance. Compensating Ali, Eli, and Oli on the basis of their participatory actions does not address our concerns with the increasing commodification of data and the surveillance risks that result from it.

Take the three cases in turn. In Ali’s case there is a commodity, the videos, that he can arguably legitimately monetize, but compensation does not trump all of Ali’s claims against YouTube. In Eli’s case there is no commodity to be bargained over. Eli might have a right (e.g., under the GDPR) to know more about the discount and how it was computed, but she does not have a legal right to know how Amazon values her data. One could argue she ought to have such a right, but under current law she does not. Oli probably has no rights to compensation whatsoever, but it can be argued that he ought to have rights to ensure that Facebook not use his data to target objectionable content or political propaganda at him or others. Legally speaking, Ali, Eli, and Oli’s situations are as diverse as their available forms of legal redress are piecemeal. Morally, it seems that their claims to benefit from their own involvement in the data economy ought to go much further than mere compensation: in addition to any claims to economically benefit from their activities, they have moral rights to determine in concert with others the kind of digital future they want, how much data it is acceptable to collect and use and for what purposes, and how the benefits of such collection should be allocated. The problem, in a nutshell, is that opting for market- or property-based mechanisms leaves private platform companies with too much objectionable power over their users.

Even non-market solutions such as co-ownership are not as simple or straightforward as we might have hoped, however. But if not co-ownership, then what? Are participatory governance structures for deciding what should happen to our data and to platform infrastructure the only way forward, or are they unacceptable, as some privacy scholars seem to believe? Why do market-based solutions seem so appealing, and why is the myth of the rational decision-maker deciding his fate in solitude so difficult to debunk in light of a preponderance of evidence that it does not serve us well in the digital ecosystem?