2 Who Knows What: Data Privacy and Ownership
2.1 Privacy, Surveillance, and the Going Dark Problem
How much access to our data should the government have?
2.1.1. “Battle of the Clipper Chip” by Steven Levy, New York Times (1994).
Tensions between the government and the technology-using public over cryptography and digital censorship are nothing new. The 1990s saw what many have now termed “Crypto Wars I,” a clash between government officials fearful of encryption-driven “warrant-free zones” – datastores inaccessible to law enforcement – and a broad coalition of technologists and civil libertarians. This article explores the “cypherpunk” movement, a wing of this coalition which vigorously resisted controls on cryptographic technology. How do the cypherpunk movement’s ideals and motivations reflect the internet’s founding principles (decentralization, layering, generativity, open-sourcing, etc.)? Does the digital surveillance environment look different today?
2.1.2. “Going Dark: Are Technology, Privacy, and Public Safety on a Collision Course?” by James Comey, Speech to Brookings Institution (2014).
This speech by FBI director James Comey – this time soon after the outbreak of Crypto Wars II – reflects the law enforcement and intelligence community’s position in the Going Dark debate. Do you find Comey’s argument convincing?
2.1.3. “What the Government Does with Americans’ Data” by Rachel Levinson-Waldman, Brennan Center (2013).
Read pp. 1-18; the remainder is optional but useful.
This report – published shortly after the Snowden disclosures – offers an overview of the United States intelligence apparatus and its data collection capacities. Does anything about it surprise you?
2.1.4. “Don’t Panic: Making Progress on the Going Dark Debate” by Jonathan Zittrain et al., Berkman Klein Center (2016).
This report, published as part of the Berklett Cybersecurity initiative, adds nuance to the Going Dark debate by highlighting characteristics of the digital environment that will likely facilitate evidence and intelligence gathering well into the future. How do you feel these nuances bear on the urgency of the Going Dark problem? How might the Going Dark problem be affected by the rise of platforms (and the breakdown of layerization/decentralization)?
2.2 Advertising and Data Monetization
When and how should technology companies be able to make money off of our data?
2.2.1. “Getting the Message: How the Internet is Changing Advertising” by Susan Young, Harvard Business Review (2000).
The promise of internet advertising was recognized from early in the internet’s history, but new technologies have driven massive acceleration in advertiser sophistication. What does this article – written in 2000 – get right about the future of advertising? If the authors had anticipated the rise of platforms, how might that knowledge have impacted their predictions?
2.2.2. “Re-Thinking the Network Economy: The True Forces That Drive the Digital Marketplace” by Stan Liebowitz, AMACOM (2002).
Chapter 6: “Can Advertising Revenues Support the Net?” Read 6 Intro and 6 part D.
Liebowitz’s comments on the online advertising business, written shortly after the dot-com crash, are notably bearish. He argues that the most feasible path to profitability for internet services runs through subscription-based models. Yet advertising is the primary source of revenue for internet content today – what was Liebowitz missing? Does access to enormous volumes of user data change the story Liebowitz is telling?
2.2.3. “A Brief History of Online Advertising” by Karla Cook, Hubspot (Updated 2018).
This post offers a brief overview of major developments in the history of online advertising, each of which contributes to an ever-growing hunger for data. How do these developments support or disrupt the arguments made in the two previous pieces?
2.2.4. “How Companies Learn Your Secrets” by Charles Duhigg, New York Times (2012).
Though this article is from 2012, it still demonstrates advertisers’ incredible degree of sophistication in data analysis. How should we think about the line between helpful and creepy in data-driven advertising? What guidelines or controls should be placed on experiments conducted by advertisers?
2.2.5. “Data Brokers: A Call for Transparency and Accountability” FTC Report (2014).
Read Executive Summary
Data brokers – entities which accumulate, package, and resell user data to advertisers, platforms, and other entities – are often as shadowy as they are unregulated. This FTC report recognizes some potential problems arising from this lack of oversight. How is user data aggregation by data brokers different from aggregation by the platforms we use? How might data brokers erode decentralization?
2.2.6. “'It might work too well': the dark art of political advertising online” by Julia Carrie Wong, The Guardian (2018).
There has been much recent outrage over the use of data-driven analytics technologies to target voter outreach and influence elections. How do the stakes of advertising change when it enters into the political realm?
2.2.7. “Facebook Lets Advertisers Exclude Users by Race” by Julia Angwin, ProPublica (2016).
Facebook – and other platforms like it – give advertisers an enormous slate of tools for precisely selecting an audience. It’s easy to see how this enormous breadth of data points could give rise to explicit or implicit discrimination. Who should get to decide what attributes advertisers can use for targeting? How should they make such decisions?
2.2.8. “Mark Zuckerberg Can Still Fix This Mess” by Jonathan Zittrain, New York Times (2018).
A wave of data- and advertising-related scandals has sapped the trust of Facebook’s users – and of the public at large. This op-ed calls for the company’s leadership to take ownership of these mistakes and move towards a responsibility-driven model of data management. What might Facebook reasonably say against such a proposal?
2.2.9. “How to Exercise the Power You Didn’t Ask For” by Jonathan Zittrain, Harvard Business Review (2018).
This piece outlines the “information fiduciaries” model mentioned in the previous piece. Would the existence of a fiduciary duty between users and the technology companies which rely on their data make you more comfortable with targeted internet advertising? What might make information fiduciaries different from other existing types of fiduciaries?
2.3 Artificial Intelligence
How and to what extent should technology companies use machine learning and other AI technologies to extract value and insights from user data?
2.3.1. “Thinking Machines: The Search for Artificial Intelligence” by Jacob Roberts, Science History Institute (2016).
This retrospective gives an overview of the course of AI development. How might current excitement and expectations surrounding AI influence development and deployment decisions made by advertisers and platforms?
2.3.2. “How Artificial Intelligence Can – And Can’t – Fix Facebook” by Tom Simonite, Wired (2018).
Platforms have often touted AI as a means of scaling difficult moderation and oversight functions – this piece analyzes some of the limitations of such an approach. Does AI deployment influence how we think about platform responsibility?
2.3.3. “Our Machines Now Have Knowledge We’ll Never Understand” by David Weinberger, Wired (2017).
The format in which machine learning algorithms represent the “knowledge” that drives their predictions is often not human-readable. What are some of the dangers of this lack of interpretability? Are there deployment situations in which we might be okay with uninterpretable algorithms which provide good results?
2.3.4. “Facebook Figured Out My Family Secrets, And It Won't Tell Me How” by Kashmir Hill, Gizmodo (2017).
This article offers an example of how a lack of consumer-facing interpretability – and a broader inability to demand explanations of model behavior – might make platform users uncomfortable. What do you think about Facebook’s claim that it couldn’t share information about its algorithm for purposes of privacy and competitiveness?
2.3.5. “Facebook tests machine learning to detect ads that discriminate by race” by Ken Yeung, VentureBeat (2017).
Artificial intelligence has the potential to detect and proactively stop discriminatory practices on the internet. What might be some unintended side effects of the deployment of such algorithmically mediated restrictions?
2.3.6. “Automated Experiments on Ad Privacy Settings” by Amit Datta et al., Proceedings on Privacy Enhancing Technologies (2015).
Read Introduction, 6, 7
This paper describes a number of experiments revealing – according to the authors – measurable discrimination in the advertisements that Google shows to end users. What difficulties might researchers (and platforms, and advertisers) face in defining what fairness and discrimination actually are?
2.3.7 Additional AI Learning Resources [OPTIONAL]
2.3.7.1. “Machine Learning for Humans” by Vishal Maini, Medium (2017)
2.3.7.2. Andrew Ng’s lectures on Machine Learning, YouTube (2016)
2.3.7.3. “Lecture 11: Introduction to Machine Learning, MIT 6.0002 Introduction to Computational Thinking and Data Science,” Eric Grimson, MIT OpenCourseWare (2016)
2.4 Jurisdictional Issues, the GDPR, and the Right to be Forgotten
Who’s in charge of the rules surrounding the collection and use of data?
2.4.1. “Be Careful What You Ask For: Reconciling a Global Internet and Local Law” by Jonathan Zittrain (2003).
The internet has given rise to a range of jurisdictional challenges. This piece, written in 2003, considers some of the inherent governance tradeoffs implicated by a global internet. How might the existence of platforms play into or complicate this analysis?
2.4.2. “Who Controls the Internet?” by Jack Goldsmith and Tim Wu (2006) (Chapter 4).
Read Chapter 4 – “Why Geography Matters”
It’s easy to ignore the fact that global internet connectivity is made possible by physical infrastructure, some of which runs across borders and oceans. Goldsmith and Wu consider the implications of a global internet infrastructure. What challenges might global-scale platforms face related to these problems of geography?
2.4.3. “Fact Sheet on the Right to be Forgotten Hearing” European Commission (2014).
The right to be forgotten is a hallmark of European internet governance, and, in relation to the United States, a prime example of differing jurisdictional standards. It has since been included under the GDPR. Why might a tech company providing services across multiple jurisdictions – some of which observe the right to be forgotten and some of which do not – choose to enforce the policy across all geographic versions of its platform? Why might it choose not to do so?
2.4.4. “Google to France: ‘Forget You’ - An Update on the Right to Be Forgotten” by Zoe Bedell, Lawfare (2016).
In 2016, Google announced a lawsuit against the French government over a fine French regulators had levied following Google’s refusal to “forget” links on a global basis. Under Google’s approach, a link removed after a right to be forgotten request by a French citizen would still be viewable in non-EU jurisdictions like the United States. How do you feel about Google’s position in this lawsuit? What are some potential risks posed by a decision for each side?
2.4.5. “WTF is GDPR?” by Natasha Lomas, TechCrunch (2018).
The GDPR is a complicated and wide-ranging piece of legislation – this somewhat crassly titled backgrounder explains its most important facets. To what extent do the data protection, data sharing, and consent standards described in the GDPR offer compelling solutions to the problems we’ve discussed in previous weeks? Is the medicine too weak (or too strong)?
2.4.6. “Business booms for privacy experts as landmark data law looms” by Salvador Rodriguez, Reuters (2018).
A significant cottage industry has sprung up around GDPR compliance. How might compliance demands make it difficult for upstart firms to compete with established behemoths?
2.4.7. “Yes, The GDPR Will Affect Your U.S.-Based Business” by Yaki Faitelson, Forbes (2018).
As in the case of the right to be forgotten, the global nature of the internet and its platforms means that regionally specific regulations often propagate across borders. How might this propagation shift global norms regarding data privacy and control (and the means by which platforms implement those norms)?
2.4.8. “An American Alternative to Europe’s Privacy Law” by Tim Wu, New York Times (2018).
In this piece, Tim Wu argues for the information fiduciaries concept as a better fit for the US legal system than a GDPR-like regulation. What enforcement challenges might emerge from, as Wu suggests, relying “on judges and state law to establish that the legal concept of ‘fiduciary duty’ can apply to technology companies”?