Daniel Solove Contends Privacy Law Regulations Should Be Based on Use, Harm, and Risk

By TAP Staff Blogger

Posted on February 17, 2023


“Sensitive data is not more harmful than non-sensitive data. It is the use of the data that matters.”
from “Data Is What Data Does…” by Daniel Solove


In a new article, George Washington Law professor and privacy expert Daniel Solove contends that privacy law requires rethinking. “Data Is What Data Does: Regulating Use, Harm, and Risk Instead of Sensitive Data” explains that privacy law “protections should be based on the use of personal data and proportionate to the harm and risk involved with those uses.”


Below are a few takeaways from “Data Is What Data Does: Regulating Use, Harm, and Risk Instead of Sensitive Data” by Daniel Solove (January 2023).


The Problem with the Sensitive Data Approach


Heightened protection for sensitive data is becoming quite trendy in privacy laws around the world. These provisions in privacy laws are based on a recognition that a uniform level of privacy protection would be too simplistic.


To avoid treating serious and minor situations uniformly, many privacy laws designate a set of special categories of personal data called “sensitive data” that receive heightened protections. With sensitive data, privacy laws offer two levels of protection: a baseline level for regular personal data and a heightened level for sensitive data. Although two levels might not be granular enough, two is certainly better than one. Commonly recognized special categories of sensitive data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, health, sexual orientation and sex life, biometric data, and genetic data.


This Article argues that the problems with the sensitive data approach make it unworkable and counterproductive — as well as expose a deeper flaw at the root of many privacy laws. These laws make a fundamental conceptual mistake — they embrace the idea that the nature of personal data is a sufficiently useful focal point for the law. But nothing meaningful for regulation can be determined solely by looking at the data itself. Data is what data does. Personal data is harmful when its use causes harm or creates a risk of harm. It is not harmful if it is not used in a way to cause harm or risk of harm.


Types of Data Recognized as Sensitive


  Under the GDPR, sensitive data includes the following special categories of personal data:

  • racial or ethnic origin
  • political opinions
  • religious or philosophical beliefs
  • trade union membership
  • health
  • sex life or sexual orientation
  • genetic data
  • biometric data

A 2019 analysis of the definitions of sensitive data in 112 countries found that the most commonly recognized categories of sensitive data are the types defined by the GDPR. A major divergence is that the laws of many other countries include criminal records as sensitive data whereas the GDPR does not (although it does give criminal records special protection outside its sensitive data categories). Other commonly recognized types of sensitive data include credit information and identification numbers or documents. Some countries define sensitive data as including data related to private life, personal habits, or private relationships.


Overall, privacy laws have significant overlap in the categories of data they recognize as sensitive, but they also have many differences. The result is a rather complicated landscape from jurisdiction to jurisdiction, making compliance with the laws quite challenging. Organizations must classify their personal data (a practice known as “data mapping”), identifying which data is sensitive because it must be treated differently. With more than 70% of the world’s 194 countries having comprehensive privacy laws (most of which include sensitive data), plus more than 50 laws in different U.S. states, mapping which data is sensitive is a complicated task because the same type of data may be sensitive in some jurisdictions but not others. Sometimes, categories are similar yet slightly different, such as “health” data under the GDPR versus “mental or physical health diagnosis” under the Virginia CDPA.


Inferences about Sensitive Data Can Readily Be Made from Non-Sensitive Data


Under several major privacy laws, it is clearly established that inferences count for sensitive data — any personal data from which sensitive data can be inferred will also be deemed to be sensitive data. The problem, though, is that the implications are far greater than currently recognized. In today’s age of Big Data, personal data is readily aggregated with other pieces of personal data and fed into hungry algorithms that generate inferences about people.


In an age of inference, nearly all regular personal data can, either in isolation or combination, give rise to inferences about sensitive data.


A few relatively obvious examples of ways to infer sensitive data from non-sensitive data include:

  • Data about patterns of electricity use can be used to infer that a person or household is Orthodox Jewish, since Orthodox Jews do not use electricity on the Sabbath.
  • Data about food consumption can be used to infer religion, since Muslims, Jews, Hindus, and members of other faiths avoid particular foods or eat particular foods on particular holidays.
  • Data about food consumption can also be used to infer health conditions, since certain diets correspond to certain conditions (such as gluten-free for celiac disease or sugar-free for diabetes).
  • Location data can be used to determine the religious or political institutions a person visits.

Focusing On Use, Harm, and Risk


Privacy law should stop focusing on the nature of personal data. The particular type of personal data does not indicate anything important when it comes to determining how to protect it. What matters is use.


The sensitive data approach falters because it is centered on a conceptual mistake — it views the nature of the data as significant for determining the appropriate level of protection. As I discussed above, the nature of the data tells us little of value.


What matters is use, harm, and risk.


Use involves what is done with personal data, the activities that organizations undertake with it, and the purposes and effects of those activities. Harm involves negative consequences from the use of personal data that affect individuals or society. Risk involves the likelihood and gravity of certain harms that have not yet occurred.


An Example Considering Data About a Person’s Religion


The way data is protected depends upon use, harm, and risk. For example, consider personal data about a person’s religion — that a person identifies as being of a particular faith. Many privacy laws would deem this to be sensitive data. But without knowing how the data will be used, it is not clear what protections are appropriate.


If the data about a person’s religion is confidential, then the law should protect its confidentiality by restricting disclosure, imposing strong duties of confidentiality, and protecting the confidential relationships where this data is created and shared. But in many cases, data about religion is not confidential. Suppose the person is a religious leader. Protection of this data as confidential would be meaningless — and even contrary to the person’s desires, which might be to have this information widely known.


If the data were used to discriminate against the person because of their faith, then this use would be harmful. Confidentiality protection would not be helpful since the data is already widely known. Meaningful protection would need to focus on stopping the data from being used to discriminate.


The law should address harms no matter what type of personal data is used — whether it be data directly about the person’s religion, data that is a proxy for the person’s religion, or data completely independent of the person’s religion but used for these problematic purposes.


As this example demonstrates, the law’s protections cannot be one-size-fits-all, as the particular uses, harms, and risks are quite different. Not every problem is the same. Looking at the data itself fails to tell us how it should be protected.


The Challenge of Complexity with Privacy Harms


In a recent article I wrote with Danielle Citron, we set forth a wide array of privacy harms that have a basis in existing law and cases. [See “Privacy Harms,” 101 B.U. L. Rev. 793 (2022).] We note that courts and policymakers are inconsistent in their recognition of privacy harms and often falter, adopting narrow, simplistic notions of harm rather than the broader and more pluralistic harms that we identify. But the harms we identify have a basis in precedent and are not far-fetched. In an earlier article about data breach harms, we noted how some courts quickly stated that the law did not recognize emotional distress alone as cognizable harm, ignoring more than a century of indisputable precedent from hundreds (if not thousands) of privacy tort cases that did recognize emotional distress alone as sufficient to establish harm. [See “Risk and Anxiety: A Theory of Data Breach Harms,” 96 Tex. L. Rev. 737 (2018).]


Focusing on use, harm, and risk is the only viable path, however difficult it may be. The sooner privacy law realizes this truth — as inconvenient as it may be — the better it will be for starting to journey down the right path. Of course, not all use cases are clear. The harmfulness of many uses is in dispute. But the fact that there are areas of contention and blurriness should not be a deterrent, as the boundaries of sensitive data are even less clear. Ultimately, there is no escape from the hard work of figuring out how to regulate certain uses. Privacy is immensely complicated, and it is highly contextual.


Read the full article: “Data Is What Data Does: Regulating Use, Harm, and Risk Instead of Sensitive Data” by Daniel Solove (January 2023).




Daniel Solove is the Eugene L. and Barbara A. Bernard Professor of Intellectual Property and Technology Law at the George Washington University Law School. One of the world’s leading experts in privacy law, Professor Solove has lectured at universities, companies, and government agencies around the world and been interviewed and quoted by the media in several hundred articles and broadcasts, including The New York Times, Washington Post, Wall Street Journal, USA Today, Chicago Tribune, the Associated Press, ABC, CBS, NBC, CNN, and NPR.


Professor Solove is the author of numerous books, including Breached! Why Data Security Law Fails and How to Improve It (Oxford 2022) (with Woodrow Hartzog), Nothing to Hide: The False Tradeoff Between Privacy and Security (Yale 2011), Understanding Privacy (Harvard 2008), and The Future of Reputation: Gossip and Rumor in the Information Age (Yale 2007). The Future of Reputation won the 2007 McGannon Award. Additionally, he has written a children’s fiction book about privacy called The Eyemonger (2020). Professor Solove's books have been translated into Chinese, Italian, Korean, Japanese, and Bulgarian, among other languages.