Ten Simple Rules for Responsible Big Data Research

Privacy and Security and Artificial Intelligence

Article Snapshot


Solon Barocas, danah boyd, Kate Crawford, Alyssa Goodman, Rachelle Hollander, Emily Keller, Barbara Koenig, Jacob Metcalf, Arvind Narayanan, Alondra Nelson, Frank Pasquale, Seeta Peña Gangadharan and Matthew Zook


PLOS Computational Biology, Vol. 13, No. 3, e1005399, 2017


Use of big data in academic and industry research is growing. Studies of human psychology, biology, and behavior must be ethical. Researchers should start by recognizing that careless use of data can be harmful.

Policy Relevance

Researchers’ use of big data should be sound and accurate, and it should maximize good while minimizing harm.

Main Points

  • In the five years before the article's 2017 publication, the use of big data in research by academia and industry grew; big data research practices include:
    • Mining medical records for scientific and economic information.
    • Mapping relationships using social media.
    • Using sensors to record speech and actions.
    • Tracking individuals' movements.
  • Complex ethical issues arise in conducting big data research; all big data research on social, medical, psychological, and economic phenomena involves human subjects, and researchers ought to minimize harm.
  • Researchers should acknowledge that data points represent people and that data can harm people; even seemingly neutral datasets used to determine credit risk or shape criminal justice decisions can produce unfair outcomes.
  • Researchers should recognize that privacy is contextual, not simple; just because something has been shared publicly does not mean that use of it in research is unproblematic.
  • Researchers should guard against reidentification of anonymized data; many seemingly nonspecific factors such as battery usage, spatial location, birthdate, gender, zip code, and facial images can be used to identify individuals, especially when combined.
  • Researchers should debate tough ethical choices with colleagues and those in other disciplines on an ongoing basis.
  • Researchers should develop a code of conduct for their organization, research community, or industry.
  • Researchers should design datasets and systems to be audited, developing automated testing procedures and clearly documenting when decisions are made.
  • Researchers should know when to break the rules; in times of natural disaster or public health emergency, one might put aside questions of individual privacy for the greater good.
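
The point about reidentification can be illustrated with a small sketch. It assumes a hypothetical list of records with made-up `gender` and `zip_code` fields. The function name `smallest_group_size` is our own and does not come from the article. It reports the size of the smallest group of records that share the same values for a chosen set of fields. A result of 1 means at least one record is unique on those fields and may be reidentifiable.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share
    identical values for every field in `fields`."""
    groups = Counter(
        tuple(record[field] for field in fields)
        for record in records
    )
    return min(groups.values())


# Hypothetical, made-up records used only for illustration.
records = [
    {"gender": "F", "zip_code": "02138"},
    {"gender": "F", "zip_code": "02139"},
    {"gender": "M", "zip_code": "02138"},
    {"gender": "M", "zip_code": "02139"},
]

# Each field on its own leaves every record in a group of two.
k_gender = smallest_group_size(records, ["gender"])
k_zip = smallest_group_size(records, ["zip_code"])

# Combining the two fields makes every record unique.
k_combined = smallest_group_size(records, ["gender", "zip_code"])

print(k_gender, k_zip, k_combined)
```

In this made-up data, each field alone hides every person in a group of two, while the combination singles out every record. This is one simple measure, often called k-anonymity, and is offered as an illustration rather than as a method described in the article.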
