CAREER: Rigorous Foundations for Data Privacy

Award #: 0747294
Amount Awarded: $400,000
Sponsoring Organization: NSF
Grant Period: 2008-2013
Principal Investigator(s): Adam Smith

Abstract

The ubiquity of collections of personal and sensitive data (census surveys, online social networks, and public health data, to name a few) has created a host of new problems stemming from conflicts between data access and privacy. An important challenge for these collections is to discover and release global characteristics of the database without compromising the privacy of the individuals whose data they contain. The problem has been studied extensively in such diverse fields as statistics, databases, and data mining. However, the approaches proposed in the literature, until very recently, either had no formal privacy guarantees or ensured security only against limited types of attacks. This project seeks to lay a firm conceptual foundation for the field of privacy in statistical databases, taking into account realistic, sophisticated adversarial attacks and bringing together ideas from several different sub-disciplines of statistics and computer science. The research is centered around three themes: (1) formulating realistic models and definitions of privacy that provide resistance against strong, even active, attacks; (2) understanding the types of information that can, and cannot, be revealed while retaining privacy according to those definitions; (3) investigating techniques that "break" anonymization protocols, in order to inform protocol design in the same way that cryptanalysis informs modern cryptography. The research is closely tied to questions of resilience and robustness in machine learning and statistics. To ensure the broader impact of the research, this project includes a program of educational and outreach activities, including new course development and workshop organization.

Related Publications