Princeton University Library Catalog

Differential privacy : from theory to practice / Ninghui Li, Min Lyu, Dong Su, Weining Yang.

Author:
Li, Ninghui (Computer scientist) [Browse]
Format:
Book
Language:
English
Published/​Created:
[San Rafael, California] : Morgan & Claypool, 2017.
Description:
1 online resource (xiii, 124 pages)
Series:
  • Synthesis digital library of engineering and computer science. [More in this series]
  • Synthesis lectures on information security, privacy, and trust ; # 18. [More in this series]
  • Synthesis lectures on information security, privacy, and trust, 1945-9750 ; # 18
Summary note:
Over the last decade, differential privacy (DP) has emerged as the de facto standard privacy notion for research in privacy-preserving data analysis and publishing. The DP notion offers strong privacy guarantee and has been applied to many data analysis tasks. This Synthesis Lecture is the first of two volumes on differential privacy. This lecture differs from the existing books and surveys on differential privacy in that we take an approach balancing theory and practice. We focus on empirical accuracy performances of algorithms rather than asymptotic accuracy guarantees. At the same time, we try to explain why these algorithms have those empirical accuracy performances. We also take a balanced approach regarding the semantic meanings of differential privacy, explaining both its strong guarantees and its limitations. We start by inspecting the definition and basic properties of DP, and the main primitives for achieving DP. Then, we give a detailed discussion on the semantic privacy guarantee provided by DP and the caveats when applying DP. Next, we review the state of the art mechanisms for publishing histograms for low-dimensional datasets, mechanisms for conducting machine learning tasks such as classification, regression, and clustering, and mechanisms for publishing information to answer marginal queries for high-dimensional datasets. Finally, we explain the sparse vector technique, including the many errors that have been made in the literature using it. The planned Volume 2 will cover usage of DP in other settings, including high-dimensional datasets, graph datasets, local setting, location privacy, and so on. We will also discuss various relaxations of DP.
Notes:
Part of: Synthesis digital library of engineering and computer science.
Bibliographic references:
Includes bibliographical references (pages 113-121).
Source of description:
Title from PDF title page (viewed on November 8, 2016).
Contents:
  • 1. Introduction -- 1.1 Privacy violation incidents -- 1.1.1 Privacy incidents -- 1.1.2 Lessons from privacy incidents -- 1.2 On balancing theory and practice -- 1.3 Organization of this book -- 1.4 Topics for volume 2 --
  • 2. A primer on [epsilon]-differential privacy -- 2.1 The definition of [epsilon]-DP -- 2.1.1 Bounded DP or unbounded DP -- 2.2 Properties of [epsilon]-DP -- 2.2.1 Post-processing and sequential composition -- 2.2.2 Parallel composition and convexity -- 2.3 The Laplace mechanism -- 2.3.1 The scalar case -- 2.3.2 The vector case -- 2.4 The exponential mechanism -- 2.4.1 The general case of the exponential mechanism -- 2.4.2 The monotonic case of the exponential mechanism -- 2.4.3 Case study: computing mode and median -- 2.4.4 Discussion on the exponential mechanism -- 2.5 Case study: computing average -- 2.5.1 Applying the Laplace and the exponential mechanism -- 2.5.2 Applying the Laplace mechanism and composition -- 2.5.3 A non-private average algorithm using accurate count -- 2.5.4 NoisyAverage with accurate count -- 2.5.5 NoisyAverage with normalization -- 2.5.6 Which is best -- 2.6 Settings to apply DP -- 2.7 Bibliographical notes --
  • 3. What does DP mean? -- 3.1 Limitations of syntactic notions -- 3.2 Semantic guarantees of differential privacy -- 3.2.1 Infeasibility of achieving "privacy as secrecy" -- 3.2.2 Toward a "real-world-ideal-world" approach -- 3.2.3 DP as approximating the ideal world of "privacy as control" -- 3.2.4 A formulation of DP's semantic guarantee -- 3.2.5 The personal data principle -- 3.2.6 A case study in applying PDP -- 3.3 Examining DP and PDP -- 3.3.1 When the notion of neighboring datasets is defined incorrectly -- 3.3.2 When using DP in the local setting -- 3.3.3 What constitutes one individual's data -- 3.3.4 An individual's personal data or personal data under one individual's control -- 3.3.5 Group privacy as a potential legal Achilles' heel for DP -- 3.3.6 A moral challenge to private party benefiting from DP -- 3.4 Additional caveats when using DP -- 3.4.1 Using an [epsilon] that is too large -- 3.4.2 Applying a model to personal data -- 3.4.3 Privacy and discrimination -- 3.5 Bibliographical notes --
  • 4. Publishing histograms for low-dimensional datasets -- 4.1 Problem definition -- 4.1.1 Three settings -- 4.1.2 Measuring utility -- 4.2 Dense pre-defined partitioning -- 4.2.1 The baseline: a simple histogram -- 4.2.2 The hierarchical method -- 4.2.3 Constrained inference -- 4.2.4 Effect of privacy budget allocation in hierarchical histograms -- 4.2.5 Wavelet transforms and other optimizations -- 4.2.6 Beyond one-dimensional datasets -- 4.3 Lacking suitable partitioning -- 4.3.1 The uniform grid method--UG -- 4.3.2 The adaptive grids approach--AG, 2D case -- 4.3.3 Bottom-up grouping -- 4.3.4 Recursive partitioning -- 4.4 Bibliographical notes --
  • 5. Differentially private optimization -- 5.1 Example optimization problems -- 5.1.1 k-means clustering -- 5.1.2 Linear regression -- 5.1.3 Logistic regression -- 5.1.4 SVM -- 5.2 Objective perturbation -- 5.2.1 Adding a noisy linear term to the optimization objective function -- 5.2.2 The functional mechanism -- 5.3 Make an existing algorithm private -- 5.3.1 DPLloyd: differentially private Lloyd algorithm for k-means clustering -- 5.3.2 DiffPID3: differential private ID3 algorithm for decision tree classification -- 5.4 Iterative local search via EM -- 5.4.1 PrivGene: differentially private model fitting using genetic algorithms -- 5.4.2 Iterative local search -- 5.4.3 Enhanced exponential mechanism -- 5.5 Histograms optimized for optimization -- 5.5.1 Uniform grid and its extensions -- 5.5.2 Histogram publishing for estimating M-estimators -- 5.5.3 DiffGen: differentially private anonymization based on generalization -- 5.5.4 PrivPfC: differentially private data publication for classification -- 5.6 Bibliographical notes --
  • 6. Publishing marginals -- 6.1 Problem definition -- 6.2 Methods that don't fit the problem -- 6.2.1 The flat method -- 6.2.2 The direct method -- 6.2.3 Adding noise in the Fourier domain -- 6.2.4 Data cubes -- 6.2.5 Multiplicative weights mechanism -- 6.2.6 Learning based approaches -- 6.3 The PriView approach -- 6.3.1 Summary of the PriView approach -- 6.3.2 Computing k-way marginals -- 6.3.3 Consistency between noisy views -- 6.3.4 Choosing a set of views -- 6.3.5 Space and time complexity -- 6.4 Bibliographical notes --
  • 7. The sparse vector technique -- 7.1 Introduction -- 7.2 Variants of SVT -- 7.2.1 Privacy proof for proposed SVT -- 7.2.2 Privacy properties of other variants -- 7.2.3 Error in privacy analysis of GPTT -- 7.2.4 Other variants -- 7.3 Optimizing SVT -- 7.3.1 A generalized SVT algorithm -- 7.3.2 Optimizing privacy budget allocation -- 7.3.3 SVT for monotonic queries -- 7.4 SVT vs. EM -- 7.4.1 Evaluation -- 7.5 Bibliographical notes -- Bibliography -- Authors' biographies.
Indexed in:
  • Compendex
  • INSPEC
  • Google scholar
  • Google book search
Subject(s):
PrivacyMathematical models [Browse]
ISBN:
  • 9781627052979 (ebook)
  • (print)
Doi:
  • 10.2200/S00735ED1V01Y201609SPT018
Author:
Other views:
Staff view