Big data: |
- Data protection and big data:
- G. D'Acquisto, J. Domingo-Ferrer, P. Kikiras, V. Torra, Y.-A. de Montjoye, A. Bourka (2015) Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics, European Union Agency for Network and Information Security (ENISA), 2015. ISBN: 978-92-9204-160-1, DOI: 10.2824/641480. (Open access)
- V. Torra, G. Navarro-Arribas (2016) Big Data Privacy and Anonymization, In Privacy and Identity Management 15-26. (Open access)
|
Protection Methods: |
- Review articles on data protection procedures for numerical and categorical data. First extensive comparison of masking methods with respect to risk and utility/information loss:
- Domingo-Ferrer, J., Torra, V. (2001) Disclosure control methods and information loss for microdata, Confidentiality, disclosure, and data access: Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 91-110. PDF@URV
- Domingo-Ferrer, J., Torra, V. (2001) A quantitative comparison of disclosure control methods for microdata, Confidentiality, disclosure, and data access: Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 111-133. PDF@URV
- Microaggregation: More information here
- Data protection for (numerical) temporal data (longitudinal data):
- Nin, J., Torra, V. (2006) Extending microaggregation procedures for time series protection, Lecture Notes in Artificial Intelligence, 4259 899-908. (5th Int. Conf. on Rough Sets and Current Trends in Computing, RSCTC 2006). http://dx.doi.org/10.1007/11908029_93
- Nin, J., Torra, V. (2009) Towards The Evaluation of Time Series Protection Methods. Information Sciences, Elsevier, 179:11 1663-1677. http://dx.doi.org/10.1016/j.ins.2009.01.024
|
IL Measures: |
- Information Loss and Data Utility (generic measures):
- Domingo-Ferrer, J., Torra, V. (2001) Disclosure control methods and information loss for microdata, Confidentiality, disclosure, and data access: Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 91-110. PDF@URV
- Domingo-Ferrer, J., Torra, V. (2001) A quantitative comparison of disclosure control methods for microdata, Confidentiality, disclosure, and data access: Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 111-133. PDF@URV
|
DR (generic) Measures: |
- Disclosure risk measures (generic measures using re-identification algorithms suitable for any data protection method):
- Nin, J., Herranz, J., Torra, V. (2008) Towards a More Realistic Disclosure Risk Assessment. In Privacy in Statistical Databases (PSD), volume 5262 of Lecture Notes in Computer Science, pages 152-165. Springer. PDF@Springer
- Torra, V., Abowd, J.M., Domingo-Ferrer, J. (2006) Using Mahalanobis Distance-Based Record Linkage for Disclosure Risk Assessment, Lecture Notes in Computer Science, 4302, 233-242 (PSD 2006).
PDF@Cornell
(In this paper we prove that record linkage and re-identification algorithms can also be used for evaluating the risk of synthetic data. We use both probabilistic and distance-based record linkage. Several different distances were used, including Mahalanobis and kernel-based distances.)
- Nin, J., Torra, V. (2006) Distance based re-identification for time series, Analysis of distances, Lecture Notes in Computer Science, 4302 205-216. (PSD 2006). PDF@Springer (On the evaluation of disclosure risk of data protection methods on numerical time series.)
- Torra, V., Domingo-Ferrer, J. (2003) Record linkage methods for multidatabase data mining, in V. Torra (Ed), Information fusion in data mining, Springer, ISBN 3-540-00676-1, 101-132. (This paper reviews in detail record linkage methods: distance-based and probabilistic.)
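The distance-based record linkage idea behind these generic risk measures can be illustrated with a minimal sketch. It is not the procedure of any of the papers above: it assumes that row i of the masked file was derived from row i of the original file, and it uses a standardized Euclidean distance rather than the Mahalanobis or kernel-based distances studied in those papers. The function name `linkage_risk` and the helper `standardize` are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows, means, stds):
    """Scale each column with the given means and standard deviations."""
    return [
        [(v - m) / s if s else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def linkage_risk(original, masked):
    """Share of masked records whose nearest original record is the true match.

    Assumption: masked[i] was produced from original[i], so a link is
    correct when the closest original record has the same index.
    """
    columns = list(zip(*original))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    orig_z = standardize(original, means, stds)
    mask_z = standardize(masked, means, stds)

    hits = 0
    for i, m in enumerate(mask_z):
        # Squared standardized Euclidean distance to every original record.
        dists = [sum((a - b) ** 2 for a, b in zip(m, o)) for o in orig_z]
        if dists.index(min(dists)) == i:
            hits += 1
    return hits / len(masked)
```

A value close to 1 means that almost every masked record can be linked back to its source, so the masking offers little protection against this attacker; a value near 1/n, for n records, is what random guessing would achieve.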
|
DR (specific - adhoc) Measures: |
- Disclosure risk measures (specific measures -- adhoc measures -- developed to attack particular data protection methods). These measures and studies are needed when we expect data releases to follow the transparency principle. Information about transparency here.
- Nin, J., Herranz, J., Torra, V. (2008) Rethinking Rank Swapping to Decrease Disclosure Risk, Data and Knowledge Engineering, 64:1 346-364. http://dx.doi.org/10.1016/j.datak.2007.07.006 (This paper describes an effective attack for the data protection method Rank Swapping.)
- Nin, J., Torra, V. (2009) Analysis of the Univariate Microaggregation Disclosure Risk, New Generation Computing, 27 177-194.
PDF@Springer
(This paper describes an effective attack for univariate microaggregation, a data protection method.)
- Nin, J., Herranz, J., Torra, V. (2008) On the Disclosure Risk of Multivariate Microaggregation, Data and Knowledge Engineering, 67 399-412.
http://dx.doi.org/10.1016/j.datak.2008.06.014
(This paper describes an effective attack for multivariate microaggregation, another data protection method.)
|
General: |
- Transactions on Data Privacy
- Navarro-Arribas, G., Torra, V. (eds.) (2015) Advanced Research in Data Privacy, Springer. Book @ Springer
- This book gives an overview of the main topics in data privacy. The book is a result of the ARES CONSOLIDER (CSD2007-00004) project.
Edited volumes:
- Privacy in Data Mining, Special issue of Data Mining and Knowledge Discovery 11:2 (2005), Springer, J. Domingo-Ferrer, V. Torra (Eds.). Includes:
- A Framework for Evaluating Privacy Preserving Data Mining Algorithms, by Elisa Bertino, Igor Nai Fovino and Loredana Parasiliti Provenza
- Preserving the Confidentiality of Categorical Statistical Data Bases When Releasing Information for Association Rules, by S. E. Fienberg and A. B. Slavkovic
- Probabilistic Information Loss Measures in Confidentiality Protection of Continuous Microdata, by J. M. Mateo-Sanz, J. Domingo-Ferrer and F. Sebé
- Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation, by J. Domingo-Ferrer and V. Torra
- Privacy in Statistical Databases 2004, Lecture Notes in Computer Science, 3050 (2004), Springer. J. Domingo-Ferrer, V. Torra (Eds.)
Includes:
- V. Torra, Microaggregation for Categorical Variables: A Median Based Approach (LNCS 3050 (2004) 162-174). PDF @ Springer
- V. Torra, S. Miyamoto, Evaluating Fuzzy Clustering Algorithms for Microdata Protection (LNCS 3050 (2004) 175-186). PDF @ Springer
- Special issue on Aggregation and re-identification in Statistical Disclosure Control, Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 10:5 (2002), World Scientific. V. Torra, J. Domingo-Ferrer (Eds.) Includes:
- V. Torra, J. Domingo-Ferrer, Editorial: Trends in aggregation and security assessment for inference control in statistical databases (pp. 453 - 457).
- J. Domingo-Ferrer, V. Torra, A critique of the sensitivity rules usually employed for statistical table protection (pp. 545-556).
- L. Sweeney, k-Anonymity: a model for protecting privacy (pp. 557 - 570).
- L. Sweeney, Achieving k-anonymity privacy protection using generalization and suppression (pp. 571 - 588).
|