Balanced k-Anonymization

The technique of k-anonymization has been proposed to obfuscate private data by associating it with at least k identities. This paper investigates the basic tabular structures that underlie the notion of k-anonymization using cell suppression. These structures are studied under idealized conditions to identify the essential features of k-anonymization. We optimize k-anonymization by requiring a minimum number of anonymized values that are balanced over all columns and rows. We study the relationship between the sizes of the anonymized tables, the value k, and the number of attributes. This study has theoretical value in that it contributes to developing a mathematical foundation for the k-anonymization concept. Its practical significance remains to be investigated.
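As an illustration of the cell-suppression setting, the following minimal sketch checks whether a small table is k-anonymous and counts suppressed cells per row and per column. It assumes the common reading in which a table is k-anonymous when every row is identical to at least k-1 other rows after suppression, and it treats "balanced" as equal suppression counts across rows and across columns; both are assumptions made for this example, not definitions taken from the paper. The names `SUPPRESSED`, `is_k_anonymous`, and `suppression_counts` are hypothetical.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical marker for a suppressed (anonymized) cell


def is_k_anonymous(table, k):
    """Return True if every row occurs at least k times in the table.

    Assumption: a suppressed cell is compared as an ordinary value,
    so two rows match only if they are suppressed in the same columns.
    """
    counts = Counter(tuple(row) for row in table)
    return all(c >= k for c in counts.values())


def suppression_counts(table):
    """Return (per-row, per-column) counts of suppressed cells."""
    per_row = [sum(cell == SUPPRESSED for cell in row) for row in table]
    per_col = [sum(cell == SUPPRESSED for cell in col) for col in zip(*table)]
    return per_row, per_col


# Toy table: 4 rows, 2 attributes, one suppressed cell in each row.
table = [
    ["*", "0"],
    ["*", "0"],
    ["1", "*"],
    ["1", "*"],
]

print(is_k_anonymous(table, 2))
print(suppression_counts(table))
```

In this toy table each row carries one suppressed cell and each column carries two, so the suppression is evenly spread in the sense assumed above, and each row has exactly one identical partner.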



