Abstract: The technique of k-anonymization has been proposed to obfuscate private data by associating it with at least k identities. This paper investigates the basic tabular structures that
underlie the notion of k-anonymization using cell suppression.
These structures are studied under idealized conditions to identify the
essential features of the k-anonymization notion. We optimize data k-anonymization
by requiring a minimum number of anonymized
values that are balanced over all columns and rows. We study the
relationship between the sizes of the anonymized tables, the value k, and the number of attributes. This study has theoretical value in that it contributes to a mathematical foundation for the k-anonymization
concept. Its practical significance is still to be
investigated.
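As a minimal illustration of the notion studied here, the following Python sketch checks whether a table with suppressed cells is k-anonymous and counts suppressed cells per row and per column. It assumes the common reading in which a suppressed cell is written as "*" and a table is k-anonymous when every row is identical to at least k-1 other rows; the function names and the toy table are hypothetical and are not taken from the paper.

```python
from collections import Counter

SUPPRESSED = "*"  # assumed marker for a suppressed cell


def is_k_anonymous(table, k):
    """Return True if every row pattern occurs at least k times."""
    counts = Counter(tuple(row) for row in table)
    return all(c >= k for c in counts.values())


def suppression_profile(table):
    """Return (total, per-row counts, per-column counts) of suppressed cells."""
    per_row = [sum(cell == SUPPRESSED for cell in row) for row in table]
    n_cols = len(table[0])
    per_col = [sum(row[j] == SUPPRESSED for row in table) for j in range(n_cols)]
    return sum(per_row), per_row, per_col


# Toy table: 4 rows, 3 attributes, already suppressed.
table = [
    ["*", "b", "c"],
    ["*", "b", "c"],
    ["x", "*", "z"],
    ["x", "*", "z"],
]

total, per_row, per_col = suppression_profile(table)
print(is_k_anonymous(table, 2), total, per_row, per_col)
```

In this toy table the suppressed cells are balanced over the rows (one each) but not over the columns, which is the kind of imbalance the optimization criterion above is meant to rule out.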
Abstract: Probability-based identity disclosure risk
measurement may give the same overall risk for different
anonymization strategies applied to the same dataset. Some entities in the
anonymous dataset may have higher identification risks than the
others. Individuals are more concerned about higher-than-average
risks and want to know whether they may be
exposed to such a higher risk. The notion of overall risk in the above
measurement method does not indicate whether some of the involved
entities have higher identity disclosure risk than the others. In this
paper, we have introduced an identity disclosure risk measurement
method that not only implies overall risk, but also indicates whether
some of the members have higher risk than the others. The proposed
method quantifies the overall risk based on the individual risk values,
the percentage of the records that have a risk value higher than the
average, and how much larger the higher risk values are compared to the
average. We analyze the disclosure risks for different
disclosure control techniques applied to original microdata and
present the results.
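The three ingredients named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the proposed method: the individual risk of a record is assumed to be 1 divided by the size of its equivalence class (a common probability-based measure), and the abstract does not state how the ingredients are combined, so they are reported separately. All names and the toy data are hypothetical.

```python
from collections import Counter


def individual_risks(records):
    """Assumed probability-based risk: 1 / size of the record's equivalence class."""
    sizes = Counter(records)
    return [1.0 / sizes[r] for r in records]


def risk_summary(risks):
    """Return (average risk, share of records above average,
    mean above-average risk relative to the average)."""
    avg = sum(risks) / len(risks)
    above = [r for r in risks if r > avg]
    share_above = len(above) / len(risks)
    # Ratio is 1.0 when no record exceeds the average.
    relative_excess = (sum(above) / len(above)) / avg if above else 1.0
    return avg, share_above, relative_excess


# Two hypothetical anonymizations of the same 8 records, each yielding 4 classes.
# Each label stands for the anonymized quasi-identifier pattern of a record.
records_a = ["g1", "g1", "g2", "g2", "g3", "g3", "g4", "g4"]  # classes of size 2
records_b = ["h1", "h2", "h3", "h4", "h4", "h4", "h4", "h4"]  # sizes 1, 1, 1, 5

summary_a = risk_summary(individual_risks(records_a))
summary_b = risk_summary(individual_risks(records_b))
print(summary_a)
print(summary_b)
```

Both anonymizations have the same average risk (about 0.5), but only the second has records whose risk exceeds the average, which is the distinction an overall average alone cannot express.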