Re-identification of familial database records.

Malin B
AMIA Annu Symp Proc. 2006: 524-8

PMID: 17238396 · PMCID: PMC1839550

Many genome-based research projects include familial relationships, such as pedigrees, with genomic data records. To protect anonymity when sharing family information, data holders remove or encode explicit identifiers (e.g., personal names). In this paper, however, we introduce IdentiFamily, a software program that can link de-identified family relations to named people. The program extracts genealogical knowledge from publicly available records and ascertains the re-identification risk for specific family relations. We find that robust genealogies of current populations can be extracted from online sources, such as newspaper obituaries and death records. We evaluate IdentiFamily on real-world data for a state's capital city and demonstrate unique identifiability for approximately 70% of the population. IdentiFamily provides organizations with a tool to evaluate the anonymity of pedigrees prior to disclosure and to design formal privacy protection techniques.
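
The idea of "unique identifiability" in the abstract can be illustrated with a minimal sketch. This is not IdentiFamily's actual algorithm, which the abstract does not describe. It assumes, hypothetically, that each family is summarized by a small structural signature (counts of sons, daughters, and living parents), and that a de-identified pedigree is re-identifiable when exactly one named family in the public data shares that signature. The field names, the functions `signature`, `unique_fraction`, and `link`, and the sample records are all illustrative assumptions.

```python
from collections import Counter

def signature(family):
    # Structural summary that survives de-identification.
    # The choice of features is an illustrative assumption.
    return (family["sons"], family["daughters"], family["living_parents"])

def unique_fraction(families):
    # Fraction of families whose signature appears exactly once.
    if not families:
        return 0.0
    counts = Counter(signature(f) for f in families)
    unique = sum(1 for f in families if counts[signature(f)] == 1)
    return unique / len(families)

def link(deidentified, named_families):
    # Return a name only when exactly one named family matches.
    target = signature(deidentified)
    matches = [f["name"] for f in named_families if signature(f) == target]
    return matches[0] if len(matches) == 1 else None

# Hypothetical named genealogy, e.g. as assembled from public records.
named = [
    {"name": "Family A", "sons": 2, "daughters": 1, "living_parents": 1},
    {"name": "Family B", "sons": 2, "daughters": 1, "living_parents": 1},
    {"name": "Family C", "sons": 0, "daughters": 3, "living_parents": 2},
    {"name": "Family D", "sons": 1, "daughters": 0, "living_parents": 0},
]

fraction = unique_fraction(named)
linked = link({"sons": 0, "daughters": 3, "living_parents": 2}, named)
ambiguous = link({"sons": 2, "daughters": 1, "living_parents": 1}, named)
```

In this toy data, Families A and B share a signature and so cannot be told apart, while C and D are each unique. A richer signature (ages, death dates, multi-generation structure) would generally make more families unique, which is the kind of risk a data holder would want to measure before disclosure.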

MeSH Terms (11)

Computer Security; Databases, Genetic; Female; Genealogy and Heraldry; Genetic Privacy; Humans; Male; Medical Records Systems, Computerized; Pedigree; Records; Software
