About this data

The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.

Controlling the signal: Practical privacy protection of genomic data sharing through Beacon services.
Wan Z, Vorobeychik Y, Kantarcioglu M, Malin B
(2017) BMC Med Genomics 10: 39
MeSH Terms: Computer Security, Gene Frequency, Genomics, Humans, Information Dissemination
Added April 10, 2018
BACKGROUND - Genomic data is increasingly collected by a wide array of organizations. As such, there is a growing demand to make summary information about such collections available more widely. However, over the past decade, a series of investigations have shown that attacks, rooted in statistical inference methods, can be applied to discern the presence of a known individual's DNA sequence in the pool of subjects. Recently, it was shown that the Beacon Project of the Global Alliance for Genomics and Health, a web service for querying about the presence (or absence) of a specific allele, was vulnerable. The Integrating Data for Analysis, Anonymization, and Sharing (iDASH) Center modeled a track in their third Privacy Protection Challenge on how to mitigate the Beacon vulnerability. We developed the winning solution for this track.
METHODS - This paper describes our computational method to optimize the tradeoff between the utility and the privacy of the Beacon service. We generalize the genomic data sharing problem beyond that which was introduced in the iDASH Challenge to be more representative of real world scenarios to allow for a more comprehensive evaluation. We then conduct a sensitivity analysis of our method with respect to several state-of-the-art methods using a dataset of 400,000 positions in Chromosome 10 for 500 individuals from Phase 3 of the 1000 Genomes Project. All methods are evaluated for utility, privacy and efficiency.
RESULTS - Our method achieves better performance than all state-of-the-art methods, irrespective of how key factors (e.g., the allele frequency in the population, the size of the pool and utility weights) change from the original parameters of the problem. We further illustrate that it is possible for our method to exhibit subpar performance under special cases of allele query sequences. However, we show our method can be extended to address this issue when the query sequence is fixed and known a priori to the data custodian, so that they may stage their responses accordingly.
CONCLUSIONS - This research shows that it is possible to thwart the attack on Beacon services, without substantially altering the utility of the system, using computational methods. The method we initially developed is limited by the design of the scenario and evaluation protocol for the iDASH Challenge; however, it can be improved by allowing the data custodian to act in a staged manner.
Towards a privacy preserving cohort discovery framework for clinical research networks.
Yuan J, Malin B, Modave F, Guo Y, Hogan WR, Shenkman E, Bian J
(2017) J Biomed Inform 66: 42-51
MeSH Terms: Computer Security, Confidentiality, Electronic Health Records, Female, Health Insurance Portability and Accountability Act, Humans, United States
Added April 10, 2018
BACKGROUND - The last few years have witnessed an increasing number of clinical research networks (CRNs) focused on building large collections of data from electronic health records (EHRs), claims, and patient-reported outcomes (PROs). Many of these CRNs provide a service for the discovery of research cohorts with various health conditions, which is especially useful for rare diseases. Supporting patient privacy can enhance the scalability and efficiency of such processes; however, current practice mainly relies on policy, such as guidelines defined in the Health Insurance Portability and Accountability Act (HIPAA), which are insufficient for CRNs (e.g., HIPAA does not require encryption of data - which can mitigate insider threats). By combining policy with privacy enhancing technologies we can enhance the trustworthiness of CRNs. The goal of this research is to determine if searchable encryption can instill privacy in CRNs without sacrificing their usability.
METHODS - We developed a technique, implemented in working software, to enable privacy-preserving cohort discovery (PPCD) services in large distributed CRNs based on elliptic curve cryptography (ECC). This technique also incorporates a block indexing strategy to improve the performance (in terms of computational running time) of PPCD. We evaluated the PPCD service with three real cohort definitions of varied query complexity: (1) elderly cervical cancer patients who underwent radical hysterectomy, (2) oropharyngeal and tongue cancer patients who underwent robotic transoral surgery, and (3) female breast cancer patients who underwent mastectomy. These definitions were tested in an encrypted database of 7.1 million records derived from the publicly available Healthcare Cost and Utilization Project (HCUP) Nationwide Inpatient Sample (NIS). We assessed the performance of the PPCD service in terms of (1) accuracy in cohort discovery, (2) computational running time, and (3) privacy afforded to the underlying records during PPCD.
RESULTS - The empirical results indicate that the proposed PPCD can execute cohort discovery queries in a reasonable amount of time, with query runtime in the range of 165-262s for the 3 use cases, with zero compromise in accuracy. We further show that the search performance is practical because it supports a highly parallelized design for secure evaluation over encrypted records. Additionally, our security analysis shows that the proposed construction is resilient to standard adversaries.
CONCLUSIONS - PPCD services can be designed for clinical research networks. The security construction presented in this work specifically achieves high privacy guarantees by preventing threats originating both from within and from beyond the network.
Copyright © 2016 Elsevier Inc. All rights reserved.
Protecting genomic data analytics in the cloud: state of the art and opportunities.
Tang H, Jiang X, Wang X, Wang S, Sofia H, Fox D, Lauter K, Malin B, Telenti A, Xiong L, Ohno-Machado L
(2016) BMC Med Genomics 9: 63
MeSH Terms: Cloud Computing, Computer Security, Genome-Wide Association Study, Genomics
Added April 10, 2018
The outsourcing of genomic data into public cloud computing settings raises concerns over privacy and security. Significant advancements in secure computation methods have emerged over the past several years, but such techniques need to be rigorously evaluated for their ability to support the analysis of human genomic data in an efficient and cost-effective manner. With respect to public cloud environments, there are concerns about the inadvertent exposure of human genomic data to unauthorized users. In analyses involving multiple institutions, there is additional concern about data being used beyond the agreed research scope and being processed in untrusted computational environments, which may not satisfy institutional policies. To systematically investigate these issues, the NIH-funded National Center for Biomedical Computing iDASH (integrating Data for Analysis, 'anonymization' and SHaring) hosted the second Critical Assessment of Data Privacy and Protection competition to assess the capacity of cryptographic technologies for protecting computation over human genomes in the cloud and promoting cross-institutional collaboration. Data scientists were challenged to design and engineer practical algorithms for secure outsourcing of genome computation tasks in working software, whereby analyses are performed only on encrypted data. They were also challenged to develop approaches to enable secure collaboration on data from genomic studies generated by multiple organizations (e.g., medical centers) to jointly compute aggregate statistics without sharing individual-level records. The results of the competition indicated that secure computation techniques can enable comparative analysis of human genomes, but greater efficiency (in terms of compute time and memory utilization) is needed before they are sufficiently practical for real world environments.
Growth of Secure Messaging Through a Patient Portal as a Form of Outpatient Interaction across Clinical Specialties.
Cronin RM, Davis SE, Shenson JA, Chen Q, Rosenbloom ST, Jackson GP
(2015) Appl Clin Inform 6: 288-304
MeSH Terms: Adult, Child, Cohort Studies, Computer Security, Delivery of Health Care, Electronic Health Records, Electronic Mail, Female, Humans, Male, Medicine, Middle Aged, Outpatients, Retrospective Studies
Added August 25, 2015
OBJECTIVE - Patient portals are online applications that allow patients to interact with healthcare organizations. Portal adoption is increasing, and secure messaging between patients and healthcare providers is an emerging form of outpatient interaction. Research about portals and messaging has focused on medical specialties. We characterized adoption of secure messaging and the contribution of messaging to outpatient interactions across diverse clinical specialties after broad portal deployment.
METHODS - This retrospective cohort study at Vanderbilt University Medical Center examined use of patient-initiated secure messages and clinic visits in the three years following full deployment of a patient portal across adult and pediatric specialties. We measured the proportion of outpatient interactions (i.e., messages plus clinic visits) conducted through secure messaging by specialty over time. Generalized estimating equations measured the likelihood of message-based versus clinic outpatient interaction across clinical specialties.
RESULTS - Over the study period, 2,422,114 clinic visits occurred, and 82,159 unique portal users initiated 948,428 messages to 1,924 recipients. Medicine participated in the most message exchanges (742,454 messages; 78.3% of all messages sent), followed by surgery (84,001; 8.9%) and obstetrics/gynecology (53,424; 5.6%). The proportion of outpatient interaction through messaging increased from 12.9% in 2008 to 33.0% in 2009 and 39.8% in 2010 (p<0.001). Medicine had the highest proportion of outpatient interaction conducted through messaging in 2008 (23.3% of outpatient interactions in medicine). By 2010, this proportion was highest for obstetrics/gynecology (83.4%), dermatology (71.6%), and medicine (56.7%). Growth in likelihood of message-based interaction was greater for anesthesiology, dermatology, obstetrics/gynecology, pediatrics, and psychiatry than for medicine (p<0.001).
CONCLUSIONS - This study demonstrates rapid adoption of secure messaging across diverse clinical specialties, with messaging interactions exceeding face-to-face clinic visits for some specialties. As patient portal and secure messaging adoption increase beyond medicine and primary care, research is needed to understand the implications for provider workload and patient care.
Design and implementation of a privacy preserving electronic health record linkage tool in Chicago.
Kho AN, Cashy JP, Jackson KL, Pah AR, Goel S, Boehnke J, Humphries JE, Kominers SD, Hota BN, Sims SA, Malin BA, French DD, Walunas TL, Meltzer DO, Kaleba EO, Jones RC, Galanter WL
(2015) J Am Med Inform Assoc 22: 1072-80
MeSH Terms: Chicago, Computer Security, Confidentiality, Electronic Health Records, Health Information Exchange, Health Insurance Portability and Accountability Act, Humans, Medical Record Linkage, Software, United States
Added April 10, 2018
OBJECTIVE - To design and implement a tool that creates a secure, privacy preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research.
METHODS - The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a Health Insurance Portability and Accountability Act compliant SHA-512 algorithm that minimizes re-identification risk. The authors subsequently linked individual records using a central honest broker with an algorithm that assigns weights to hash combinations in order to generate high specificity matches.
RESULTS - The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11 292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%, with a majority of the missed matches accounted for by patients with both a missing social security number and last name change. Using 3 disease examples, it is demonstrated that the software can reduce duplication of patient records across sites by as much as 28%.
CONCLUSIONS - Software that standardizes the assignment of a unique seeded hash identifier merged through an agreed upon third-party honest broker can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data given that patients may make use of multiple healthcare systems.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
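The seeded hashing step described in the METHODS of this record can be sketched roughly as follows. This is a minimal illustration and not the authors' software: the field names, the normalization rules, the way the seed is prepended, and the two identifier combinations are all assumptions made here. Only the idea of seeded SHA-512 hashes over combinations of cleaned patient identifiers comes from the abstract.

```python
import hashlib

# Hypothetical identifier combinations; the abstract does not list the real ones.
COMBINATIONS = [
    ("first_name", "last_name", "dob"),
    ("last_name", "dob", "ssn"),
]


def normalize(value: str) -> str:
    """Assumed cleaning step: trim, lowercase, keep only letters and digits."""
    return "".join(ch for ch in value.strip().lower() if ch.isalnum())


def seeded_hashes(record: dict, seed: str) -> list:
    """Return one seeded SHA-512 hex digest per identifier combination."""
    digests = []
    for fields in COMBINATIONS:
        joined = "|".join(normalize(record[f]) for f in fields)
        payload = (seed + "|" + joined).encode("utf-8")
        digests.append(hashlib.sha512(payload).hexdigest())
    return digests


# Two made-up records for the same person, formatted differently at each site.
site_a = {"first_name": "Ann ", "last_name": "O'Neil",
          "dob": "1970-01-02", "ssn": "123-45-6789"}
site_b = {"first_name": "ann", "last_name": "ONeil",
          "dob": "19700102", "ssn": "123456789"}

# With a shared seed, both sites produce identical digests, so a third party
# can compare digests without seeing the identifiers themselves.
same_person = seeded_hashes(site_a, "shared-seed") == seeded_hashes(site_b, "shared-seed")
```

In this sketch the sites must agree on the seed and on the cleaning rules, since any difference in either changes every digest. The weighted matching of hash combinations performed by the honest broker is not shown.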
SOEMPI: A Secure Open Enterprise Master Patient Index Software Toolkit for Private Record Linkage.
Toth C, Durham E, Kantarcioglu M, Xue Y, Malin B
(2014) AMIA Annu Symp Proc 2014: 1105-14
MeSH Terms: Algorithms, Computer Security, Medical Record Linkage, Ownership, Software
Added April 10, 2018
To mitigate bias in multi-institutional research studies, healthcare organizations need to integrate patient records. However, this process must be accomplished without disclosing the identities of the corresponding patients. Various private record linkage (PRL) techniques have been proposed, but there is a lack of translation into practice because no software suite supports the entire PRL lifecycle. This paper addresses this issue with the introduction of the Secure Open Enterprise Master Patient Index (SOEMPI). We show how SOEMPI covers the PRL lifecycle, illustrate the implementation of several PRL protocols, and provide a runtime analysis for the integration of two datasets consisting of 10,000 records. While the PRL process is slower than a non-secure setting, our analysis shows the majority of processes in a PRL protocol require several seconds or less and that SOEMPI completes the process in approximately two minutes, which is a practical amount of time for integration.
R-U policy frontiers for health data de-identification.
Xia W, Heatherly R, Ding X, Li J, Malin BA
(2015) J Am Med Inform Assoc 22: 1029-41
MeSH Terms: Algorithms, Computer Security, Confidentiality, Datasets as Topic, Demography, Health Insurance Portability and Accountability Act, Humans, United States
Added April 10, 2018
OBJECTIVE - The Health Insurance Portability and Accountability Act Privacy Rule enables healthcare organizations to share de-identified data via two routes. They can either 1) show re-identification risk is small (e.g., via a formal model, such as k-anonymity) with respect to an anticipated recipient or 2) apply a rule-based policy (i.e., Safe Harbor) that enumerates attributes to be altered (e.g., dates to years). The latter is often invoked because it is interpretable, but it fails to tailor protections to the capabilities of the recipient. The paper shows rule-based policies can be mapped to a utility (U) and re-identification risk (R) space, which can be searched for a collection, or frontier, of policies that systematically trade off between these goals.
METHODS - We extend an algorithm to efficiently compose an R-U frontier using a lattice of policy options. Risk is proportional to the number of patients to which a record corresponds, while utility is proportional to similarity of the original and de-identified distribution. We allow our method to search 20 000 rule-based policies (out of 2^700) and compare the resulting frontier with k-anonymous solutions and Safe Harbor using the demographics of 10 U.S. states.
RESULTS - The results demonstrate that 1) the rule-based frontier consists, on average, of 5000 policies, 2% of which enable better utility with less risk than Safe Harbor, and 2) the policies cover a broader spectrum of utility and risk than k-anonymity frontiers.
CONCLUSIONS - R-U frontiers of de-identification policies can be discovered efficiently, allowing healthcare organizations to tailor protections to anticipated needs and trustworthiness of recipients.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
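The notion of a frontier of policies that trade off risk against utility can be illustrated with a small sketch. It is a brute-force filter over already-scored policies, not the lattice search the abstract describes, and the policy names and (risk, utility) values are made up for illustration. A policy is kept when no other policy has risk at most as high and utility at least as high, with a strict improvement in one of the two.

```python
from typing import List, Tuple

# (name, risk, utility); the values used below are made up.
Policy = Tuple[str, float, float]


def ru_frontier(policies: List[Policy]) -> List[Policy]:
    """Keep policies that no other policy beats on both risk and utility."""
    frontier = []
    for name, risk, util in policies:
        dominated = any(
            r <= risk and u >= util and (r < risk or u > util)
            for _, r, u in policies
        )
        if not dominated:
            frontier.append((name, risk, util))
    # Sort by increasing risk so the trade-off is easy to read.
    return sorted(frontier, key=lambda p: p[1])


candidates = [
    ("A", 0.10, 0.40),
    ("B", 0.20, 0.70),
    ("C", 0.25, 0.60),  # dominated by B: higher risk and lower utility
    ("D", 0.40, 0.90),
]

frontier_names = [name for name, _, _ in ru_frontier(candidates)]
```

The pairwise comparison is quadratic in the number of policies, which is fine for a handful of candidates but would not scale to the policy space discussed in the abstract.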
Location bias of identifiers in clinical narratives.
Hanauer DA, Mei Q, Malin B, Zheng K
(2013) AMIA Annu Symp Proc 2013: 560-9
MeSH Terms: Computer Security, Confidentiality, Electronic Health Records, Health Insurance Portability and Accountability Act, Humans, Medical Records Systems, Computerized, Narration, United States
Added April 10, 2018
Scrubbing identifying information from narrative clinical documents is a critical first step to preparing the data for secondary use purposes, such as translational research. Evidence suggests that the differential distribution of protected health information (PHI) in clinical documents could be used as additional features to improve the performance of automated de-identification algorithms or toolkits. However, there has been little investigation into the extent to which such phenomena transpire in practice. To empirically assess this issue, we identified the location of PHI in 140,000 clinical notes from an electronic health record system and characterized the distribution as a function of location in a document. In addition, we calculated the 'word proximity' of nearby PHI elements to determine their co-occurrence rates. The PHI elements were found to have non-random distribution patterns. Location within a document and proximity between PHI elements might therefore be used to help de-identification systems better label PHI.
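One way to characterize PHI distribution as a function of location in a document is to bin each PHI occurrence by its relative position. The sketch below is an assumption about how such a tally could be done, not the method used in the paper: the input format (document length plus token offsets of PHI) and the ten equal-width bins are hypothetical choices.

```python
from collections import Counter


def position_histogram(docs, bins=10):
    """Count PHI occurrences by relative position (bin 0 = start of document).

    docs is a list of (length_in_tokens, [phi_token_offsets]) pairs.
    """
    counts = Counter()
    for length, phi_offsets in docs:
        for offset in phi_offsets:
            rel = offset / length  # relative position in [0, 1)
            counts[min(int(rel * bins), bins - 1)] += 1
    return [counts[b] for b in range(bins)]


# Two made-up notes with PHI clustered near the beginning and the end.
notes = [
    (100, [0, 5, 95]),
    (200, [10, 190, 199]),
]

histogram = position_histogram(notes)
```

A histogram that is far from flat, as in this toy input, is the kind of non-random pattern that could serve as a positional feature for a de-identification system.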
Ethical, legal, and social implications of incorporating genomic information into electronic health records.
Hazin R, Brothers KB, Malin BA, Koenig BA, Sanderson SC, Rothstein MA, Williams MS, Clayton EW, Kullo IJ
(2013) Genet Med 15: 810-6
MeSH Terms: Computer Security, Confidentiality, Decision Support Systems, Clinical, Electronic Health Records, Genetic Privacy, Genomics, Health Literacy, Health Records, Personal, Humans, Incidental Findings, Patient Access to Records, Precision Medicine
Added April 10, 2018
The inclusion of genomic data in the electronic health record raises important ethical, legal, and social issues. In this article, we highlight these challenges and discuss potential solutions. We provide a brief background on the current state of electronic health records in the context of genomic medicine, discuss the importance of equitable access to genome-enabled electronic health records, and consider the potential use of electronic health records for improving genomic literacy in patients and providers. We highlight the importance of privacy, access, and security, and of determining which genomic information is included in the electronic health record. Finally, we discuss the challenges of reporting incidental findings, storing and reinterpreting genomic data, and nondocumentation and duty to warn family members at potential genetic risk.
Auditing medical records accesses via healthcare interaction networks.
Chen Y, Nyemba S, Malin B
(2012) AMIA Annu Symp Proc 2012: 93-102
MeSH Terms: Computer Security, Confidentiality, Health Facility Administration, Humans, Interprofessional Relations, Medical Audit, Medical Records Systems, Computerized, Models, Organizational
Added March 29, 2013
Healthcare organizations are deploying increasingly complex clinical information systems to support patient care. Traditional information security practices (e.g., role-based access control) are embedded in enterprise-level systems, but are insufficient to ensure patient privacy. This is due, in part, to the dynamic nature of healthcare, which makes it difficult to predict which care providers need access to what and when. In this paper, we show that operations modeled at a higher level of granularity (e.g., the departmental level) are stable in the context of a relational network, which may enable more effective auditing strategies. We study three months of access logs from a large academic medical center to illustrate that departmental interaction networks exhibit certain invariants, such as the number, strength, and reciprocity of relationships. We further show that the relations extracted from the network can be leveraged to assess the extent to which a patient's care satisfies expected organizational behavior.