About this data

The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.

If you have any questions or comments, please contact us.

Results: 1 to 10 of 25

Publication Record

Connections

Rule-based and machine learning algorithms identify patients with systemic sclerosis accurately in the electronic health record.
Jamian L, Wheless L, Crofford LJ, Barnado A
(2019) Arthritis Res Ther 21: 305
MeSH Terms: Adult, Aged, Aged, 80 and over, Algorithms, Databases, Factual, Electronic Health Records, Female, Humans, International Classification of Diseases, Machine Learning, Male, Middle Aged, Reproducibility of Results, Scleroderma, Systemic, Sensitivity and Specificity
Added March 25, 2020
BACKGROUND - Systemic sclerosis (SSc) is a rare disease with studies limited by small sample sizes. Electronic health records (EHRs) represent a powerful tool to study patients with rare diseases such as SSc, but validated methods are needed. We developed and validated EHR-based algorithms that incorporate billing codes and clinical data to identify SSc patients in the EHR.
METHODS - We used a de-identified EHR with over 3 million subjects and identified 1899 potential SSc subjects with at least 1 count of the SSc ICD-9 (710.1) or ICD-10-CM (M34*) codes. We randomly selected 200 as a training set for chart review. A subject was a case if diagnosed with SSc by a rheumatologist, dermatologist, or pulmonologist. We selected the following algorithm components based on clinical knowledge and available data: SSc ICD-9 and ICD-10-CM codes, positive antinuclear antibody (ANA) (titer ≥ 1:80), and a keyword of Raynaud's phenomenon (RP). We performed both rule-based and machine learning techniques for algorithm development. Positive predictive values (PPVs), sensitivities, and F-scores (which account for PPVs and sensitivities) were calculated for the algorithms.
RESULTS - PPVs were low for algorithms using only 1 count of the SSc ICD-9 code. As code counts increased, the PPVs increased. PPVs were higher for algorithms using ICD-10-CM codes versus the ICD-9 code. Adding a positive ANA and RP keyword increased the PPVs of algorithms only using ICD billing codes. Algorithms using ≥ 3 or ≥ 4 counts of the SSc ICD-9 or ICD-10-CM codes and ANA positivity had the highest PPV at 100% but a low sensitivity at 50%. The algorithm with the highest F-score of 91% was ≥ 4 counts of the ICD-9 or ICD-10-CM codes with an internally validated PPV of 90%. A machine learning method using random forests yielded an algorithm with a PPV of 84%, sensitivity of 92%, and F-score of 88%. The most important feature was RP keyword.
CONCLUSIONS - Algorithms using only ICD-9 codes did not perform well to identify SSc patients. The highest performing algorithms incorporated clinical data with billing codes. EHR-based algorithms can identify SSc patients across a healthcare system, enabling researchers to examine important outcomes.
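The F-score reported above combines PPV and sensitivity into one number. A minimal sketch, assuming the F-score is the usual harmonic mean of the two (the abstract does not state the formula); the function name `f_score` is illustrative:

```python
def f_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Random forest figures reported in the abstract: PPV 84%, sensitivity 92%.
rf_f = f_score(0.84, 0.92)
print(f"{rf_f:.2f}")  # 0.88
```

Under that assumption the result agrees with the reported random forest F-score of 88%.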
0 Communities · 1 Member · 0 Resources · 15 MeSH Terms
Validation of discharge diagnosis codes to identify serious infections among middle age and older adults.
Wiese AD, Griffin MR, Stein CM, Schaffner W, Greevy RA, Mitchel EF, Grijalva CG
(2018) BMJ Open 8: e020857
MeSH Terms: Aged, Aged, 80 and over, Algorithms, Clinical Coding, Female, Humans, Infections, International Classification of Diseases, Male, Medicaid, Medical Records, Middle Aged, Patient Discharge, Predictive Value of Tests, Reproducibility of Results, Retrospective Studies, Tennessee, United States
Added July 27, 2018
OBJECTIVES - Hospitalisations for serious infections are common among middle age and older adults and frequently used as study outcomes. Yet, few studies have evaluated the performance of diagnosis codes to identify serious infections in this population. We sought to determine the positive predictive value (PPV) of diagnosis codes for identifying hospitalisations due to serious infections among middle age and older adults.
SETTING AND PARTICIPANTS - We identified hospitalisations for possible infection among adults >=50 years enrolled in the Tennessee Medicaid healthcare programme (2008-2012) using International Classifications of Diseases, Ninth Revision diagnosis codes for pneumonia, meningitis/encephalitis, bacteraemia/sepsis, cellulitis/soft-tissue infections, endocarditis, pyelonephritis and septic arthritis/osteomyelitis.
DESIGN - Medical records were systematically obtained from hospitals randomly selected from a stratified sampling framework based on geographical region and hospital discharge volume.
MEASURES - Two trained clinical reviewers used a standardised extraction form to abstract information from medical records. Predefined algorithms served as reference to adjudicate confirmed infection-specific hospitalisations. We calculated the PPV of diagnosis codes using confirmed hospitalisations as reference. Sensitivity analyses determined the robustness of the PPV to definitions that required radiological or microbiological confirmation. We also determined inter-rater reliability between reviewers.
RESULTS - The PPV of diagnosis codes for hospitalisations for infection (n=716) was 90.2% (95% CI 87.8% to 92.2%). The PPV was highest for pneumonia (96.5% (95% CI 93.9% to 98.0%)) and cellulitis (91.1% (95% CI 84.7% to 94.9%)), and lowest for meningitis/encephalitis (50.0% (95% CI 23.7% to 76.3%)). The adjudication reliability was excellent (92.7% agreement; first agreement coefficient: 0.91). The overall PPV was lower when requiring microbiological confirmation (45%) and when requiring radiological confirmation for pneumonia (79%).
CONCLUSIONS - Discharge diagnosis codes have a high PPV for identifying hospitalisations for common, serious infections among middle age and older adults. PPV estimates for rare infections were imprecise.
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
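A PPV with a 95% confidence interval, as reported in the results above, can be sketched as follows. The abstract does not name the interval method, so the Wilson score interval used here is an assumption, and the count of 646 confirmed hospitalisations is back-calculated from the reported 90.2% of n=716 rather than stated in the source:

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# ASSUMPTION: 646 of 716 is inferred from the reported PPV, not given in the abstract.
low, high = wilson_interval(646, 716)
print(f"PPV {646 / 716:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With these assumed counts the interval comes out close to the reported 87.8% to 92.2%.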
0 Communities · 1 Member · 0 Resources · 18 MeSH Terms
Validation of an algorithm to identify heart failure hospitalisations in patients with diabetes within the veterans health administration.
Presley CA, Min JY, Chipman J, Greevy RA, Grijalva CG, Griffin MR, Roumie CL
(2018) BMJ Open 8: e020455
MeSH Terms: Adult, Aged, Algorithms, Diabetes Complications, Diabetes Mellitus, Diagnosis-Related Groups, Female, Heart Failure, Hospitalization, Humans, International Classification of Diseases, Male, Middle Aged, Predictive Value of Tests, Sensitivity and Specificity, Veterans
Added July 27, 2018
OBJECTIVES - We aimed to validate an algorithm using both primary discharge diagnosis (International Classification of Diseases Ninth Revision (ICD-9)) and diagnosis-related group (DRG) codes to identify hospitalisations due to decompensated heart failure (HF) in a population of patients with diabetes within the Veterans Health Administration (VHA) system.
DESIGN - Validation study.
SETTING - Veterans Health Administration-Tennessee Valley Healthcare System.
PARTICIPANTS - We identified and reviewed a stratified, random sample of hospitalisations between 2001 and 2012 within a single VHA healthcare system of adults who received regular VHA care and were initiated on an antidiabetic medication between 2001 and 2008. We sampled 500 hospitalisations: 400 that fulfilled algorithm criteria and 100 that did not. Of these, 497 had adequate information for inclusion. The mean patient age was 66.1 years (SD 11.4). The majority of patients were male (98.8%); 75% were white and 20% were black.
PRIMARY AND SECONDARY OUTCOME MEASURES - To determine if a hospitalisation was due to HF, we performed chart abstraction using Framingham criteria as the referent standard. We calculated the positive predictive value (PPV), negative predictive value (NPV), sensitivity and specificity for the overall algorithm and each component (primary diagnosis code (ICD-9), DRG code or both).
RESULTS - The algorithm had a PPV of 89.7% (95% CI 86.8 to 92.7), NPV of 93.9% (89.1 to 98.6), sensitivity of 45.1% (25.1 to 65.1) and specificity of 99.4% (99.2 to 99.6). The PPV was highest for hospitalisations that fulfilled both the ICD-9 and DRG algorithm criteria (92.1% (89.1 to 95.1)) and lowest for hospitalisations that fulfilled only DRG algorithm criteria (62.5% (28.4 to 96.6)).
CONCLUSIONS - Our algorithm, which included primary discharge diagnosis and DRG codes, demonstrated excellent PPV for identification of hospitalisations due to decompensated HF among patients with diabetes in the VHA system.
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
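Because the sample above was stratified by algorithm result (400 hospitalisations meeting the criteria, 100 not), PPV and NPV can be read directly from each stratum, while sensitivity and specificity depend on how often the algorithm flags a hospitalisation in the full population. A hedged sketch of that relationship using Bayes' rule; the function name and the 5% flagged fraction are hypothetical and are not taken from the study:

```python
def sens_spec_from_predictive_values(ppv: float, npv: float, flagged_fraction: float):
    """Recover sensitivity and specificity from PPV, NPV and the share of
    all hospitalisations that meet the algorithm criteria."""
    if not 0 < flagged_fraction < 1:
        raise ValueError("flagged_fraction must be strictly between 0 and 1")
    q = flagged_fraction
    true_pos = ppv * q                # flagged and truly HF
    false_pos = (1 - ppv) * q         # flagged but not HF
    true_neg = npv * (1 - q)          # not flagged and not HF
    false_neg = (1 - npv) * (1 - q)   # not flagged but truly HF
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# PPV and NPV are from the abstract; the flagged fraction (5%) is a made-up value.
sens, spec = sens_spec_from_predictive_values(0.897, 0.939, 0.05)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The sketch illustrates how an algorithm can pair a high PPV and NPV with a modest sensitivity when only a small share of hospitalisations is flagged.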
0 Communities · 1 Member · 0 Resources · 16 MeSH Terms
The time has come for dimensional personality disorder diagnosis.
Hopwood CJ, Kotov R, Krueger RF, Watson D, Widiger TA, Althoff RR, Ansell EB, Bach B, Michael Bagby R, Blais MA, Bornovalova MA, Chmielewski M, Cicero DC, Conway C, De Clercq B, De Fruyt F, Docherty AR, Eaton NR, Edens JF, Forbes MK, Forbush KT, Hengartner MP, Ivanova MY, Leising D, John Livesley W, Lukowitsky MR, Lynam DR, Markon KE, Miller JD, Morey LC, Mullins-Sweatt SN, Hans Ormel J, Patrick CJ, Pincus AL, Ruggero C, Samuel DB, Sellbom M, Slade T, Tackett JL, Thomas KM, Trull TJ, Vachon DD, Waldman ID, Waszczuk MA, Waugh MH, Wright AGC, Yalch MM, Zald DH, Zimmermann J
(2018) Personal Ment Health 12: 82-86
MeSH Terms: Humans, International Classification of Diseases, Personality Disorders
Added March 21, 2018
0 Communities · 1 Member · 0 Resources · 3 MeSH Terms
Developing Electronic Health Record Algorithms That Accurately Identify Patients With Systemic Lupus Erythematosus.
Barnado A, Casey C, Carroll RJ, Wheless L, Denny JC, Crofford LJ
(2017) Arthritis Care Res (Hoboken) 69: 687-693
MeSH Terms: Adult, Aged, Algorithms, Antibodies, Antinuclear, Antirheumatic Agents, Electronic Health Records, Female, Humans, International Classification of Diseases, Lupus Erythematosus, Systemic, Male, Middle Aged, Predictive Value of Tests, Reproducibility of Results, Sensitivity and Specificity
Added March 14, 2018
OBJECTIVE - To study systemic lupus erythematosus (SLE) in the electronic health record (EHR), we must accurately identify patients with SLE. Our objective was to develop and validate novel EHR algorithms that use International Classification of Diseases, Ninth Revision (ICD-9), Clinical Modification codes, laboratory testing, and medications to identify SLE patients.
METHODS - We used Vanderbilt's Synthetic Derivative, a de-identified version of the EHR, with 2.5 million subjects. We selected all individuals with at least 1 SLE ICD-9 code (710.0), yielding 5,959 individuals. To create a training set, 200 subjects were randomly selected for chart review. A subject was defined as a case if diagnosed with SLE by a rheumatologist, nephrologist, or dermatologist. Positive predictive values (PPVs) and sensitivity were calculated for combinations of code counts of the SLE ICD-9 code, a positive antinuclear antibody (ANA), ever use of medications, and a keyword of "lupus" in the problem list. The algorithms with the highest PPV were each internally validated using a random set of 100 individuals from the remaining 5,759 subjects.
RESULTS - The algorithm with the highest PPV at 95% in the training set and 91% in the validation set was 3 or more counts of the SLE ICD-9 code, ANA positive (≥1:40), and ever use of both disease-modifying antirheumatic drugs and steroids, while excluding individuals with systemic sclerosis and dermatomyositis ICD-9 codes.
CONCLUSION - We developed and validated the first EHR algorithm that incorporates laboratory values and medications with the SLE ICD-9 code to identify patients with SLE accurately.
© 2016, American College of Rheumatology.
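The highest-PPV rule described in the results above can be written as a simple predicate. This is an illustrative sketch only: the `PatientSummary` record and its field names are hypothetical, and the study's actual data model is not described in the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientSummary:
    """Hypothetical per-patient summary; field names are assumptions."""
    sle_code_count: int                    # occurrences of the SLE ICD-9 code (710.0)
    ana_titer_denominator: Optional[int]   # 40 means a titer of 1:40; None if never tested
    ever_dmard: bool                       # ever used a disease-modifying antirheumatic drug
    ever_steroid: bool                     # ever used steroids
    has_ssc_code: bool                     # any systemic sclerosis ICD-9 code
    has_dermatomyositis_code: bool         # any dermatomyositis ICD-9 code


def meets_sle_rule(p: PatientSummary) -> bool:
    """3+ SLE code counts, ANA >= 1:40, DMARD and steroid use, no exclusion codes."""
    ana_positive = p.ana_titer_denominator is not None and p.ana_titer_denominator >= 40
    return (
        p.sle_code_count >= 3
        and ana_positive
        and p.ever_dmard
        and p.ever_steroid
        and not p.has_ssc_code
        and not p.has_dermatomyositis_code
    )


example = PatientSummary(4, 160, True, True, False, False)
print(meets_sle_rule(example))  # True
```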
0 Communities · 2 Members · 0 Resources · 15 MeSH Terms
PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability.
Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, Ellis SB, Lingren T, Thompson WK, Savova G, Haines J, Roden DM, Harris PA, Denny JC
(2016) J Am Med Inform Assoc 23: 1046-1052
MeSH Terms: Algorithms, Data Mining, Electronic Health Records, Genomics, Humans, International Classification of Diseases, Knowledge Bases, Natural Language Processing, Phenotype
Added March 14, 2018
OBJECTIVE - Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems.
MATERIALS AND METHODS - We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequent components used in algorithms and their performance at authoring institutions and secondary implementation sites.
RESULTS - As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development spanning a range of traits and diseases. Phenotypes have had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Disease codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%).
DISCUSSION - These results demonstrate that a broad range of algorithms to mine electronic health record data from different health systems can be developed with high PPV, and algorithms developed at one site are generally transportable to others.
CONCLUSION - By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.
© The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
0 Communities · 2 Members · 0 Resources · 9 MeSH Terms
A multi-institution evaluation of clinical profile anonymization.
Heatherly R, Rasmussen LV, Peissig PL, Pacheco JA, Harris P, Denny JC, Malin BA
(2016) J Am Med Inform Assoc 23: e131-7
MeSH Terms: Confidentiality, Data Anonymization, Electronic Health Records, Humans, Hypothyroidism, Information Dissemination, International Classification of Diseases, Organizational Case Studies
Added March 14, 2018
BACKGROUND AND OBJECTIVE - There is an increasing desire to share de-identified electronic health records (EHRs) for secondary uses, but there are concerns that clinical terms can be exploited to compromise patient identities. Anonymization algorithms mitigate such threats while enabling novel discoveries, but their evaluation has been limited to single institutions. Here, we study how an existing clinical profile anonymization fares at multiple medical centers.
METHODS - We apply a state-of-the-art k-anonymization algorithm, with k set to the standard value 5, to the International Classification of Disease, ninth edition codes for patients in a hypothyroidism association study at three medical centers: Marshfield Clinic, Northwestern University, and Vanderbilt University. We assess utility when anonymizing at three population levels: all patients in 1) the EHR system; 2) the biorepository; and 3) a hypothyroidism study. We evaluate utility using 1) changes to the number included in the dataset, 2) number of codes included, and 3) the regions where generalization and suppression were required.
RESULTS - Our findings yield several notable results. First, we show that anonymizing in the context of the entire EHR yields a significantly greater quantity of data by reducing the amount of generalized regions from ∼15% to ∼0.5%. Second, ∼70% of codes that needed generalization only generalized two or three codes in the largest anonymization.
CONCLUSIONS - Sharing large volumes of clinical data in support of phenome-wide association studies is possible while safeguarding privacy to the underlying individuals.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
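The k-anonymity property referred to in the methods above requires every released clinical profile to be shared by at least k patients. The sketch below only checks that property; it is not the anonymization algorithm used in the study, which also generalizes and suppresses codes. The function name and the placeholder codes "A" and "B" are hypothetical.

```python
from collections import Counter
from typing import Iterable


def is_k_anonymous(profiles: Iterable[Iterable[str]], k: int = 5) -> bool:
    """Return True if every distinct code set occurs at least k times."""
    counts = Counter(frozenset(profile) for profile in profiles)
    return all(count >= k for count in counts.values())


# Placeholder codes, not real ICD-9 codes.
shared = [{"A", "B"}] * 5
print(is_k_anonymous(shared, k=5))             # True
print(is_k_anonymous(shared + [{"A"}], k=5))   # False: the profile {"A"} is unique
```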
0 Communities · 2 Members · 0 Resources · 8 MeSH Terms
Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance.
Wei WQ, Teixeira PL, Mo H, Cronin RM, Warner JL, Denny JC
(2016) J Am Med Inform Assoc 23: e20-7
MeSH Terms: Algorithms, Diagnosis, Electronic Health Records, Humans, International Classification of Diseases, Medical Records, Problem-Oriented, Phenotype, Predictive Value of Tests
Added March 14, 2018
OBJECTIVE - To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Disease (ICD) diagnosis codes, primary notes, and specific medications.
MATERIALS AND METHODS - We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer's disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson's disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (a total number of 175 patients for each disease, 1750 patients for all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination.
RESULTS - The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06-0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed a more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of using ICD codes (0.67 ± 0.14) was only slightly lower than using two or more components, its PPV (0.71 ± 0.13) was substantially worse than that of two or more components (0.91 ± 0.08).
CONCLUSION - Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
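The sampling arithmetic in the methods above can be reproduced with a short sketch. It assumes the seven categories are the non-empty combinations of the three EHR components, which the abstract implies but does not enumerate:

```python
from itertools import combinations

COMPONENTS = ("ICD codes", "primary notes", "medications")

# Every non-empty subset of the three components: 3 singles + 3 pairs + 1 triple.
categories = [
    combo
    for size in range(1, len(COMPONENTS) + 1)
    for combo in combinations(COMPONENTS, size)
]

patients_per_disease = 25 * len(categories)   # 25 charts per category
total_patients = patients_per_disease * 10    # ten preselected diseases

print(len(categories), patients_per_disease, total_patients)  # 7 175 1750
```

The totals agree with the 175 patients per disease and 1750 patients overall stated in the abstract.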
0 Communities · 1 Member · 0 Resources · 8 MeSH Terms
ICD-10-CM Crosswalks in the primary care setting: assessing reliability of the GEMs and reimbursement mappings.
Turer RW, Zuckowsky TD, Causey HJ, Rosenbloom ST
(2015) J Am Med Inform Assoc 22: 417-25
MeSH Terms: Clinical Coding, Humans, Insurance, Health, Reimbursement, International Classification of Diseases, Primary Health Care, Reproducibility of Results
Added January 26, 2016
OBJECTIVE - The general equivalence mappings (GEMs) and reimbursement mappings (RMs) facilitate translation between ICD-9-CM and ICD-10-CM. This study compared prospectively dual-encoded diagnoses assigned by professional coders with the GEMs/RMs in a clinical setting.
MATERIALS AND METHODS - Professional coders manually encoded diagnoses from 100 primary care notes into both ICD-9-CM and ICD-10-CM. The investigators evaluated whether manual mappings were reproducible using the GEMs/RMs. Reproducible mappings with one ICD-9-CM and one ICD-10-CM code ("one-to-one") were classified as exact or approximate using GEMs flags. Mismatches were characterized manually.
RESULTS - Manual encodings were reproducible from the forward GEMs, backward GEMs, and RMs in 85.2%, 90.4%, and 88.1% of diagnoses, respectively. For one-to-one, reproducible mappings, 61% (forward) and 63% (backward) were approximate mappings compared to 85% and 95% in the GEMs as a whole. Mismatches between manual and GEMs encodings were due to differences in coder interpretation (11%-13%), subtle hierarchical differences (52%-55%), or unknown reasons (32%-35%).
DISCUSSION - This study highlights inconsistencies between manual encoding and using the GEMs/RMs. The number of approximate mappings in our population compared to all one-to-one GEMs entries supports the notion that statistics describing the GEMs as a whole might not represent the most important mappings for each organization. The mismatch characteristics highlight the subtle differences between manual encoding and using the GEMs/RMs.
CONCLUSION - These results support the need for organizations to assess the GEMs and RMs in their own environment to avoid changes in reimbursement and longitudinal statistics.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
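The reproducibility measure described above, the share of manually coded diagnoses that a crosswalk would also produce, can be sketched as follows. The crosswalk here is a toy dictionary with placeholder codes ("S1", "T1", and so on); it is not drawn from the GEMs or RMs files, and the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def reproducible_fraction(
    manual_pairs: List[Tuple[str, str]],
    crosswalk: Dict[str, Set[str]],
) -> float:
    """Fraction of (source, target) manual encodings found in the crosswalk."""
    if not manual_pairs:
        raise ValueError("manual_pairs must not be empty")
    hits = sum(1 for src, tgt in manual_pairs if tgt in crosswalk.get(src, set()))
    return hits / len(manual_pairs)


# Toy data: placeholder codes only, not real ICD-9-CM or ICD-10-CM codes.
toy_crosswalk = {"S1": {"T1"}, "S2": {"T2a", "T2b"}, "S3": {"T3"}}
manual = [("S1", "T1"), ("S2", "T2b"), ("S3", "T9"), ("S4", "T4")]

print(reproducible_fraction(manual, toy_crosswalk))  # 0.5
```

In this toy run two of the four manual encodings are reproducible: one fails because the target differs and one because the source code has no crosswalk entry.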
0 Communities · 1 Member · 0 Resources · 6 MeSH Terms
Factors Associated With Increased In-Hospital Mortality Among Children With Intracerebral Hemorrhage.
Adil MM, Qureshi AI, Beslow LA, Malik AA, Jordan LC
(2015) J Child Neurol 30: 1024-8
MeSH Terms: Adolescent, Analysis of Variance, Cerebral Hemorrhage, Child, Child, Preschool, Databases, Factual, Female, Health Care Costs, Hospital Mortality, Humans, Infant, International Classification of Diseases, Logistic Models, Male, Retrospective Studies, Risk Factors, Young Adult
Added March 24, 2020
We assessed factors associated with mortality and potential targets for intervention in a large national sample of children with nontraumatic intracerebral hemorrhage. Using the Healthcare Cost and Utilization Project Kids' Inpatient Database, we used ICD-9-CM code 431 to identify children aged 1 to 18 years with nontraumatic intracerebral hemorrhage in 2003, 2006 and 2009. Intracerebral hemorrhage was the primary diagnosis for 1172 children (ages 1-18 years) over the 3-year sample. Factors associated with mortality based on multivariable logistic regression included Hispanic ethnicity (odds ratio 1.9, 95% confidence interval 1.1-3.3), older age (11-18 vs 1-10 years, odds ratio 2.5, 95% confidence interval 1.3-5.0), coagulopathy (odds ratio 3.0, 95% confidence interval 1.6-6.0), and coma (odds ratio 9.0, 95% confidence interval 3.2-24.6). From 2003 to 2009, there was a non-significant decrease in mortality with a significant increase in length of stay from 9 to 11 days (P < .003). In children with intracerebral hemorrhage, coma and coagulopathy had the strongest association with mortality; coagulopathy is a potentially modifiable risk factor.
© The Author(s) 2014.
0 Communities · 1 Member · 0 Resources · MeSH Terms