The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
PURPOSE - As deep neural networks achieve more success in the wide field of computer vision, greater emphasis is being placed on the generalizability of these models for production deployment. With sufficiently large training datasets, models can typically avoid overfitting their data; however, for medical imaging it is often difficult to obtain enough data from a single site. Sharing data between institutions is also frequently nonviable or prohibited due to security measures and research compliance constraints, enforced to guard protected health information (PHI) and patient anonymity.
METHODS - In this paper, we implement cyclic weight transfer with independent datasets from multiple geographically disparate sites without compromising PHI. We compare results between single-site learning (SSL) and multisite learning (MSL) models on testing data drawn from each of the training sites as well as two other institutions.
RESULTS - The MSL model attains an average Dice similarity coefficient (DSC) of 0.690 on the holdout institution datasets with a volume correlation of 0.914. These correspond to statistically significant improvements of 7% and 5%, respectively, over the average of both SSL models, which attained an average DSC of 0.646 and an average correlation of 0.871.
CONCLUSIONS - We show that a neural network can be efficiently trained on data from two physically remote sites without consolidating patient data to a single location. The resulting network improves model generalization and achieves higher average DSCs on external datasets than neural networks trained on data from a single source.
© 2019 American Association of Physicists in Medicine.
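The cyclic weight transfer idea described in the abstract above can be illustrated with a minimal sketch. The toy below is an assumption-laden stand-in, not the authors' implementation: it uses a two-parameter logistic regression instead of a deep network, and the names `train_epoch`, `cyclic_weight_transfer`, `site_a`, and `site_b`, the learning rate, the cycle count, and all data values are hypothetical. What it shows is the pattern itself: only the model weights travel between sites, while each site's samples stay local.

```python
import math


def train_epoch(weights, samples, lr=0.1):
    """One pass of SGD for logistic regression on one site's local data."""
    w = list(weights)
    for features, label in samples:
        z = sum(wi * xi for wi, xi in zip(w, features))
        pred = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(features):
            w[i] -= lr * (pred - label) * xi
    return w


def cyclic_weight_transfer(sites, n_cycles=50, lr=0.1):
    """Pass the weights from site to site; samples never leave their site."""
    n_features = len(sites[0][0][0])
    w = [0.0] * n_features
    for _ in range(n_cycles):
        for local_data in sites:  # each site resumes from the previous site's weights
            w = train_epoch(w, local_data, lr)
    return w


def accuracy(weights, samples):
    """Fraction of samples whose predicted class matches the label."""
    correct = 0
    for features, label in samples:
        z = sum(wi * xi for wi, xi in zip(weights, features))
        correct += int((z > 0) == (label == 1))
    return correct / len(samples)


# Hypothetical toy data: features are (bias term, x); label is 1 when x > 0.
site_a = [((1.0, -2.0), 0), ((1.0, -1.0), 0), ((1.0, 1.5), 1), ((1.0, 2.5), 1)]
site_b = [((1.0, -3.0), 0), ((1.0, -0.5), 0), ((1.0, 0.5), 1), ((1.0, 3.0), 1)]

weights = cyclic_weight_transfer([site_a, site_b])
print(accuracy(weights, site_a), accuracy(weights, site_b))
```

In a realistic setting the weight list would be a serialized network checkpoint, and the number of local epochs per visit would be a tuning choice; neither is specified in the abstract.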
PURPOSE - Manually tracing regions of interest (ROIs) within the liver is the de facto standard method for measuring liver attenuation on computed tomography (CT) in diagnosing nonalcoholic fatty liver disease (NAFLD). However, manual tracing is resource intensive. To address these limitations and to expand the availability of a quantitative CT measure of hepatic steatosis, we propose the automatic liver attenuation ROI-based measurement (ALARM) method for automated liver attenuation estimation.
METHODS - The ALARM method consists of two major stages: (a) deep convolutional neural network (DCNN)-based liver segmentation and (b) automated ROI extraction. First, liver segmentation was achieved using our previously developed SS-Net. Then, a single central ROI (center-ROI) and three circular ROIs (periphery-ROI) were computed based on liver segmentation and morphological operations. The ALARM method is available as an open source Docker container (https://github.com/MASILab/ALARM).
RESULTS - Two hundred and forty-six subjects with 738 abdominal CT scans from the African American-Diabetes Heart Study (AA-DHS) were used for external validation (testing), independent from the training and validation cohort (100 clinically acquired abdominal CT scans). In the correlation analyses, the proposed ALARM method achieved a Pearson correlation of 0.94 with manual liver attenuation estimates. When evaluating the ALARM method for detection of NAFLD using the traditional cut point of < 40 HU, the center-ROI achieved "substantial" agreement (Kappa = 0.79) with manual estimation, while the periphery-ROI method achieved "excellent" agreement (Kappa = 0.88) with manual estimation. The automated ALARM method had reduced variability compared to manual measurements, as indicated by a smaller standard deviation.
CONCLUSIONS - We propose a fully automated liver attenuation estimation method termed ALARM by combining DCNN and morphological operations, which achieved "excellent" agreement with manual estimation for fatty liver detection. The entire pipeline is implemented as a Docker container which enables users to achieve liver attenuation estimation in five minutes per CT exam.
© 2019 American Association of Physicists in Medicine.
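The agreement analysis in the abstract above (classifying scans with the < 40 HU cut point and comparing automated against manual calls with Kappa) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `fatty_liver` and `cohen_kappa` are hypothetical, and the attenuation values are invented toy numbers, not study data.

```python
def fatty_liver(attenuation_hu, cut_point=40.0):
    """Flag a scan as fatty liver when attenuation is below the cut point (HU)."""
    return attenuation_hu < cut_point


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    # Undefined when expected agreement is exactly 1 (both raters use one category).
    return (observed - expected) / (1 - expected)


# Hypothetical liver attenuation values in HU (illustration only).
manual_hu = [55.0, 38.0, 62.0, 30.0, 41.0, 58.0, 35.0, 47.0]
auto_hu = [54.0, 39.0, 61.0, 33.0, 39.0, 57.0, 36.0, 48.0]

manual_calls = [fatty_liver(v) for v in manual_hu]
auto_calls = [fatty_liver(v) for v in auto_hu]
kappa = cohen_kappa(manual_calls, auto_calls)
print(round(kappa, 2))  # 0.75 for this toy data
```

The single disagreement in the toy data comes from a value near the cut point (41 HU manual versus 39 HU automated), which is where threshold-based agreement is most sensitive to small measurement differences.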
PURPOSE - The purpose of this study was to evaluate whether higher quantity, diversity, and grammatical informativeness of verb phrases in parent follow-in utterances (i.e., utterances that mapped onto child attentional leads) were significantly related to later expressive verb vocabulary in children with autism spectrum disorder (ASD).
METHOD - We examined these associations in a sample of 31 toddlers with ASD and their parents in a longitudinal correlational study. Key aspects of parents' verb input were measured in 2 video-recorded 15-min parent-child free-play sessions. Child expressive verb vocabulary was measured using parent report.
RESULTS - An aggregate variable composed of the quantity, diversity, and grammatical informativeness of parent verb input in follow-in utterances across the 2 parent-child sessions strongly and positively predicted later child expressive verb vocabulary, total R² = .25, even when early child expressive verb vocabulary was controlled, R² change = .17. Parent follow-in utterances without verbs were not significantly related to later child expressive verb vocabulary, R² = .001.
CONCLUSIONS - These correlational findings are initial steps toward developing a knowledge base for how strong verb vocabulary skills might be facilitated in children with ASD.
The relation between caregiver follow-in utterances with verbs presented in different states of dyadic engagement and later child expressive verb vocabulary was examined in 29 toddlers with autism spectrum disorder (ASD) and their caregivers. Caregiver verb input in follow-in utterances presented during higher order supported joint engagement (HSJE) accounted for a significant, large amount of variance in later child verb vocabulary, R² = .26. This relation remained significant when controlling for early verb vocabulary or verb input in lower support engagement states. Other types of talk in follow-in utterances in HSJE did not correlate with later verb vocabulary. These findings are an important step towards identifying interactional contexts that facilitate verb learning in children with ASD.
We describe functional and structural data acquired using a 3T scanner in a sample of 132 typically developing children, who were scanned when they were approximately 11 years old (i.e. Time 1). Sixty-three of them were scanned again approximately 2 years later (i.e. Time 2). Children performed four tasks inside the scanner: two arithmetic tasks and two localizer tasks. The arithmetic tasks were a single-digit multiplication and a single-digit subtraction task. The localizer tasks, a written rhyming judgment task and a numerosity judgment task, were used to independently identify verbal and quantity brain areas, respectively. Additionally, we provide data on behavioral performance on the tasks inside the scanner, participants' scores on standardized tests, including reading and math skill, and a developmental history questionnaire completed by parents. This dataset could be useful to answer questions regarding the neural bases of the development of math in children and its relation to individual differences in skill. The data, entitled "Brain Correlates of Math Development", are freely available from OpenNeuro (https://openneuro.org).
OBJECTIVE - To study systemic lupus erythematosus (SLE) using electronic health records (EHRs), algorithms are needed to accurately identify patients with SLE. We used machine learning to generate data-driven SLE EHR algorithms and assessed the performance of existing rule-based algorithms.
METHODS - We randomly selected subjects with ≥ 1 SLE ICD-9/10 codes from our EHR and identified gold standard definite and probable SLE cases by chart review, based on 1997 ACR or 2012 SLICC Classification Criteria. From a training set, we extracted coded and narrative concepts using natural language processing and generated algorithms using penalized logistic regression to classify definite or definite/probable SLE. We assessed predictive characteristics in internal and external cohort validations. We also tested performance characteristics of published rule-based algorithms with pre-specified permutations of ICD-9 codes, laboratory tests and medications in our EHR.
RESULTS - At a specificity of 97%, our machine learning coded algorithm had 90% positive predictive value (PPV) and 64% sensitivity for definite SLE, and 92% PPV and 47% sensitivity for definite/probable SLE. In the external validation, at 97% specificity, the definite/probable algorithm had 94% PPV and 60% sensitivity. Adding NLP concepts did not improve performance metrics. The PPVs of published rule-based algorithms ranged from 45% to 79% in our EHR.
CONCLUSION - Our machine learning SLE algorithms performed well in internal and external validation. Rule-based SLE algorithms did not transport as well to our EHR. Unique EHR characteristics, clinical practices and research goals regarding the desired sensitivity and specificity of the case definition must be considered when applying algorithms to identify SLE patients.
Copyright © 2019 Elsevier Inc. All rights reserved.
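The performance measures reported in the abstract above (sensitivity, specificity, and PPV) all derive from the four cells of a confusion matrix built against the chart-review gold standard. The sketch below shows those definitions; the function name `classification_metrics` and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics for a binary case-finding algorithm."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm flags
        "specificity": tn / (tn + fp),  # share of non-cases the algorithm clears
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
    }


# Hypothetical counts (illustration only): 60 true cases and 140 non-cases.
metrics = classification_metrics(tp=45, fp=5, fn=15, tn=135)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV depends on how common true cases are among the patients screened, which is one reason an algorithm's PPV can change when it is moved to a different EHR.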
Placental dysfunction is implicated in many pregnancy complications, including preeclampsia and preterm birth (PTB). While both these syndromes are influenced by environmental risk factors, they also have a substantial genetic component that is not well understood. Precisely controlled gene expression during development is crucial to proper placental function and is often mediated through gene regulatory enhancers. However, we lack accurate maps of placental enhancer activity due to the challenges of assaying the placenta and the difficulty of comprehensively identifying enhancers. To address the gap in our knowledge of gene regulatory elements in the placenta, we used a two-step machine learning pipeline to synthesize existing functional genomics studies, transcription factor (TF) binding patterns, and evolutionary information to predict placental enhancers. The trained classifiers accurately distinguish enhancers from the genomic background and placental enhancers from enhancers active in other tissues. Genomic features collected from tissues and cell lines involved in pregnancy are the most predictive of placental regulatory activity. Applying the classifiers genome-wide enabled us to create a map of 33,010 predicted placental enhancers, including 4,562 high-confidence enhancer predictions. The genome-wide placental enhancers are significantly enriched near genes associated with placental development and birth disorders and for SNPs associated with gestational age. These genome-wide predicted placental enhancers provide candidate regions for further testing in vitro, will assist in guiding future studies of genetic associations with pregnancy phenotypes, and will aid interpretation of potential mechanisms of action for variants found through genetic studies.
BACKGROUND - The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. The application of a machine learning approach to the analysis of CpG methylation in the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we now report that a set of smoking-associated DNA-methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population.
RESULTS - We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E-07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample employing 408,583 CpGs. A set of 698 CpGs was selected and was predictive of high HIV frailty in a testing sample (area under the curve [AUC] = 0.73, 95% CI 0.63 to 0.83), a result that was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46; 95% CI 1.06 to 2.02, p = 0.02). Interestingly, the 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E-05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion.
CONCLUSION - We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV.
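The AUC values quoted in the abstract above summarize how well a score separates two groups. As a minimal sketch of what the statistic measures, the function below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as half). The function name `auc` and the scores and labels are hypothetical and unrelated to the study's data.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and outcome labels (1 = high frailty), illustration only.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(auc(scores, labels), 3))  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation; the pairwise loop is quadratic in sample size, so a rank-based formula is preferable for large cohorts.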
BACKGROUND - Olfactory dysfunction is a common symptom of chronic rhinosinusitis (CRS). We previously identified several cytokines potentially linked to smell loss, supporting an inflammatory etiology for CRS-associated olfactory dysfunction. In the current study we sought to validate patterns of olfactory dysfunction in CRS using hierarchical cluster analysis, machine learning algorithms, and multivariate regression.
METHODS - CRS patients undergoing functional endoscopic sinus surgery were administered the Smell Identification Test (SIT) preoperatively. Mucus was collected from the middle meatus using an absorbent polyurethane sponge and 17 inflammatory mediators were assessed using a multiplexed flow-cytometric bead assay. Hierarchical cluster analysis was performed to characterize inflammatory patterns and their association with SIT scores. The random forest approach was used to identify cytokines predictive of olfactory function.
RESULTS - One hundred ten patients were enrolled in the study. Hierarchical cluster analysis identified 5 distinct CRS clusters with statistically significant differences in SIT scores observed between individual clusters (p < 0.001). A majority of anosmic patients were found in a single cluster, which was additionally characterized by nasal polyposis (100%) and a high incidence of allergic fungal rhinosinusitis (50%) and aspirin-exacerbated respiratory disease (AERD) (33%). A random forest approach identified a strong association between olfaction and the cytokines interleukin (IL)-5 and IL-13. Multivariate modeling identified AERD, computed tomography (CT) score, and IL-2 as the variables most predictive of olfactory function.
CONCLUSION - Olfactory dysfunction is associated with specific CRS endotypes characterized by severe nasal polyposis, tissue eosinophilia, and AERD. Mucus IL-2 levels, CT score, and AERD were independently associated with smell loss.
© 2018 ARS-AAOA, LLC.
Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor (TF) binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution.
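One common way to expose short sequence patterns to a classifier such as an SVM is to count k-mers (all substrings of length k) in each DNA sequence. The abstract above does not state which feature representation was used, so the sketch below is an assumption for illustration only; the function name `kmer_counts`, the choice of k, and the example sequence are hypothetical.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical sequence (illustration only).
counts = kmer_counts("ACGTAC", k=2)
print(dict(counts))  # {'AC': 2, 'CG': 1, 'GT': 1, 'TA': 1}
```

Because the k-mer vocabulary is the same for every genome, features built this way are directly comparable across species, which is what allows a model trained on sequences from one species to be applied unchanged to another.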