The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
Summary - Even after the introduction of high-throughput sequencing, genotyping arrays continue to be a viable source for conducting large-scale genetic studies. Currently, Illumina is one of the largest genotyping array manufacturers. One technical issue that has always plagued the post-processing of Illumina genotyping array data is the strand definition. Against convention, Illumina uses its own definition of strand, which is inconsistent with the standard reference forward and reverse definition. This issue has been a major obstacle to consistent reporting, meta-analysis and correct interpretation of phenotype association results. To date, the strand issue has not been adequately addressed, prompting us to develop StrandScript, a tool that can convert all genotyping data generated from Illumina genotyping arrays to the reference forward strand. StrandScript works independently of the Illumina array version and is future-proof for newer Illumina array designs. Furthermore, StrandScript can examine an Illumina genotyping array manifest file and detect all problematic SNPs, including SNPs with a wrong RS ID and SNPs with mismatched probe sequences. Here, we introduce StrandScript's design and development, and demonstrate its effectiveness using real genotyping data.
Availability and Implementation - https://github.com/seasky002002/Strandscript.
Contact - firstname.lastname@example.org.
Supplementary information - Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: email@example.com
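The strand conversion described in the StrandScript summary above can be illustrated with a minimal sketch. It assumes the strand of each SNP relative to the reference is already known, and it does not reproduce StrandScript's own method or its handling of Illumina manifest files. The function names `to_forward` and `is_strand_ambiguous` are hypothetical, chosen only for this illustration.

```python
# Minimal sketch (not StrandScript's algorithm): report alleles on the
# reference forward strand, given an already-known strand per SNP.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_forward(alleles, strand):
    """Return the allele pair as read on the reference forward strand.

    `strand` is "+" when the alleles are already forward and "-" when
    they were reported on the reverse strand (hypothetical encoding).
    """
    if strand == "+":
        return tuple(alleles)
    if strand == "-":
        # Reverse-strand alleles are complemented base by base.
        return tuple(COMPLEMENT[a] for a in alleles)
    raise ValueError(f"unknown strand: {strand!r}")


def is_strand_ambiguous(alleles):
    """True for A/T and C/G SNPs, whose two alleles are complements.

    For these SNPs the allele pair looks identical on both strands, so
    the strand cannot be inferred from the alleles alone.
    """
    first, second = alleles
    return COMPLEMENT[first] == second
```

For example, `to_forward(("A", "G"), "-")` complements both alleles, while an A/T SNP is flagged by `is_strand_ambiguous` because flipping the strand leaves the allele pair unchanged.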
Illumina genotyping arrays have powered thousands of large-scale genome-wide association studies over the past decade. Yet, because of the tremendous volume and complicated genetic assumptions of Illumina genotyping data, processing and quality control (QC) of these data remain a challenge. Thorough QC ensures the accurate identification of single-nucleotide polymorphisms and is required for the correct interpretation of genetic association results. By processing genotyping data on >100 000 subjects from >10 major Illumina genotyping arrays, we have accumulated extensive experience in handling some of the most peculiar scenarios related to the processing and QC of Illumina genotyping data. Here, we describe strategies for processing Illumina genotyping data from the raw data to an analysis-ready format, and we elaborate on the QC procedures required at each processing step. High-quality Illumina genotyping data sets can be obtained by following our detailed QC strategies.
Gastric cancer (GC) is a leading cause of cancer-related deaths worldwide. The Tff1 knockout (KO) mouse model develops gastric lesions that include low-grade dysplasia (LGD), high-grade dysplasia (HGD), and adenocarcinomas. In this study, we used Affymetrix microarray gene expression platforms to analyze molecular signatures in the mouse stomach [Tff1-KO (LGD) and Tff1 wild-type (normal)] and in human gastric cancer tissues and their adjacent normal tissue samples. Integrated bioinformatics analysis of the mouse and human datasets indicated that 172 genes were consistently deregulated in both human gastric cancer samples and Tff1-KO LGD lesions (P < .05). Using Ingenuity pathway analysis, these genes mapped to important transcription networks that include MYC, STAT3, β-catenin, RELA, NFATC2, HIF1A, and ETS1 in both human and mouse. Further analysis demonstrated activation of FOXM1 and inhibition of TP53 transcription networks in human gastric cancers but not in Tff1-KO LGD lesions. Using real-time RT-PCR, we validated the deregulated expression of several genes (VCAM1, BGN, CLDN2, COL1A1, COL1A2, COL3A1, EpCAM, IFITM1, MMP9, MMP12, MMP14, PDGFRB, PLAU, and TIMP1) that map to altered transcription networks in both mouse and human gastric neoplasia. Our study demonstrates significant similarities in deregulated transcription networks in human gastric cancer and gastric tumorigenesis in the Tff1-KO mouse model. The data also suggest that activation of MYC, STAT3, RELA, and β-catenin transcription networks could be an early molecular step in gastric carcinogenesis.
© 2017 Wiley Periodicals, Inc.
Ethnic groups can display differential genetic susceptibility to infectious diseases. The arthropod-borne viral dengue disease is one such disease, with empirical and limited genetic evidence showing that African ancestry may be protective against the haemorrhagic phenotype. Global ancestry analysis based on high-throughput genotyping in admixed populations can be used to test this hypothesis, while admixture mapping can map candidate protective genes. A Cuban dengue fever cohort was genotyped using a 2.5 million SNP chip. Global ancestry was ascertained through ADMIXTURE and used in a fine-matched corrected association study, while local ancestry was inferred by the RFMix algorithm. The expression of candidate genes was evaluated by RT-PCR in a Cuban dengue patient cohort and gene set enrichment analysis was performed in a Thai dengue transcriptome. OSBPL10 and RXRA candidate genes were identified, with the most significant SNPs placed in inferred weak enhancers, promoters and lncRNAs. OSBPL10 had significantly lower expression in Africans than Europeans, while for RXRA several SNPs may differentially regulate its transcription between Africans and Europeans. Their expression was confirmed to change through dengue disease progression in Cuban patients and to vary with disease severity in a Thai transcriptome dataset. These genes interact in the LXR/RXR activation pathway that integrates lipid metabolism and immune functions, being a key player in dengue virus entrance into cells, its replication therein and in cytokine production. Knockdown of OSBPL10 expression in THP-1 cells by two shRNAs followed by DENV2 infection tests led to a significant reduction in DENV replication, providing direct functional proof that the lower OSBPL10 expression profile in Africans protects this ancestry against dengue disease.
BACKGROUND - The burden of subclinical atherosclerosis in asymptomatic individuals is heritable and associated with elevated risk of developing clinical coronary heart disease. We sought to identify genetic variants in protein-coding regions associated with subclinical atherosclerosis and the risk of subsequent coronary heart disease.
METHODS AND RESULTS - We studied a total of 25 109 European ancestry and African ancestry participants with coronary artery calcification (CAC) measured by cardiac computed tomography and 52 869 participants with common carotid intima-media thickness measured by ultrasonography within the CHARGE Consortium (Cohorts for Heart and Aging Research in Genomic Epidemiology). Participants were genotyped for 247 870 DNA sequence variants (231 539 in exons) across the genome. A meta-analysis of exome-wide association studies was performed across cohorts for CAC and carotid intima-media thickness. APOB p.Arg3527Gln was associated with 4-fold excess CAC (P=3×10). The APOE ε2 allele (p.Arg176Cys) was associated with both 22.3% reduced CAC (P=1×10) and 1.4% reduced carotid intima-media thickness (P=4×10) in carriers compared with noncarriers. In secondary analyses conditioning on low-density lipoprotein cholesterol concentration, the ε2 protective association with CAC, although attenuated, remained strongly significant. Additionally, the presence of ε2 was associated with reduced risk for coronary heart disease (odds ratio 0.77; P=1×10).
CONCLUSIONS - Exome-wide association meta-analysis demonstrates that protein-coding variants in APOB and APOE associate with subclinical atherosclerosis. APOE ε2 represents the first significant association for multiple subclinical atherosclerosis traits across multiple ethnicities, as well as clinical coronary heart disease.
© 2016 American Heart Association, Inc.
Defining molecular features that can predict the recurrence of colorectal cancer (CRC) for stage II-III patients remains challenging in cancer research. Most available clinical samples are Formalin-Fixed, Paraffin-Embedded (FFPE). NanoString nCounter® and Affymetrix GeneChip® Human Transcriptome Array 2.0 (HTA) are the two platforms marketed for high-throughput gene expression profiling of FFPE samples. In this study, we evaluated the expression of 516 genes from published frozen tissue-derived prognostic signatures in 42 FFPE CRC samples measured by both platforms. Based on HTA platform-derived data, we identified both gene (99 individual genes, FDR < 0.05) and gene set (four of the six reported multi-gene signatures with sufficient information for evaluation, P < 0.05) expression differences associated with survival outcomes. Using nCounter platform-derived data, one of the six multi-gene signatures (P < 0.05) but no individual gene was associated with survival outcomes. Our study indicated that sufficiently high quality RNA could be obtained from FFPE tumor tissues to detect frozen tissue-derived prognostic gene expression signatures for CRC patients.
Meta-analyses of association results for blood pressure using exome-centric single-variant and gene-based tests identified 31 new loci in a discovery stage among 146,562 individuals, with follow-up and meta-analysis in 180,726 additional individuals (total n = 327,288). These blood pressure-associated loci are enriched for known variants for cardiometabolic traits. Associations were also observed for the aggregation of rare and low-frequency missense variants in three genes, NPR1, DBH, and PTPMT1. In addition, blood pressure associations at 39 previously reported loci were confirmed. The identified variants implicate biological pathways related to cardiometabolic traits, vascular function, and development. Several new variants are inferred to have roles in transcription or as hubs in protein-protein interaction networks. Genetic risk scores constructed from the identified variants were strongly associated with coronary disease and myocardial infarction. This large collection of blood pressure-associated loci suggests new therapeutic strategies for hypertension, emphasizing a link with cardiometabolic risk.
The prognosis of colorectal cancer (CRC) stage II and III patients remains a challenge due to the difficulties of finding robust biomarkers suitable for testing clinical samples. The majority of published gene signatures of CRC have been generated on fresh frozen colorectal tissues. Because collection of frozen tissue is not practical for routine surgical pathology practice, a clinical test that improves prognostic capabilities beyond standard pathological staging of colon cancer will need to be designed for formalin-fixed paraffin-embedded (FFPE) tissues. The NanoString nCounter® platform is a gene expression analysis tool developed for use with FFPE-derived samples. We designed a custom nCounter® codeset based on elements from multiple published fresh frozen tissue microarray-based prognostic gene signatures for colon cancer, and we used this platform to systematically compare gene expression data from FFPE with matched microarray data from frozen tissues. Our results show moderate correlation of gene expression between the two platforms and identify a small subset of genes as candidate biomarkers for colon cancer prognosis that are detectable and quantifiable in FFPE tissue sections.
PURPOSE - Age-related macular degeneration (AMD) is a common form of vision loss affecting older adults. The etiology of AMD is multifactorial and is influenced by environmental and genetic risk factors. In this study, we examine how 19 common risk variants contribute to drusen progression, a hallmark of AMD pathogenesis.
METHODS - Exome chip data was made available through the International AMD Genomics Consortium (IAMDGC). Drusen quantification was carried out with color fundus photographs using an automated drusen detection and quantification algorithm. A genetic risk score (GRS) was calculated per subject by summing risk allele counts at 19 common genetic risk variants weighted by their respective effect sizes. Pathway analysis of drusen progression was carried out with the software package Pathway Analysis by Randomization Incorporating Structure.
RESULTS - We observed a significant correlation between drusen baseline area and the GRS in the age-related eye disease study (AREDS) dataset (ρ = 0.175, P = 0.006). Measures of association were not statistically significant between drusen progression and the GRS (P = 0.54). Pathway analysis revealed the cell adhesion molecules pathway as the most highly significant pathway associated with drusen progression (corrected P = 0.02).
CONCLUSIONS - In this study, we explored the potential influence of known common AMD genetic risk factors on drusen progression. Our results from the GRS analysis showed association of increasing genetic burden (from 19 AMD associated loci) to baseline drusen load but not drusen progression in the AREDS dataset while pathway analysis suggests additional genetic contributors to AMD risk.
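The genetic risk score described in the METHODS above, risk allele counts summed across variants and weighted by effect size, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `genetic_risk_score` and the example numbers are hypothetical and are not taken from the study, which used 19 variants and its own effect-size estimates.

```python
# Minimal sketch of a weighted genetic risk score (GRS). Names and
# numbers are hypothetical; they are not the study's data or code.
def genetic_risk_score(risk_allele_counts, effect_sizes):
    """Sum each variant's risk allele count weighted by its effect size.

    risk_allele_counts: per-variant counts of the risk allele (0, 1 or 2)
    effect_sizes: per-variant weights, in the same variant order
    """
    if len(risk_allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is required per variant")
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk allele count: {count!r}")
    return sum(c * w for c, w in zip(risk_allele_counts, effect_sizes))


# Made-up three-variant example: 0*0.5 + 1*0.2 + 2*0.1
example_score = genetic_risk_score([0, 1, 2], [0.5, 0.2, 0.1])
```

With weights expressed on an additive scale, a subject carrying more risk alleles at strongly associated variants receives a higher score, which is the quantity correlated against drusen measures in the abstract.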
Serotonergic anorexigens are the primary pharmacologic risk factor associated with pulmonary arterial hypertension (PAH), and the resulting PAH is clinically indistinguishable from the heritable form of disease, associated with BMPR2 mutations. Both BMPR2 mutation and agonists to the serotonin receptor HTR2B have been shown to cause activation of SRC tyrosine kinase; conversely, antagonists to HTR2B inhibit SRC trafficking and downstream function. To test the hypothesis that an HTR2B antagonist can prevent BMPR2 mutation-induced PAH by restricting aberrant SRC trafficking and downstream activity, we exposed BMPR2 mutant mice, which spontaneously develop PAH, to an HTR2B antagonist, SB204741, to block the SRC activation caused by BMPR2 mutation. SB204741 prevented the development of PAH in BMPR2 mutant mice, reduced recruitment of inflammatory cells to their lungs, and reduced muscularization of their blood vessels. By atomic force microscopy, we determined that BMPR2 mutant mice normally had a doubling of vessel stiffness, which was substantially normalized by HTR2B inhibition. SB204741 reduced SRC phosphorylation and downstream activity in BMPR2 mutant mice. Gene expression arrays indicate that the primary changes were in cytoskeletal and muscle contractility genes. These results were confirmed by gel contraction assays showing that HTR2B inhibition nearly normalizes the 400% increase in gel contraction normally seen in BMPR2 mutant smooth muscle cells. Heritable PAH results from increased SRC activation, cellular contraction, and vascular resistance, but antagonism of HTR2B prevents SRC phosphorylation, downstream activity, and PAH in BMPR2 mutant mice.