biomedical and healthCAre Data Discovery Index Ecosystem
Title: RegTools candidate cis somatic variants affecting splicing
aggregation:
composite digital object
privacy:
not applicable
refinement:
uncurated
ID:
arn:aws:s3:::regtools-cse-variants
storedIn:
Amazon
dateModified:
02-22-2019
availability:
Available
creators:
Malachi Griffith
keywords:
Cis somatic splicing variants
dateReleased:
02-22-2019
description:
Somatic variants with potential cis splice effects for TCGA cases with RNA-seq and exome data available in the Genomic Data Commons
types:
Unspecified
authors:
Yang-Yang Feng, Avinash Ramu, Kelsy C Cotto, Zachary L Skidmore, Jason Kunisaki, Donald F Conrad, Yiing Lin, William Chapman, Ravindra Uppaluri, Ramaswamy Govindan, Obi L Griffith, Malachi Griffith
publicationVenue:
bioRxiv (Cold Spring Harbor Laboratory Press)
description:
Cancer is caused by somatic mutations within the genome of an initiating cell. These mutations take many forms including small single base substitutions, large insertions and deletions, chromosomal rearrangements, and so on. Mutations also vary with respect to their position relative to annotated gene loci. Some mutations occur within exons and have direct and readily predicted effects on protein sequence and function. Other mutations affect gene function indirectly by occurring within regulatory regions that influence gene expression and RNA splicing. Next-generation sequencing has transformed the potential to explore the mutational landscapes of human cancers. However, rapid creation of massive complex datasets and a dearth of established methods for integrated analysis of these data have resulted in a critical research bottleneck. To date, research has focused heavily on the most easily detected and interpreted coding mutations occurring within known exons. Mutations in non-coding genes and regulatory elements that govern gene expression and splicing have been largely overlooked. Similarly, interpretation of the clinical significance of mutations has been limited to a handful of the most well-characterized, recurrently mutated 'hot-spots' of certain genes. The proposed project will develop new tools to identify and characterize mutations with regulatory rather than protein-coding consequences. Furthermore, we will develop resources to help the research community interpret the possible clinical relevance of these mutations. To explore these knowledge gaps and test our new tools, we will apply them to a cohort of tumor samples from ongoing large-scale genome/transcriptome sequencing projects at the Genome Institute. We have preliminary data to suggest that progression of these tumors may be driven by currently unknown regulatory mutations and that a subset of these may suggest novel therapeutic strategies.
The Genome Institute at Washington University School of Medicine is one of the few places in the world that successfully combines close interaction of physician scientists with a large-scale genome sequencing facility and world-class computing infrastructure. The Genome Institute is a leader in the development of sequencing methods and bioinformatics tools needed for the proposed work. This is demonstrated by the candidate's comprehensive preliminary results. The candidate's mentor, Dr. Richard Wilson, has an established track record of mentoring genomics scientists. Dr. Wilson has helped the candidate to establish an outstanding mentoring committee with the interdisciplinary skills needed to guide him in the proposed research. Dr. Wilson, along with these additional mentors, will collaboratively support and guide the candidate towards a successful independent career. The first specific aim, to be completed during the mentored phase, will create new methods for integration of whole genome and transcriptome data as well as annotation and prioritization of somatic events. Particular emphasis will be placed on the characterization of non-coding mutations that affect gene regulation and splicing. The independent phase will move towards development of novel resources to help researchers interpret mutations in a clinical context. In both phases, the candidate's research will focus heavily on the bioinformatics aspect of these problems in a way that has minimal overlap with, but is highly complementary to, the mentor's research program. In the long term, the candidate hopes to fill a growing need for bioinformatics investigators working in the area of cancer genomics.
ID:
https://doi.org/10.1101/436634
acknowledges:
INTEGRATED ANALYSIS & INTERPRETATION OF WHOLE GENOME, EXOME & TRANSCRIPTOME SEQUENCE DATA (NIH, NHGRI, HG007940)
title:
RegTools: Integrated analysis of genomic and transcriptomic data for discovery of splicing variants in cancer
dateReleased:
11-25-2018
name:
Somatic variants and their allelic fractions
bearerOfDisease:
cancer
name:
Data from Genomic Data Commons corresponding to sequenced tumor DNA and RNA
name:
Creative Commons Public Domain Dedication (CC0 1.0 Universal)
landingPage: https://creativecommons.org/publicdomain/zero/1.0/
identifier:
HG007940
funders:
NIH (NHGRI)
description:
This work was supported by a grant to Malachi Griffith from the National Human Genome Research Institute (NHGRI) of the NIH under award number R00HG007940. Creation of these data objects was supported by the NIH BD2K Cloud Credits Commons Pilot Program (CCREQ-2017-03-00071).
count:
1
primary:
True
title:
Annotated TCGA exon-exon junctions (V1)
storedIn:
Amazon AWS (S3)
description:
Somatic variants with potential cis splice effects for TCGA cases with RNA-seq and exome data available in the Genomic Data Commons
size:
4500
unit:
MB
ID:
SCR:016270
name:
CEDAR Workbench
abbreviation:
CEDAR
homePage: https://cedar.metadatacenter.org
  • K01 AG044439/AG/NIA NIH HHS/United States
