biomedical and healthCAre Data Discovery Index Ecosystem
Title: Enhanced whole exome sequencing by higher DNA insert lengths
dateReleased:
07-28-2016
description:
Background: Whole exome sequencing (WES) has proven to be a valuable basis for applications such as variant calling and copy number variation (CNV) analysis. For these analyses, read coverage should be evenly balanced throughout protein-coding regions at sufficient read depth. Unfortunately, WES is known for uneven coverage within coding regions, caused by GC-rich regions or off-target enrichment. Results: To examine the irregularities of WES within genes, we applied Agilent SureSelectXT exome capture to human samples and sequenced them on an Illumina platform in 2x101 paired-end mode. Because we suspected the sequenced insert length to be crucial to the uneven coverage of exome-captured samples, we sheared 12 genomic DNA samples to two different DNA insert lengths, 130 bp and 170 bp. Interestingly, although mean coverage of target regions was clearly higher in the 130 bp samples, coverage was more even in the 170 bp samples. Moreover, merging overlapping paired-end reads improved evenness, indicating that overlapping reads are another cause of the unevenness. In addition, mutation analysis was performed on a subset of the samples. In these isogenic subclones, almost twice as many mutations failed in the 130 bp samples as in the 170 bp samples. Visual inspection of the discarded mutation sites revealed low coverage at these sites, embedded in regions with high-amplitude fluctuations in coverage depth. Conclusions: Producing longer inserts could be a good strategy for achieving more uniform read coverage in coding regions, thereby increasing the effective sequencing yield and providing an improved basis for variant calling and CNV analyses.
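The description attributes part of the unevenness to overlapping paired-end reads. As a rough illustration only (not the authors' method), the sketch below works out how many bases the two mates of a 2x101 pair would share at each insert length. It assumes "insert length" means the full fragment between the adapters and that every fragment has exactly the nominal length; the function name `paired_end_overlap` is hypothetical.

```python
def paired_end_overlap(read_length: int, insert_length: int) -> int:
    """Number of insert bases sequenced by both mates of a read pair.

    Assumption: insert_length is the full fragment between adapters,
    and both mates are read_length bases long.
    """
    # Mates read inward from opposite ends of the insert, so they share
    # 2 * read_length - insert_length bases, bounded below by 0 (no overlap)
    # and above by the insert itself (insert shorter than one read).
    return max(0, min(2 * read_length - insert_length, insert_length))


READ_LENGTH = 101  # 2x101 paired-end mode, as stated in the description

for insert in (130, 170):
    overlap = paired_end_overlap(READ_LENGTH, insert)
    share = overlap / (2 * READ_LENGTH)
    print(f"{insert} bp insert: {overlap} bp overlap "
          f"({share:.1%} of sequenced bases cover the same positions twice)")
```

Under these assumptions, the shorter inserts spend a noticeably larger share of each read pair on positions that the other mate already covers, which is consistent with the description's observation that merging overlapping reads improved evenness.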
privacy:
not applicable
aggregation:
instance of dataset
ID:
E-MTAB-4527
refinement:
raw
dateSubmitted:
04-08-2015
keywords:
functional genomics
dateModified:
11-08-2016
creators:
Claudia Pommerenke
availability:
available
types:
gene expression
name:
Homo sapiens
name:
optimization design
accessURL: https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-4527/E-MTAB-4527.raw.1.zip
storedIn:
ArrayExpress
qualifier:
gzip compressed
format:
TXT
accessType:
download
authentication:
none
authorization:
none
accessURL: https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-4527/E-MTAB-4527.processed.1.zip
storedIn:
ArrayExpress
qualifier:
gzip compressed
format:
TXT
accessType:
download
authentication:
none
authorization:
none
abbreviation:
EBI
homePage: http://www.ebi.ac.uk/
ID:
SCR:004727
name:
European Bioinformatics Institute
homePage: https://www.ebi.ac.uk/arrayexpress/
ID:
SCR:002964
name:
ArrayExpress