biomedical and healthCAre Data Discovery Index Ecosystem
Title: Effects of sample size on differential gene expression, rank order and prediction accuracy of a gene signature
dateReleased:
05-15-2014
description:
Gene expression profiles were generated from muscle biopsies from 134 individuals, and sex-based differences in expression were explored. Top differentially expressed gene lists are often inconsistent between studies, and it has been suggested that small sample sizes contribute to the lack of reproducibility and to poor prediction accuracy in discriminative models. We considered sex differences (69♂, 65♀) in 134 human skeletal muscle biopsies using DNA microarrays. The full dataset and subsamples of it, from n=10 (5♂, 5♀) to n=120 (60♂, 60♀), were used to assess the effect of sample size on the differential expression of single genes, gene rank order and prediction accuracy. Using the full dataset (n=134), we identified 717 differentially expressed transcripts (p-value < 0.0001; false discovery rate < 0.006) and were able to predict sex with 92% accuracy, both within our dataset and on external datasets. Both the p-values and the rank order of the top differentially expressed genes became more variable with smaller subsamples. For example, at n=10 (5♂, 5♀), no gene was considered differentially expressed at p < 0.0001 and prediction accuracy was ~50% (no better than chance). We found that sample size clearly affects microarray analysis results: small sample sizes result in unstable gene lists and poor prediction accuracy. We anticipate that this will apply to other phenotypes in addition to sex. RNA was isolated from 134 muscle samples, and gene expression was compared between males and females.
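The balanced-subsampling idea in the description can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the expression values are synthetic, the choice of Welch's t statistic for ranking is an assumption (the record does not name the test used), and the helper names `balanced_subsample`, `welch_t` and `top_genes` are hypothetical. Only the group sizes (69 males, 65 females) and the idea of drawing equal numbers per sex come from the description.

```python
import random
import statistics


def balanced_subsample(labels, n_per_group, rng):
    """Return sample indices with n_per_group males and n_per_group females."""
    males = [i for i, s in enumerate(labels) if s == "M"]
    females = [i for i, s in enumerate(labels) if s == "F"]
    return rng.sample(males, n_per_group) + rng.sample(females, n_per_group)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed ranking score)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def top_genes(expr, labels, idx, k):
    """Rank genes by |t| within the subsample and return the top-k gene indices."""
    m = [i for i in idx if labels[i] == "M"]
    f = [i for i in idx if labels[i] == "F"]
    scores = []
    for g, row in enumerate(expr):
        t = welch_t([row[i] for i in m], [row[i] for i in f])
        scores.append((abs(t), g))
    scores.sort(reverse=True)
    return {g for _, g in scores[:k]}


# Synthetic stand-in data (NOT the E-GEOD-41726 measurements):
# 200 genes, the first 20 shifted by one standard deviation in males.
rng = random.Random(0)
labels = ["M"] * 69 + ["F"] * 65
n_genes, n_shifted = 200, 20
expr = []
for g in range(n_genes):
    shift = 1.0 if g < n_shifted else 0.0
    expr.append([rng.gauss(shift if s == "M" else 0.0, 1.0) for s in labels])

# Draw two independent balanced subsamples per size and report how many
# genes the two top-20 lists share (a rough measure of list stability).
for n_per_group in (5, 20, 60):
    a = top_genes(expr, labels, balanced_subsample(labels, n_per_group, rng), k=20)
    b = top_genes(expr, labels, balanced_subsample(labels, n_per_group, rng), k=20)
    print(f"n={2 * n_per_group}: top-20 overlap = {len(a & b)}")
```

The overlap between the two top-20 lists is one simple way to express "unstable gene lists". The description reports variability in p-values and rank order, which would need a proper test distribution and many repeated draws rather than the two draws used here.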
privacy:
not applicable
aggregation:
instance of dataset
ID:
E-GEOD-41726
refinement:
raw
alternateIdentifiers:
41726
keywords:
functional genomics
dateModified:
06-02-2014
availability:
available
types:
gene expression
name:
Homo sapiens
ID:
A-AGIL-28
name:
Agilent Whole Human Genome Microarray 4x44K 014850 G4112F (85 cols x 532 rows)
accessURL: https://www.ebi.ac.uk/arrayexpress/files/E-GEOD-41726/E-GEOD-41726.raw.1.zip
storedIn:
ArrayExpress
qualifier:
gzip compressed
format:
TXT
accessType:
download
authentication:
none
authorization:
none
accessURL: https://www.ebi.ac.uk/arrayexpress/files/E-GEOD-41726/E-GEOD-41726.processed.1.zip
storedIn:
ArrayExpress
qualifier:
gzip compressed
format:
TXT
accessType:
download
authentication:
none
authorization:
none
accessURL: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE41726
storedIn:
Gene Expression Omnibus
qualifier:
not compressed
format:
HTML
accessType:
landing page
primary:
true
authentication:
none
authorization:
none
abbreviation:
EBI
homePage: http://www.ebi.ac.uk/
ID:
SCR:004727
name:
European Bioinformatics Institute
homePage: https://www.ebi.ac.uk/arrayexpress/
ID:
SCR:002964
name:
ArrayExpress