
Metadata

Name
Supplementary data for: Chromosome-scale genome assemblies of aphids reveal extensively rearranged autosomes and long-term conservation of the X chromosome
Repository
ZENODO
Identifier
doi:10.5281/zenodo.3712089
Description
Myzus persicae clone O v2 frozen release

Genome assembly: Myzus_persicae_O_v2.0.scaffolds.fa.gz

BRAKER2 gene models: Myzus_persicae_O_v2.0.scaffolds.braker2.gff3

List of gene models containing internal stop codons (removed from the protein and CDS FASTA files): Myzus_persicae_O_v2.0.scaffolds.braker2.bad_genes.lst

BRAKER2 protein sequences: Myzus_persicae_O_v2.0.scaffolds.braker2.gff3.filtered.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): Myzus_persicae_O_v2.0.scaffolds.braker2.gff3.filtered.aa.LTPG.fa

BRAKER2 coding sequences: Myzus_persicae_O_v2.0.scaffolds.braker2.gff3.filtered.cds.fa

BRAKER2 coding sequences (longest transcript per gene only): Myzus_persicae_O_v2.0.scaffolds.braker2.gff3.filtered.cds.LTPG.fa

De novo repeat library (RepeatModeler merged with Repbase Insecta): Myzus_persicae_O_v2.0_repeat_lib.repeatmodeler_merged_repbase_insecta.fa

RepeatMasker transposable element annotation using the M. persicae de novo repeat library: Myzus_persicae_O_v2.0.scaffolds.repeatmodeler_merged_repbase_insecta.repeatmasker.gff.out

RepeatMasker transposable element annotation using the M. persicae de novo repeat library (gff format): Myzus_persicae_O_v2.0.scaffolds.repeatmodeler_merged_repbase_insecta.repeatmasker.gff
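The bad_genes.lst file above lists gene models with internal stop codons. As a rough illustration of what such a check can look like (not necessarily the procedure used to build this release), the sketch below flags protein sequences that contain a stop symbol before the final residue. The `*` stop symbol, the example sequences, and the helper names `read_fasta` and `has_internal_stop` are assumptions made for this example.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the identifier.
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def has_internal_stop(protein, stop_symbol="*"):
    """Return True if a stop symbol occurs anywhere before the last residue."""
    return stop_symbol in protein[:-1]


# Hypothetical records; the identifiers only imitate BRAKER-style naming.
example = """>g1.t1
MKTAYIAK*
>g2.t1
MKT*YIAK
>g3.t1
MKTAYIAK
""".splitlines()

bad = [name for name, seq in read_fasta(example) if has_internal_stop(seq)]
print(bad)  # ['g2.t1']
```

A terminal stop symbol is tolerated here because some translation tools emit it at the end of complete proteins; only stops inside the sequence are reported.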

Acyrthosiphon pisum clone JIC1 v1 frozen release

Genome assembly: Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.fa.gz

BRAKER2 gene models: Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.gff

List of gene models containing internal stop codons (removed from the protein and CDS FASTA files): Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.bad_genes.lst

BRAKER2 protein sequences: Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.gff.filtered.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.gff.filtered.aa.LTPG.fa

BRAKER2 coding sequences: Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.gff.filtered.cds.fa

BRAKER2 coding sequences (longest transcript per gene only): Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.braker2.gff.filtered.cds.LTPG.fa

De novo repeat library (RepeatModeler merged with Repbase Insecta): Acyrthosiphon_pisum_JIC1_repeat_lib.repeatmodeler_merged_repbase_insecta.fa

RepeatMasker transposable element annotation using the A. pisum de novo repeat library: Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.repeatmodeler_merged_repbase_insecta.repeatmasker.out

RepeatMasker transposable element annotation using the A. pisum de novo repeat library (gff format): Acyrthosiphon_pisum_JIC1_v1.0.scaffolds.repeatmodeler_merged_repbase_insecta.repeatmasker.gff
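Several files in this record are reduced to the longest transcript per gene (the LTPG files). A minimal sketch of that kind of reduction follows, assuming transcript identifiers of the form `<gene>.t<N>`; that naming convention, the function name `longest_transcript_per_gene`, and the example records are assumptions for illustration and may not match the identifiers or the tie-breaking used in the released files.

```python
def longest_transcript_per_gene(records):
    """Keep one (transcript_id, sequence) pair per gene: the longest sequence.

    Assumes transcript IDs look like '<gene>.t<N>' (a convention assumed for
    this sketch). When two transcripts tie on length, the first one seen wins.
    """
    best = {}
    for transcript_id, seq in records:
        # Strip the trailing '.t<N>' suffix to recover the gene identifier.
        gene_id = transcript_id.rsplit(".t", 1)[0]
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Hypothetical records for illustration only.
records = [
    ("g1.t1", "MKTAY"),
    ("g1.t2", "MKTAYIAKQR"),
    ("g2.t1", "MSL"),
]
ltpg = longest_transcript_per_gene(records)
for gene_id, (transcript_id, seq) in ltpg.items():
    print(gene_id, transcript_id, len(seq))
```

Selecting by sequence length alone is the simplest rule; other pipelines choose by CDS length in the annotation instead, which can differ when proteins are filtered after translation.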

Rhodnius prolixus DNA zoo chromosome-scale genome assembly annotation

R. prolixus chromosome-scale genome assembly was obtained here: https://www.dnazoo.org/assemblies/Rhodnius_prolixus.

Genome assembly: Rhodnius_prolixus-3.0.3_HiC.fasta

BRAKER2 gene models: Rhodnius_prolixus-3.0.3_HiC.braker2.gff

BRAKER2 protein sequences: Rhodnius_prolixus-3.0.3_HiC.braker2.gff.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): Rhodnius_prolixus-3.0.3_HiC.braker2.gff.aa.LTPG.fa

BRAKER2 coding sequences: Rhodnius_prolixus-3.0.3_HiC.braker2.gff.cds.fa

Triatoma rubrofasciata chromosome-scale genome assembly annotation

T. rubrofasciata chromosome-scale genome assembly was obtained here: http://dx.doi.org/10.5524/100614

Genome assembly: zhuichun_assembly.fasta

BRAKER2 gene models: zhuichun_assembly.braker2.gff

BRAKER2 protein sequences: zhuichun_assembly.braker2.gff.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): zhuichun_assembly.braker2.gff.aa.LTPG.fa

BRAKER2 coding sequences: zhuichun_assembly.braker2.gff.cds.fa

Hemiptera orthogroups and species tree

OrthoFinder was used to cluster proteomes of 14 Hemiptera into orthogroups for phylogenomic analysis. All proteomes were reduced to the longest transcript per gene. See here for full details:

Species included, taxon IDs and data source:

Mcer = Myzus cerasi v1.1 (https://bipaa.genouest.org/sp/myzus_cerasi/)

MperO = Myzus persicae clone O v2 (This study)

Dnox = Diuraphis noxia Thorpe et al. gene predictions (https://bipaa.genouest.org/sp/diuraphis_noxia/)

Apis = Acyrthosiphon pisum JIC1 v1 (This study)

Pnig = Pentalonia nigronervosa (This study)

Rmai = Rhopalosiphum maidis v0.1 (http://gigadb.org/dataset/100572)

Rpad = Rhopalosiphum padi v1.0 (https://bipaa.genouest.org/sp/rhopalosiphum_padi/)

Agly = Aphis glycines biotype 4 v2.1 (https://zenodo.org/record/3453468#.XnpL5JOgLRY)

BtabMEAM1 = Bemisia tabaci MEAM1 v1.2 (http://www.whiteflygenomics.org/cgi-bin/bta/index.cgi)

Trub = Triatoma rubrofasciata (This study)

Rpro = Rhodnius prolixus (This study)

Ofas = Oncopeltus fasciatus OGS v1.0 (https://i5k.nal.usda.gov/Oncopeltus_fasciatus)

Sfuc = Sogatella furcifera v1 (http://dx.doi.org/10.5524/100255)

Nlug = Nilaparvata lugens (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0521-0#Sec42)

Files:

Proteomes included in the analysis: proteomes.tar.gz

Orthogroups: Orthogroups.txt

Gene counts per orthogroup, per species: Orthogroups.GeneCount.csv

Single-copy conserved orthogroups used for species tree: SingleCopyOrthogroups.txt

Species tree alignment: SpeciesTreeAlignment.fa

r8s configuration file (includes time calibrations and OrthoFinder ML species tree with branch lengths): species_tree_rooted.r8s.nex

r8s time-calibrated species tree: r8s_tree.nwk
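
The gene count table and the single-copy orthogroup list above are related: a single-copy orthogroup has exactly one gene in every species. The sketch below shows one way to recover such orthogroups from a count table. It assumes a tab-delimited layout with the orthogroup identifier in the first column, one column per species, and an optional `Total` column; that layout, the function name `single_copy_orthogroups`, and the example rows are assumptions to check against the actual Orthogroups.GeneCount.csv file.

```python
import csv
import io


def single_copy_orthogroups(handle, delimiter="\t"):
    """Return IDs of orthogroups with exactly one gene in every species.

    Assumes the first column holds the orthogroup ID and that any column
    named 'Total' is a row sum rather than a species.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    species_cols = [
        i for i, name in enumerate(header) if i > 0 and name != "Total"
    ]
    return [
        row[0]
        for row in reader
        if row and all(int(row[i]) == 1 for i in species_cols)
    ]


# Hypothetical table using three of the taxon IDs listed above.
table = io.StringIO(
    "Orthogroup\tApis\tMperO\tRpro\tTotal\n"
    "OG0000001\t1\t1\t1\t3\n"
    "OG0000002\t2\t1\t0\t3\n"
    "OG0000003\t1\t1\t1\t3\n"
)
single_copy = single_copy_orthogroups(table)
print(single_copy)  # ['OG0000001', 'OG0000003']
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the in-memory table.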
Data or Study Types
multiple
Source Organization
Unknown
Access Conditions
available
Year
2020
Access Hyperlink
https://doi.org/10.5281/zenodo.3712089

Distributions

  • Encoding Format: HTML ; URL: https://doi.org/10.5281/zenodo.3712089
This project was funded in part by grant U24AI117966 from the NIH National Institute of Allergy and Infectious Diseases as part of the Big Data to Knowledge program. We thank all members of the bioCADDIE community for their valuable input on the overall project.