Repositories

Listed below are all the repositories currently in scope for DataMed. They have been curated from NIH-supported Scientific Data Repositories and other sources. Not all of the in-scope repositories listed below have been indexed as of December 2023, but they should be soon. We welcome your feedback! To suggest new repositories for inclusion, please contact us.

1 through 25 of 145

Archived Clinical Research Datasets: The data repository houses studies and trials funded by the NINDS Division of Clinical Research (DCR) in neurological areas such as stroke, Parkinson’s disease, migraine, multiple sclerosis, and other neurological disorders. Data requests and approvals are managed by NINDS. Imaging, biosamples, and other types of data are not supported by this repository. Contact your Program Official to determine if this repository is a good fit for your application.
Archive of Data on Disability to Enable Policy (ADDEP): ADDEP provides access to data including a wide range of topics related to disability. ADDEP data can be used to better understand and inform the implementation of the Americans with Disabilities Act and other disability policies.
AD Data Initiative Repository: The AD Data Initiative Repository enables sharing of human or human-derived Alzheimer’s and related dementia data. Key features of the platform: it is available to data contributors at no cost; it is secure, with clearly outlined privacy policies and terms of use; and it allows configurable data access and data request forms, with flexible data sharing options and visibility into data usage via an audit trail.
AD Knowledge Portal: The AD Knowledge Portal is an NIH-designated repository and the distribution site for multi-omic data from human samples and model systems. The Portal hosts raw data, analysis results, analytical methodology, and research tools generated through Alzheimer's disease and related dementia programs supported by the National Institute on Aging. Data is available to qualified investigators as open or controlled access depending on data source and summarization level.
AgingResearchBiobank: The AgingResearchBiobank was officially announced in January 2019 with a mission to provide a state-of-the-art inventory system for the storage, maintenance, and distribution of biospecimens and associated data from numerous NIA-funded longitudinal studies on aging and clinical trials to the broader scientific community; foster compliance with NIH/NIA resource sharing policies; and foster scientific advances to ultimately help extend the healthy, active years of life for the world’s fast-growing population of older adults. Since October 2023, the AgingResearchBiobank has also hosted and distributed imaging data available for some of its study collections.
Alliance of Genome Resources (AGR): The Alliance of Genome Resources (AGR) is a consortium of several model organism databases (MODs) and the Gene Ontology (GO) Consortium whose goal is to provide an integrated view of their data to all biologists, clinicians and other interested parties. The primary mission of the Alliance of Genome Resources (the Alliance) is to develop and maintain sustainable genome information resources that facilitate the use of diverse model organisms in understanding the genetic and genomic basis of human biology, health and disease. This understanding is fundamental for advancing genome biology research and for translating human genome data into clinical utility.
Allele Frequency Aggregator (ALFA): The NCBI Allele Frequency Aggregator (ALFA) seeks to make allele frequency datasets from dbGaP studies the largest and most complete aggregated variant datasets available as open access. The dbGaP database comprises over two million individuals, up to billions of variations, thousands of phenotypes, and molecular test datasets. This offers major opportunities to investigate genetic differences within human populations and to find genetic factors that affect health and disease in order to enhance diagnosis, treatment, and prevention.
Accelerating Medicines Partnership in Common Metabolic Diseases (AMP-CMD) Knowledge Portal: The Common Metabolic Diseases Knowledge Portal (CMDKP) aggregates and analyzes genetic association results, epigenomic annotations, and results of computational prediction methods to provide data, visualizations, and tools in an open-access portal. The aim of the CMDKP is to facilitate research on the molecular basis of complex diseases, including type 1 and type 2 diabetes, cardiovascular and cerebrovascular disease, and sleep disorders.
Accelerating Medicines Partnership® Parkinson's Disease (AMP® PD): The Accelerating Medicines Partnership – Parkinson’s Disease (AMP PD) is a public-private partnership formed for the purpose of identifying and validating diagnostic, prognostic, and progression biomarkers for Parkinson’s Disease. To accomplish this goal, AMP PD harmonized longitudinal clinical data from seven large existing cohorts and conducted whole genome sequencing, longitudinal transcriptomics, and longitudinal proteomics analyses from their existing blood, CSF, and/or brain tissue samples. AMP PD data, along with the federated, international Global Parkinson’s Genetics Program (GP2) data, exists in the Google Cloud and is accessible and analyzable on the Verily/Broad Institute’s Terra Platform.
AphasiaBank: AphasiaBank is a shared database of multimedia interactions for the study of communication in aphasia. Access to the data in AphasiaBank is password protected and restricted to members of the AphasiaBank consortium group.
BindingDB: BindingDB is a public, central, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. BindingDB is a FAIRshare recommended resource, with about 2.1M binding data for about 8,000 proteins and 920,000 small molecules, which is used worldwide for a range of activities, including drug discovery, computational chemistry, systems biology, and education.
Brain Image Library (BIL): The Brain Image Library (BIL) is a national public resource enabling researchers to deposit, analyze, mine, share and interact with large brain image datasets.
BioData Catalyst: NHLBI BioData Catalyst is a cloud-based platform providing tools, applications, and workflows in secure workspaces for heart, lung, blood, and sleep disorder research. BioData Catalyst allows researchers to find, access, share, store, and compute on large scale datasets.
Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC): BioLINCC facilitates and coordinates the activities of the NHLBI Biorepository and the Data Repository. The mission of BioLINCC is to maximize the scientific value of biospecimen and clinical data resources collected by clinical trials and observational studies supported by the Institute. BioLINCC promotes the use of and facilitates access to biospecimen and clinical data resources through a single web-based user interface.
BioPortal: BioPortal is the world’s most comprehensive, integrated knowledgebase of biomedical ontologies and controlled terminologies. BioPortal enables users to navigate, visualize, and leverage both individual ontologies and the collective knowledge suggested by its entire knowledgebase of more than 800 ontologies. It provides specialized tools that employ novel layouts and animation to offer cognitive support to users trying to understand the complexities of large ontologies within the confines of the two dimensions offered by a Web browser.
BioProject: A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project.
BioSystics Analytics Platform (BioSystics-AP): The BioSystics Analytics Platform is designed to store, analyze, and share complex multimodal datasets from in vitro 2D and 3D models, including microplate, microphysiology systems (MPS), organ-on-chip, and organoid models, as well as animal models and clinical data. It provides a streamlined workflow for designing and implementing studies and capturing data in a central location for efficient review, visualization, analysis, and computational modeling. Publicly available data and models in BioSystics-AP can be used to support development of drug discovery tools, qualification, validation, and evaluation of clinical concordance and in vitro to in vivo extrapolation.
The Biological Magnetic Resonance Data Bank (BMRB): BMRB collects, annotates, archives, and disseminates spectral and quantitative data derived from NMR spectroscopic investigations of biological macromolecules and metabolites.
Brain Observatory Storage Service & Database (BossDB): BossDB is a volumetric database for 3D and 4D neuroscience data.
Bacterial and Viral Bioinformatics Resource Center (BV-BRC): The Bacterial and Viral Bioinformatics Resource Center (BV-BRC) is one of two Bioinformatics Resource Centers (BRCs) funded by the US National Institute of Allergy and Infectious Diseases (NIAID). The Bioinformatics Resource Centers (BRCs) for Infectious Diseases program was initiated in 2004 with the main objective of providing public access to computational platforms and analysis tools that enable collecting, archiving, updating, and integrating a variety of genomics and related research data relevant to infectious diseases, and pathogens and their interaction with hosts.
Cancer Nanotechnology Laboratory (caNanoLab): caNanoLab is a data sharing portal designed to facilitate information sharing in the biomedical nanotechnology research community to expedite and validate the use of nanotechnology in biomedicine. caNanoLab provides support for the annotation of nanomaterials with characterizations resulting from physico-chemical, in vitro, and in vivo assays and the sharing of these characterizations and associated nanotechnology protocols in a secure fashion.
CCDI Childhood Cancer Data Catalog (CCDC): The Childhood Cancer Data Initiative’s (CCDI’s) CCDC is an inventory of pediatric oncology data resources. This includes childhood cancer repositories, registries, knowledgebases, and catalogs that either manage or refer to data. The contact details can be used to connect with resource owners to learn more about how to gain access to the data. The data catalog does not provide access to a resource’s data.
Cancer Data Service (CDS): The Cancer Data Service (CDS) is a data repository under the NCI's Cancer Research Data Commons (CRDC) infrastructure for storing cancer research data generated by NCI-funded programs. CDS provides secure and authorized storage and data sharing capabilities in the cloud for studies that do not have a repository specific to their data type, or that are waitlisted or not approved for storage by that repository. CDS hosts both controlled and open access data. Permission to access controlled data on CDS is obtained through the NCBI’s dbGaP system. CDS hosts data and offers analysis capabilities through the NCI Cloud Resources. Seven Bridges Cancer Genomics Cloud (Seven Bridges-CGC), one of the NCI’s Cloud Resources, can be used to search and analyze data. Seven Bridges-CGC runs on Amazon Web Services (AWS).
Chemical Effects in Biological Systems (CEBS): The CEBS database houses data of interest to environmental health scientists. CEBS is a public resource, and has received depositions of data from academic, industrial, and governmental laboratories. Data in CEBS are housed in a relational database designed to display data in the context of biology and study design, and permit data integration for cross-study analysis, knowledge generation and novel meta-analysis.
CellChat: CellChat provides (a) a Ligand-Receptor Interaction Explorer for exploring the ligand-receptor interaction database, and (b) a Cell-Cell Communication Atlas Explorer for exploring cell-cell communication in any given scRNA-seq dataset processed by the R toolkit CellChat.
