biomedical and healthCAre Data Discovery Index Ecosystem
Title: Bioportal Snapshot 30.03.2017
dateReleased:
03-31-2017
privacy:
information not available
aggregation:
instance of dataset
dateCreated:
03-31-2017
refinement:
raw
ID:
doi:10.5281/ZENODO.439510
creators:
Matentzoglu, Nicolas
Parsia, Bijan
availability:
available
types:
other
description:
The snapshot was created using the [BioPortal REST API](http://data.bioontology.org/documentation). It was produced for research purposes only.

The download process was performed as follows:

- For all available and indexed ontologies (see meta1490889630984.csv for a view of the whole repository), the latest version was determined using the submission id. There were 512 such entries available via the web service.
- An attempt was made to download the referenced file using the download URL. This file was not always available (downloads were attempted for 498 entries).
- If the file turned out to be a ZIP archive, it was unpacked before processing. Only archives containing a single ontology were considered by the downloader.
- If a file was downloaded or unzipped, it was copied to a new directory and given the extension orig. The files in this directory are all byte-equivalent to the files downloaded, which may be of interest to some (total: 438 ontologies).
- An attempt was then made to parse every file in the orig directory with the OWL API. If this was successful, the whole imports closure was merged and serialised into OWL/XML with the OWL API (4.2.8). These files are the recommended ones to study for most researchers, and can be found in the owlxml directory/archive.

No repairing of OWL 2 DL profile violations was attempted for this snapshot, other than the default workings of the OWL API when serialising to OWL/XML. However, for this report, we gathered the metrics applying a single fix: when the ontology had a non-absolute version IRI ([OntologyVersionIRINotAbsolute](https://github.com/owlcs/argo/issues/3)), we created one. This was done so that the profile counts are more meaningful. Note that a number of ontologies merely suffer from undeclared entities, which is easily remedied.

The dataset contains 422 ontologies, of which 73 are OWL Full (many of them with only minor violations), 168 are Pure DL, i.e. falling under OWL 2 DL but not under any of the profiles, and 181 fall under one of the profiles (EL only: 47, EL+QL: 45, EL+QL+RL: 67, EL+RL: 4, QL only: 6, RL only: 8, RL+QL: 4). Note that the "orig" archive (directory) contains more ontologies than the owlxml one, because some of them were downloadable but not parsable. This might be useful for researchers interested in exploring the reasons for parsing failures.

A more in-depth characterisation, with plots of the size distributions, detailed breakdowns of the profile violations and a list of all ontologies, can be found here: http://rpubs.com/matentzn/bioportal2017_03_30.

Files in this dataset:

- bioportal2017.03.30.csv: Metadata about the ontologies in the set, such as axiom counts and profile membership. Non-absolute version IRI exceptions were fixed prior to measurement.
- bioportal2017.03.30_norepair.csv: Metadata about the ontologies in the set, such as axiom counts and profile membership. No automated repairs.
- 03_30_2017_05_54_25experiment.log: Print log with details on the errors and exceptions during the snapshot creation.
- meta1490889630984.csv: Dump of the entire BioPortal ontology list, including versions and so on.
- original.zip: The archive contains all files that were downloadable, in their original state.
- owlxml.zip: The archive contains the downloaded files that could be parsed, with imports merged and serialised to OWL/XML.

For citation, we recommend:

- This dataset directly
- Noy et al. (2009) for BioPortal (https://www.ncbi.nlm.nih.gov/pubmed/19483092)
- Matentzoglu et al. (2013) for reference to the Manchester OWL repository and our continuous efforts to survey the state of OWL and ontologies (https://link.springer.com/chapter/10.1007/978-3-642-41335-3_21)
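The unpack-and-copy step of the download process described above can be sketched in Python as follows. This is an illustration only and not the code used to create the snapshot: the function name `store_original`, its parameters, and the `<acronym>.orig` naming are assumptions made for the example, and it treats "a single ontology" as "exactly one file in the archive".

```python
import shutil
import zipfile
from pathlib import Path
from typing import Optional


def store_original(downloaded: Path, orig_dir: Path, acronym: str) -> Optional[Path]:
    """Copy a downloaded file, or the single member of a ZIP, to orig_dir/<acronym>.orig.

    Returns the path written, or None when a ZIP archive does not hold
    exactly one file (such archives are skipped). All names are hypothetical.
    """
    orig_dir.mkdir(parents=True, exist_ok=True)
    target = orig_dir / f"{acronym}.orig"

    if zipfile.is_zipfile(downloaded):
        with zipfile.ZipFile(downloaded) as archive:
            # Ignore directory entries; count only real files.
            members = [m for m in archive.infolist() if not m.is_dir()]
            if len(members) != 1:
                return None  # only single-file archives are considered
            # Stream the member's bytes unchanged into the target file.
            with archive.open(members[0]) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
    else:
        # Plain file: a straight byte-for-byte copy.
        shutil.copyfile(downloaded, target)
    return target
```

A call such as `store_original(Path("downloads/GO"), Path("orig"), "GO")` would then leave a byte-identical copy at `orig/GO.orig`; the paths and acronym here are placeholders.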
accessURL: https://doi.org/10.5281/ZENODO.439510
storedIn:
Zenodo
qualifier:
not compressed
format:
HTML
accessType:
landing page
authentication:
none
authorization:
none
abbreviation:
ZENODO
homePage: https://zenodo.org/
ID:
SCR:004129
name:
ZENODO
