biomedical and healthCAre Data Discovery Index Ecosystem
Title: Lexical Relations From The Wisdom Of The Crowd 1.0
dateReleased:
04-15-2017
privacy:
information not available
aggregation:
instance of dataset
dateCreated:
02-18-2017
refinement:
raw
ID:
doi:10.5281/ZENODO.291991
creators:
Ustalov, Dmitry
availability:
available
types:
other
description:
The 300 most frequent nouns were extracted from the Russian National Corpus. Each method or resource, including RuThes, then produced at most five hypernyms where possible; missing answers were treated as empty results. This yielded 9 322 unique non-empty subsumption pairs, which were submitted for crowdsourced annotation on the Yandex.Toloka microtask platform. Each pair was annotated by seven different annotators who were native speakers of Russian and at least 20 years old as of February 1, 2017. The human intelligence task (HIT) was designed to elicit a direct answer to a simple question: does the given pair of words represent a meaningful is-a relation? Because crowd workers are not expert lexicographers and might find this question difficult, it was rephrased as “Is it correct that a kitten is a kind of mammal?” (in Russian). The answers were aggregated using the proprietary Yandex.Toloka answer aggregation mechanism. As a result, 3 940 of the 9 322 pairs were annotated as positive and the remaining 5 382 as negative. Interestingly, the workers were more confident in their negative answers than in their positive ones. These negative answers are extremely useful for both training and testing relation extraction methods. To the best of our knowledge, this is the first dataset of this kind created for the Russian language using microtask-based crowdsourcing.
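The description states that seven annotators answered a yes/no question for each pair and that the answers were then aggregated. The Yandex.Toloka aggregation mechanism is proprietary and is not described here, so the sketch below substitutes a plain majority vote purely for illustration. The example pairs, the votes, and the function name `aggregate_majority` are hypothetical and are not taken from the dataset. The last lines only check the reported counts against each other.

```python
from collections import Counter


def aggregate_majority(votes):
    """Label a pair positive when 'yes' votes outnumber 'no' votes.

    Assumption: simple majority voting, used here as a stand-in for the
    proprietary Yandex.Toloka aggregation, which may work differently.
    """
    counts = Counter(votes)
    return counts[True] > counts[False]


# Hypothetical annotations: (hyponym, hypernym) -> seven yes/no answers.
annotations = {
    ("kitten", "mammal"): [True, True, True, False, True, True, True],
    ("kitten", "furniture"): [False, False, False, False, True, False, False],
}

labels = {pair: aggregate_majority(votes) for pair, votes in annotations.items()}

# Consistency check on the counts reported in the description.
total_pairs = 9322
positive_pairs = 3940
negative_pairs = total_pairs - positive_pairs
positive_share = positive_pairs / total_pairs

print(labels)
print(negative_pairs, round(positive_share, 3))
```

With seven annotators per pair a tie between "yes" and "no" cannot occur, so a strict comparison is enough for this sketch. The reported figures are consistent: 3 940 positive plus 5 382 negative pairs gives 9 322, so about 42% of the pairs were judged positive.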
accessURL: https://doi.org/10.5281/ZENODO.291991
storedIn:
Zenodo
qualifier:
not compressed
format:
HTML
accessType:
landing page
authentication:
none
authorization:
none
abbreviation:
ZENODO
homePage: https://zenodo.org/
ID:
SCR:004129
name:
ZENODO
