
Metadata

Name
MIR-1K dataset
Repository
ZENODO
Identifier
doi:10.5281/zenodo.3532216
Description
MIR-1K Dataset

Multimedia Information Retrieval lab, 1000 song clips, dataset for singing voice separation

Work by Chao-Ling Hsu and Prof. Jyh-Shing Roger Jang

The MIR-1K dataset is designed for research on singing voice separation. MIR-1K contains:

  • 1000 song clips, in which the music accompaniment and the singing voice are recorded in the left and right channels, respectively.
  • Manual annotations, including pitch contours in semitones (see the sketch after this list), indices and types of unvoiced frames, lyrics, and vocal/non-vocal segments.
  • Speech recordings of the lyrics, read by the same person who sang the songs.
  • The undivided songs of MIR-1K, which are now also available for download.
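
The pitch contours are given in semitones. Below is a minimal sketch of one way to convert a semitone value to Hz, assuming the values follow the MIDI note-number convention (note 69 = A4 = 440 Hz) and that 0 marks an unvoiced frame; neither convention is stated in this record.

    # Sketch only: semitone-to-Hz conversion under the assumed MIDI convention.
    def semitone_to_hz(semitone: float) -> float:
        """Return the frequency in Hz, or 0.0 for an unvoiced (zero) label."""
        if semitone <= 0:
            return 0.0
        return 440.0 * 2.0 ** ((semitone - 69.0) / 12.0)

    print(semitone_to_hz(60))  # middle C under this convention, about 261.63 Hz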

Each song clip is named in the form "SingerId_SongId_ClipId". The duration of each clip ranges from 4 to 13 seconds, and the total length of the dataset is 133 minutes. The clips are extracted from 110 karaoke songs, each containing a mixture track and a music accompaniment track. These songs were freely selected from 5000 Chinese pop songs and sung by our labmates, 8 females and 11 males. Most of the singers are amateurs without professional music training.
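
A minimal sketch of reading one clip and separating the two channels follows, assuming the clips are 16-bit stereo WAV files named as above; the example path, the WAV layout, and the 16-bit depth are assumptions not stated in this record (the official download is a .rar archive).

    # Sketch only: split an (assumed) 16-bit stereo WAV clip into its two channels.
    import wave

    import numpy as np

    def load_clip(path):
        """Return (accompaniment, vocals, sample_rate) for one MIR-1K clip."""
        with wave.open(path, "rb") as wf:
            sr = wf.getframerate()
            n_channels = wf.getnchannels()
            raw = wf.readframes(wf.getnframes())
        samples = np.frombuffer(raw, dtype=np.int16).reshape(-1, n_channels)
        return samples[:, 0], samples[:, 1], sr  # left: accompaniment, right: vocals

    path = "MIR-1K/Wavfile/abjones_1_01.wav"  # hypothetical example path
    singer_id, song_id, clip_id = path.rsplit("/", 1)[-1][:-len(".wav")].split("_")
    accompaniment, vocals, sr = load_clip(path)
    print(singer_id, song_id, clip_id, sr, len(vocals) / sr, "seconds")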

Labels for the unvoiced sounds

In MIR-1K, all frames of each clip are manually labeled as one of five sound classes:

  • unvoiced stop
  • unvoiced fricative and affricate
  • /h/
  • inhaling sound
  • others (including voiced sound and music accompaniment)

The frame length and shift are 40 ms and 20 ms, respectively.
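
A minimal sketch of the frame timing these labels imply, using the 40 ms length and 20 ms shift stated above; how the final partial frame is handled is not described in this record, so it is left out here.

    # Sketch only: start/end times of successive frames (40 ms length, 20 ms shift).
    FRAME_LEN = 0.040    # frame length in seconds
    FRAME_SHIFT = 0.020  # frame shift in seconds

    def frame_times(n_frames):
        """Return (start, end) times in seconds for n_frames consecutive frames."""
        return [(i * FRAME_SHIFT, i * FRAME_SHIFT + FRAME_LEN) for i in range(n_frames)]

    # For a 4-second clip (the shortest in the dataset), 199 full frames fit.
    n_frames = int((4.0 - FRAME_LEN) / FRAME_SHIFT) + 1
    for start, end in frame_times(n_frames)[:3]:
        print(f"{start:.2f}-{end:.2f} s")  # 0.00-0.04 s, 0.02-0.06 s, 0.04-0.08 s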

Sound demos for the unvoiced singing voice separation


Download MIR-1K dataset

http://mirlab.org/dataset/public/MIR-1K.rar


Download MIR-1K dataset for MIREX

http://mirlab.org/dataset/public/MIR-1K_for_MIREX.rar

Relevant publications

[1] Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, and Ke Hu, "A Tandem Algorithm for Singing Pitch Extraction and Voice Separation from Music Accompaniment," IEEE Transactions on Audio, Speech, and Language Processing, 2011 (accepted).

[2] Chao-Ling Hsu and Jyh-Shing Roger Jang, "On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 310-319, 2010.

[3] Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, "A Trend Estimation Algorithm for Singing Pitch Detection in Musical Recordings," IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, Mar. 2011.

[4] Chao-Ling Hsu, Liang-Yu Chen, Jyh-Shing Roger Jang, and Hsing-Ji Li, "Singing Pitch Extraction From Monaural Polyphonic Songs By Contextual Audio Modeling and Singing Harmonic Enhancement," International Society for Music Information Retrieval Conference, Kobe, Japan, Oct. 2009.

[5] Chao-Ling Hsu and Jyh-Shing Roger Jang, "Singing Pitch Extraction by Voice Vibrato/Tremolo Estimation and Instrument Partial Deletion," International Society for Music Information Retrieval Conference, Utrecht, Netherlands, Aug. 2010.
Data or Study Types
multiple
Source Organization
Unknown
Access Conditions
available
Year
2019
Access Hyperlink
https://doi.org/10.5281/zenodo.3532216

Distributions

  • Encoding Format: HTML; URL: https://doi.org/10.5281/zenodo.3532216