
Metadata

Name
Hindustani Music Rhythm Dataset
Repository
ZENODO
Identifier
doi:10.5281/zenodo.1264742
Description
The CompMusic Hindustani Rhythm Dataset is a rhythm-annotated test corpus for automatic rhythm analysis tasks in Hindustani music. The collection consists of audio excerpts from the CompMusic Hindustani research corpus, manually annotated time-aligned markers indicating the progression through the taal cycle, and the associated taal-related metadata. A brief description of the dataset is provided below.

For a brief overview and audio examples of taals in Hindustani music, please see

http://compmusic.upf.edu/examples-taal-hindustani

THE DATASET

Audio music content

The pieces are chosen from the CompMusic Hindustani music collection. They are in four popular taals of Hindustani music, which together encompass a majority of Hindustani khyal music. The selection includes a mix of vocal and instrumental recordings and of new and old recordings, and spans three lays (tempo classes). For each taal, there are pieces in drut (fast), madhya (medium) and vilambit (slow) lay. All pieces have tabla as the percussion accompaniment. The excerpts are two minutes long. Each piece is uniquely identified using the MBID (MusicBrainz identifier) of the recording. The pieces are stereo, 160 kbps MP3 files sampled at 44.1 kHz. The audio is also available as WAV files for experiments.

Annotations

There are several annotations that accompany each excerpt in the dataset.

Sam, vibhaag and the matras: The primary annotations are audio-synchronized time-stamps indicating the different metrical positions in the taal cycle. The sam and matras of the cycle are annotated. The annotations were created using Sonic Visualiser by tapping to the music and manually correcting the taps. Each annotation has a time-stamp and an associated numeric label that indicates the position of the beat marker in the taal cycle. The annotations and the associated metadata have been verified for correctness and completeness by a professional Hindustani musician and musicologist. The long thick lines show vibhaag boundaries. The numerals indicate the matra number in the cycle. In each case, the sam (the start of the cycle, analogous to the downbeat) is indicated using the numeral 1.
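The annotation structure described above (a time-stamp plus a numeric matra label, with the sam labelled 1) can be illustrated with a minimal Python sketch. The comma-separated "time,label" layout, the function names, and the sample values are assumptions made for illustration only; the actual annotation files in the dataset may be laid out differently, and the toy four-matra cycle does not represent any particular taal.

```python
import csv
import io


def parse_annotations(text):
    """Parse rows of 'time,label' into (seconds, matra_number) tuples.

    The two-column comma-separated layout is an assumption, not the
    documented file format of the dataset.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((float(record[0]), int(record[1])))
    return rows


def sam_times(annotations):
    """Return the time-stamps labelled 1, i.e. the starts of cycles."""
    return [t for t, label in annotations if label == 1]


def cycle_durations(annotations):
    """Return the duration in seconds between consecutive sams."""
    sams = sam_times(annotations)
    return [later - earlier for earlier, later in zip(sams, sams[1:])]


# Made-up sample data: a toy cycle of four matras, half a second apart.
example = """0.50,1
1.00,2
1.50,3
2.00,4
2.50,1
3.00,2
"""

annotations = parse_annotations(example)
print(sam_times(annotations))
print(cycle_durations(annotations))
```

Keeping the sam extraction separate from parsing means the same helper can be reused if the real files turn out to need a different reader.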

Taal-related metadata: For each excerpt, the taal and the lay of the piece are recorded. Each excerpt can be uniquely identified and located with the MBID of the recording, and the relative start and end times of the excerpt within the whole recording. A separate 5-digit taal-based unique ID is also provided for each excerpt as a double check. The artist, release, the lead instrument, and the raag of the piece are additional editorial metadata obtained from the release. There are optional comments on audio quality and annotation specifics.

Data subsets

The dataset consists of excerpts with a wide tempo range from 10 MPM (matras per minute) to 370 MPM. To study any effects of the tempo class, the full dataset (HMDf) is also divided into two subsets: the long cycle subset (HMDl), consisting of vilambit (slow) pieces with a median tempo between 10 and 60 MPM, and the short cycle subset (HMDs), consisting of pieces in madhya lay (medium, 60-150 MPM) and drut lay (fast, 150+ MPM).
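As a rough illustration of the tempo ranges above, the following Python sketch estimates a median tempo in MPM from matra time-stamps and maps it to a lay and a subset. The function names are hypothetical, and the handling of the boundary values (exactly 60 or 150 MPM) is an assumption, since the description does not say which class they belong to. In the dataset itself the lay of each piece is recorded in the metadata, so this is not how the subsets were actually assigned.

```python
from statistics import median


def median_tempo_mpm(matra_times):
    """Estimate tempo in matras per minute from matra time-stamps (seconds)."""
    intervals = [later - earlier
                 for earlier, later in zip(matra_times, matra_times[1:])]
    if not intervals:
        raise ValueError("need at least two matra time-stamps")
    return 60.0 / median(intervals)


def lay_class(mpm):
    """Map a tempo in MPM to a lay, using the ranges quoted above.

    Assumption: 60 MPM counts as madhya and 150 MPM counts as drut.
    """
    if mpm < 60:
        return "vilambit"
    if mpm < 150:
        return "madhya"
    return "drut"


def subset(mpm):
    """Return the subset name: HMDl for vilambit pieces, HMDs otherwise."""
    return "HMDl" if lay_class(mpm) == "vilambit" else "HMDs"


# Made-up example: matras half a second apart correspond to 120 MPM.
tempo = median_tempo_mpm([0.0, 0.5, 1.0, 1.5])
print(tempo, lay_class(tempo), subset(tempo))
```

Using the median of the inter-matra intervals rather than the mean keeps the estimate stable when a few annotated intervals are unusually long or short.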

Possible uses of the dataset

Possible tasks where the dataset can be used include taal, sam and beat tracking, tempo estimation and tracking, taal recognition, rhythm-based segmentation of musical audio, audio to score/lyrics alignment, and rhythmic pattern discovery.

Dataset organization

The dataset consists of audio, annotations, an accompanying spreadsheet providing additional metadata, a MAT-file that contains the same information as the spreadsheet, and a dataset description document.

Using this dataset

Please cite the following publication if you use the dataset in your work:


Ajay Srinivasamurthy, Andre Holzapfel, Ali Taylan Cemgil, Xavier Serra, "A generalized Bayesian model for tracking long metrical cycles in acoustic music signals", in Proc. of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016


http://hdl.handle.net/10230/32090

We are interested in knowing if you find our datasets useful! If you use our dataset, please email us at mtg-info@upf.edu and tell us about your research.

Contact

If you have any questions or comments about the dataset, please feel free to write to us.

Ajay Srinivasamurthy
Music Technology Group
Universitat Pompeu Fabra,
Barcelona, Spain
ajays.murthy@upf.edu

Kaustuv Kanti Ganguli
DAP lab, Dept. of Electrical Engineering,
Indian Institute of Technology Bombay
Mumbai, India
kaustuvkanti@ee.iitb.ac.in


http://compmusic.upf.edu/hindustani-rhythm-dataset
Data or Study Types
multiple
Source Organization
Unknown
Access Conditions
available
Year
2018
Access Hyperlink
https://doi.org/10.5281/zenodo.1264742

Distributions

  • Encoding Format: HTML ; URL: https://doi.org/10.5281/zenodo.1264742