Music Information Retrieval


Music is ubiquitous in today's world, and almost everyone enjoys listening to it. With the rise of streaming platforms, the amount of available music has increased substantially. While users may seem to benefit from this abundance, it has also become harder for them to explore new music and find songs they like. Personalized access to music libraries and music recommender systems aim to help users discover and retrieve music they enjoy.

To this end, the field of Music Information Retrieval (MIR) strives to make music accessible to all by advancing retrieval applications such as music recommender systems, content-based search, the generation of personalized playlists, and user interfaces that allow users to visually explore music collections. This includes gathering machine-readable musical data, extracting meaningful features, developing data representations based on these features, and devising methodologies to process and understand that data. Retrieval approaches leverage these representations for indexing music and providing search and retrieval services.

In our research, we develop methods for analyzing users' music consumption behavior, investigate deep learning-based feature extraction methods for music content analysis, predict the potential success and popularity of songs, and distill sets of features that capture users' music preferences for retrieval tasks.


Public Datasets

In our research and publications, we use a variety of datasets that we have curated ourselves. We are happy to share the following datasets:

  • #nowplaying is a dataset that leverages Twitter to create a diverse and constantly updated record of users' music listening behavior. Users frequently post on Twitter which music they are currently listening to. From such tweets, we extract track and artist information and further metadata. You can find the dataset on Zenodo (CC BY 4.0).
  • The #nowplaying-RS dataset features context and content features of listening events. It contains 11.6 million music listening events of 139K users and 346K tracks collected from Twitter. The dataset comes with a rich set of item content features and user context features, as well as timestamps of the listening events. Moreover, some of the user context features imply the cultural origin of the users, and others, such as hashtags, give clues to the emotional state of a user underlying a listening event. You can find the dataset on Zenodo (CC BY 4.0).
  • The Spotify playlists dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In essence, the dataset holds users, their playlists, and the tracks contained in these playlists. You can find the dataset on Zenodo (CC BY 4.0).
  • The Hit Song Prediction dataset features high- and low-level audio descriptors of the songs contained in the Million Song Dataset (extracted via Essentia) for content-based hit song prediction tasks. You can find the dataset on Zenodo (CC BY 4.0).
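
To give a rough idea of what extracting track and artist information from a #nowplaying tweet can look like, here is a minimal Python sketch. It is not the pipeline used to build the datasets above. The function name `parse_nowplaying` and the two tweet layouts it recognizes ("Artist - Track" and "Track by Artist") are assumptions made purely for illustration; real tweets are far noisier and would typically need matching against a music database.

```python
import re
from typing import Optional, Tuple

# Patterns for tweet parts that carry no artist/track text.
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def parse_nowplaying(tweet: str) -> Optional[Tuple[str, str]]:
    """Return (artist, track) for a #nowplaying tweet, or None.

    Hypothetical helper: only two assumed layouts are handled,
    "Artist - Track" and "Track by Artist".
    """
    if "#nowplaying" not in tweet.lower():
        return None

    # Drop URLs and hashtags, then collapse repeated whitespace.
    text = URL.sub(" ", tweet)
    text = HASHTAG.sub(" ", text)
    text = " ".join(text.split())

    if " - " in text:
        artist, track = text.split(" - ", 1)
    elif " by " in text:
        # Split on the last " by " so track titles containing "by" survive.
        track, artist = text.rsplit(" by ", 1)
    else:
        return None

    artist, track = artist.strip(), track.strip()
    if not artist or not track:
        return None
    return artist, track


if __name__ == "__main__":
    print(parse_nowplaying("#nowplaying Daft Punk - Get Lucky https://t.co/abc"))
    print(parse_nowplaying("Get Lucky by Daft Punk #NowPlaying #music"))
```

Tweets that do not contain the hashtag, or that match neither layout, yield `None` and would simply be skipped in this sketch.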


Photo by henry perks on Unsplash. 





Eva Zangerle, Michael Tschuggnall, Stefan Wurzinger and Günther Specht: ALF-200k: Towards Extensive Multimodal Analyses of Music Tracks and Playlists. In Advances in Information Retrieval - 39th European Conference on IR Research (ECIR 2018), pages 584-590. Springer, 2018.


Christian Esswein, Markus Schedl and Eva Zangerle: geMsearch: Personalized Explorative Music Search. In Joint Proceedings of the ACM IUI 2018 Workshops co-located with the 23rd ACM Conference on Intelligent User Interfaces (ACM IUI 2018), 2018.


Martin Pichl: Multi-Context-Aware Recommender Systems: A Study on Music Recommendation. PhD thesis, University of Innsbruck, Department of Computer Science, 2018.



Eva Zangerle, Michael Tschuggnall, Stefan Wurzinger and Günther Specht: Analyzing Coherent Characteristics in Music Playlists. In Proceedings of the 4th Digital Humanities Austria Conference (dha 2017), Innsbruck, Austria, 2017.


Martin Pichl, Eva Zangerle, Günther Specht and Markus Schedl: Mining Culture-Specific Music Listening Behavior from Social Media Data. In Proceedings of the IEEE International Symposium on Multimedia (ISM 2017), Taichung, Taiwan, December 11-13, 2017, pages 208-215. IEEE Computer Society, 2017.


Benjamin Murauer, Maximilian Mayerl, Michael Tschuggnall, Eva Zangerle, Martin Pichl and Günther Specht: Hierarchical Multilabel Classification and Voting for Genre Classification. In CEUR Working Notes Proceedings of the MediaEval 2017 Workshop, 2017.


Martin Pichl, Eva Zangerle and Günther Specht: Improving Context-Aware Music Recommender Systems: Beyond the Pre-filtering Approach. In Proceedings of the 2017 ACM International Conference on Multimedia Retrieval (ICMR 2017), pages 201-208. ACM, 2017.


Martin Pichl, Eva Zangerle and Günther Specht: Understanding User-curated Playlists on Spotify: A Machine Learning Approach. In International Journal of Multimedia Data Engineering and Management (IJMDEM), vol. 8, no. 4, 2017.



Martin Pichl, Eva Zangerle and Günther Specht: Understanding Playlist Creation on Music Streaming Platforms. In Proceedings of the IEEE Symposium on Multimedia (ISM), pages 475-480. IEEE, 2016.


Eva Zangerle, Martin Pichl, Benedikt Hupfauf and Günther Specht: Can Microblogs Predict Music Charts? An Analysis of the Relationship Between #Nowplaying Tweets and Music Charts. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York City, United States, August 7-11, 2016, pages 365-371.



Martin Pichl, Eva Zangerle and Günther Specht: Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name? In Proceedings of the 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pages 1360-1365. IEEE, 2015.


Martin Pichl, Eva Zangerle and Günther Specht: #nowplaying on #Spotify: Leveraging Spotify Information on Twitter for Artist Recommendations. In Current Trends in Web Engineering, 15th International Conference, ICWE 2015 Workshops (Revised Selected Papers), pages 163-174. Springer, 2015.



Martin Pichl, Eva Zangerle and Günther Specht: Combining Spotify and Twitter Data for Generating a Recent and Public Dataset for Music Recommendation. In Proceedings of the 26th Workshop Grundlagen von Datenbanken (GvDB 2014), Ritten, Italy, vol. 1313, pages 35-40, Oct. 2014.


Eva Zangerle, Martin Pichl, Wolfgang Gassler and Günther Specht: #nowplaying Music Dataset: Extracting Listening Behavior from Twitter. In Proceedings of the 1st ACM International Workshop on Internet-Scale Multimedia Management (WISMM '14), pages 21-26. ACM, June 2014.



Eva Zangerle, Wolfgang Gassler and Günther Specht: Exploiting Twitter's Collective Knowledge for Music Recommendations. In Proceedings of the 2nd Workshop on Making Sense of Microposts (#MSM2012): Big things come in small packages, Lyon, France, 16 April 2012 (in connection with the 21st International Conference on World Wide Web), pages 14-17, 2012.