Spotify Million Playlist Recommendation Project

CS 109A Final Project

Spotify is a music streaming platform with over 191 million users, making it one of the most popular music streaming services in the world. Spotify hosts over 40 million songs, which users can organize into personal playlists.

One of Spotify’s core features is a music recommendation system (MRS). As users craft playlists, Spotify provides suggestions for songs a user might add to that playlist. The ability to curate good playlist suggestions can make or break a user’s experience and therefore is a key area of investment for Spotify.

This project tackles not only the general problem of playlist generation but also one of its hardest formulations: the “cold start” problem. In the cold start setting, suggestions must be generated from only a limited number of starting tracks. We’ve defined this limited number as 1 to 20 songs, with 1 being the most difficult task and 20 approaching a more regular dataset. For more on why song recommendation is particularly thorny (and thus perhaps all the more worthwhile) compared to similar problems, refer to the context section of the literature review.

For all the nuance of this problem, we aimed to tackle it in a simple yet effective way. We wanted to avoid relying on other datasets, which might introduce fuzzy matching problems, and to avoid overly complicated hybrid methods that might overfit or provide only marginal returns on large investments of computational power. We settled on a collaborative filtering mechanism that we found produced surprisingly good accuracy at predicting a hidden test set.
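
To make the idea concrete, here is a minimal sketch of one simple form of collaborative filtering: ranking candidate tracks by how often they co-occur with the seed tracks in other playlists. It is an illustration under stated assumptions, not the model described in the later sections. The function names (build_cooccurrence, recommend), the toy playlists, and the hit-rate measure are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(playlists):
    """Count how often each pair of tracks appears in the same playlist."""
    cooc = defaultdict(Counter)
    for tracks in playlists:
        # Use a set so a track repeated within one playlist is counted once.
        for a, b in combinations(set(tracks), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(seed_tracks, cooc, n=3):
    """Rank tracks by total co-occurrence with the seeds, excluding the seeds."""
    scores = Counter()
    for seed in seed_tracks:
        for track, count in cooc.get(seed, {}).items():
            if track not in seed_tracks:
                scores[track] += count
    return [track for track, _ in scores.most_common(n)]


# Toy training playlists (made-up track IDs, for illustration only).
training_playlists = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "e"],
    ["a", "c"],
]
cooc = build_cooccurrence(training_playlists)

# Cold start with a single seed track; the remaining tracks are hidden.
seeds = ["a"]
hidden = {"b", "c"}
recs = recommend(seeds, cooc, n=2)

# Fraction of hidden tracks recovered by the recommendations.
hit_rate = len(hidden & set(recs)) / len(hidden)
print(recs, hit_rate)
```

With a single seed, the ranking depends entirely on that one track's co-occurrence counts, which is why the 1-song case is the hardest; each additional seed adds evidence to the scores.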

This website consists of six sections: results from data exploration, a literature review of prior attempts at tackling this problem, an overview of our model, a report of our results, a conclusion explaining the implications of our work and ways to improve upon it in the future, and a works cited page.