Message Boards Message Boards

Music recommendation system with Wolfram Language

Motivation

While sipping my morning energy drink today, I was browsing my old archived backups and I came across some MP3 files from a 5 to 10 years ago, a time when music streaming services were nascent and offline music was the norm. I browsed through the archive and listened to some of my favorites from that time. After sometime, I realized that when I played in Autoplay mode, the music queue was very randomized. 10 years ago this was fine but now I've got used to listening similar tracks unless I manually change. So I thought:

Hey! let's design a music recommendation system, and let's use Mathematica to do it fast!

MP3 Metadata

First, I use exiftool to extract all the metadata of the music files and exported them to a CSV file:

exiftool - csv ~/music_archive/ > metadata . csv

csv = Import["~/metadata.csv"];
Dimensions[csv]
headers = csv[[1]]

The CSV file has 249 rows and 75 columns. The first row contains the field headers and the rest contain the actual metadata. There are 75 fields. But we don't need all. For this task, I have picked the following headers as these would be sufficient for my small music collection:

  • Album
  • Artist
  • Genre
  • Title

I thought of using the Composer and Year fields as well, but most rows did not have an entry for these fields. Next, I extract the position of these fields and extract the data with respect to these fields

fields = Position[headers, #] & /@ {"Album", "Artist", "Genre", "Title"} // Flatten
data = csv[[2 ;;, fields]];

At this point, I also extract the titles for use later: titles = data[[All, 4]];

Encoders

Now that I have the data, I needed a way to represent them numerically in order to further proceed. One simple way is to encode all discrete values as "One-Hot vectors". This encoding ensures that there are no inherent bias or ordinal relationships between the numerically encoded discrete values. But before encoding, I needed to extract all the unique values for each fields.

Albums

albums = data[[All, 1]] // DeleteDuplicates // Sort;
albums // Length

In my music collection, there 144 unique albums. This means that my encoder for album would produce a vector of length 144. Let's create that:

albumsEncoder = NetEncoder[{"Class", albums, "IndicatorVector"}];

Similarly, I create encoders for Artists and Genres.

Artists

artists = StringDelete[#, " - MassTamilan.org"] & /@ StringSplit[#, {"/", ", "}] & /@ 
    data[[All, 2]] // Flatten // DeleteDuplicates // Sort;
artists // Length

There are 186 unique artists in my music collection ... Interesting ...

artistsEncoder = NetEncoder[{"Class", artists, "IndicatorVector"}];

Note that there is StringSplit function used. This is because a single track may have multiple artists.

Genre

genre = DeleteMissing[(ToLowerCase /@ (StringSplit[#, "/"] & /@ 
          data[[All, 3]] // Flatten) // DeleteDuplicates) /. {"" -> Missing[]}] // Sort; genre // Length

And finally, there are 50 genres in my collection. Here, in addition to the use of StringSplit, I've also used DeleteMissing in order to deal with some instances that did not have a genre entry.

genreEncoder = NetEncoder[{"Class", genre, "IndicatorVector"}];

Feature Encoding

Now that I've created all the encoders, I'll take the data for each field and encode them.

Album

albumsData = data[[All, 1]];
albumsVector = albumsEncoder[albumsData];

Artist

In case of Artists, a single track may have multiple artists. To deal with this, I encoded each artist for a given track and then sum the two encoded vectors. This is done with the helper function encodeArtists: artistsData = StringSplit[#, {"/", ", "}] & /@ data[[All, 2]] encodeArtists[artists_] := Plus @@ artistsEncoder[artists] artistsVector = encodeArtists[#] & /@ artistsData;

Genre

Similar to artists, multiple genres for a given track are dealt by summing the encoded vectors of its individual genres. However, an additional challenge here is to deal with the missing values. For this, since the encoding function returns 0 for this, I just convert them to list and pad more zeros to make their length consistent. genreData = ToLowerCase[ StringSplit[#, "/"] & /@ data[[All, 3]]]; encodeGenre[genreList_] := PadRight[Flatten@{Plus @@ genreEncoder[genreList]}, Length[genre]] genreVector = encodeGenre[#] & /@ genreData;

Feature aggregation and embeddings

Now that the fields have been converted to feature vectors, they can be flattened for each instance: features = Flatten /@ Transpose[{albumsVector, artistsVector, genreVector}]; features // Dimensions The result is a {248, 300} array which means 248 examples and each example has 380 features, which isn't very bad for further computation but can be challenging to interpret. So I'm going to reduce this to just 3 dimensions because then it wold be very convenient to visualize and draw interpretations. For dimensionality reduction, I have used "t-SNE" because it preserves the local relationships between data points i.e. it preserves clusters to certain degree.

features3D = DimensionReduce[features, 3, Method -> "TSNE"];
ListPointPlot3D[features3D]

Visualization of metadata features in 3 Dimensions

The visualization is very interesting because it indicates that there are several clusters. The obvious interpretation is that points in the same cluster have high degree of similarity.

Distance computation... and the Recommendations

Given a track, the goal is to identify tracks that are similar to it i.e. given the feature vector of a track, we want to identify n other feature vectors that are closest to it. For simplicity, let's assume here n=5. There are several distance metrics for vectors but I found the cosine distance and the Euclidean distance to give the best results. To begin with the experiment, I'm going to pick a random feature vector and see what song it corresponds to.

In[83]:= sample = RandomChoice[features3D];
Position[features3D, sample]

Out[84]= {{209}}

In[87]:= titles[[209]]

Out[87]= "I'm an Albatraoz"

So the random choice picked here is "I'm an Albatraoz" by Aron Chupa and Lil Sis Nora. This song is very popular and can be classified as a party/pop/rock type.

CosineDistance

Now I compute the cosine distance between the sample and all the other vectors, and then pick 5 which have the smallest distance. In case of Cosine distance, the value ranges from 0 to 2, 0 meaning the vectors are along the same direction and close and 2 meaning the vectors are pointing diametrically opposite.

In[99]:= cosineDistances = CosineDistance[sample, #] & /@ features3D;
top5Cosine = Sort[cosineDistances][[2 ;; 6]];
top5CosineIndex = Flatten[Position[cosineDistances, #] & /@ top5Cosine]; 
titles[[#]] & /@ top5CosineIndex

Out[102]= {"Little Swing", "Trombone", "Rave in the Grave", \
"Tequila", "Kinare"}

The cosine distance works wonderfully. It has picked up 4 songs out of 5 which belong to the same artist and are of same party/pop/rock type. The last song is a Bollywood rock song. Thus this distance metric works wonderfully for my small dataset.

EuclideanDistance

Let' s try and see if Euclidean Distance also gives similar or better results.

In[107]:= 
euclideanDistances = EuclideanDistance[sample, #] & /@ features3D; 
top5Euclidean = euclideanDistances // Sort;
top5EuclideanIndex = Flatten[Position[euclideanDistances, #] & /@ top5Euclidean][[2 ;; 6]];
titles[[#]] & /@ top5EuclideanIndex

Out[110]= {"Trombone", "Little Swing", "Tequila", "Rave in the Grave", "Llama In My Living Room"}

The Euclidean distance works even better. It has picked up all the 5 songs which are by the same artist.

Conclusion

Now that I've come up with a neat model that can compute similar tracks, I will write a script that will generate m3u files. One important thing to note is that when the computed distances are sorted, the instances that occur latter may not necessarily be similar to the initial sample and most certainly not similar to its neighboring instances. Therefore there needs to be a recalibration done after, say, every 5 or 10 tracks in order to obtain tracks that are similar to one that index. I also plan to extend this script such that it can generate playlist from input values combinations such as artist name, genre instead of a song. But that's a project for another weekend.

enter image description here -- you have earned Featured Contributor Badge enter image description here Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and Your Profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract