Friday, August 2, 2013

Week 7: Midterm overview

This post is an extended version of my e-mail to mixxx-devel.

As part of the Google Summer of Code, I have been working on a branch to facilitate the automated selection of follow-up tracks. Since the project has reached the halfway point, I thought it would be good to get some feedback on the direction I've taken and to find out how this feature could be most useful.

The track suggestions are available through a "Selector" item in the library sidebar. This view builds on Keith's earlier work, which lets you filter by matching key, genre, or BPM, and adds another layer based on several similarity functions. At present, these functions compare the timbre, beat spectra, and Last.fm/MusicBrainz tags of the tracks, assigning a score from 0 (no match) to 1 (exact match). They are more expensive than the key/BPM match, so they are only calculated on demand.

You can specify the "seed track" (the basis for comparison) in three ways: dragging a track to the Selector icon, right-clicking on a track and selecting the "Get follow-up track" context menu item, or just playing a track. The filters (checkboxes across the top) are all-or-nothing, allowing you to quickly pare down the library to a set of possible matches in key/BPM. The similarity functions (accessed by clicking "Calculate similarity") then estimate how close each of those potential follow-ups is to the seed track. Eventually I intend to have the similarity scores calculated automatically, perhaps when the pool of potential follow-ups drops below a critical value such as 100 tracks.
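To make the two-stage flow concrete, here is a minimal standalone sketch of "filter first, score second". It is not the code in the branch: `LibraryTrack`, `Suggestion`, `filterCandidates`, `rankCandidates`, and the `bpmTolerance` parameter are all hypothetical names made up for illustration, and the similarity function is passed in as a plain callback.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for library rows; not Mixxx's real types.
struct LibraryTrack {
    std::string title;
    std::string key;
    double bpm;
};

struct Suggestion {
    std::string title;
    double score;  // 0 (no match) to 1 (exact match)
};

// Stage 1: cheap all-or-nothing filters (the checkboxes).
// bpmTolerance is an assumption; the post does not say how exact the match is.
std::vector<LibraryTrack> filterCandidates(const LibraryTrack& seed,
                                           const std::vector<LibraryTrack>& library,
                                           bool matchKey, bool matchBpm,
                                           double bpmTolerance) {
    std::vector<LibraryTrack> pool;
    for (const LibraryTrack& t : library) {
        if (matchKey && t.key != seed.key) continue;
        if (matchBpm && std::fabs(t.bpm - seed.bpm) > bpmTolerance) continue;
        pool.push_back(t);
    }
    return pool;
}

// Stage 2: expensive similarity scoring, run only on the filtered pool,
// then sorted so the best-scoring follow-ups come first.
std::vector<Suggestion> rankCandidates(
        const LibraryTrack& seed,
        const std::vector<LibraryTrack>& pool,
        const std::function<double(const LibraryTrack&, const LibraryTrack&)>& similarity) {
    std::vector<Suggestion> ranked;
    for (const LibraryTrack& t : pool) {
        ranked.push_back(Suggestion{t.title, similarity(seed, t)});
    }
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const Suggestion& a, const Suggestion& b) {
                         return a.score > b.score;
                     });
    return ranked;
}
```

With this split, automatic scoring would amount to calling `rankCandidates` whenever the pool returned by `filterCandidates` is smaller than some limit, such as the 100 tracks mentioned above.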

The "Selector" preference pane allows you to adjust the weight of the different similarity functions. I have found that the Last.fm tags are not very complete or helpful, so I have generally used 50% timbre and 50% beat spectrum as the weights.

The similarity calculations currently achieve 81.7% accuracy on a "genre classification" task, compared with 82.8% for the automated playlist generator Mirage. However, I'm not sure that this test is a good assessment of how appropriate the suggested follow-ups are. I would therefore welcome any subjective assessments of how reasonable you find the top-scoring tracks to be, and of whether adjusting the weights improves matters.

My high priorities are twofold. The first is to establish an API for new similarity functions: basically, they must provide a scoreTrack function that takes two TrackPointers as input and gives a score from 0 to 1. The second is to abstract the scoring functions out of library/selector/SelectorLibraryTableModel.cpp so that they can be used to add to the Auto-DJ queue or elsewhere in Mixxx.

The branch is up at <https://github.com/chrisjr/mixxx/tree/features_key_selector>. Note that running it will update the database schema, so please don't use it on your primary library without a backup!

Comments on any aspect of this effort, including the UI, the types of similarity function included, and the code itself, are welcome.
