Friday, August 23, 2013

Week 10: Making filters apply consistently

With the previous way of passing filter settings from DlgSelector to SelectorLibraryTableModel, a filter update and a select() were executed for every change, whether user-initiated or programmatic. Since the default behavior is to load filter settings programmatically when the seed track changes, and each control triggered its own update, this produced a number of redundant calls to select().

To fix this, I changed the API for SelectorLibraryTableModel: each filter now has a corresponding "SelectorLibraryTableModel::setXfilter" function that changes the appropriate variable but does not trigger the filter update. SelectorLibraryTableModel::applyFilters must therefore be called explicitly once all changes have been made. The "DlgSelector::filterByX" functions each call applyFilters, since they are meant to be connected to the appropriate signals from the UI, whereas loadStoredFilterSettings calls it only once, after all the filters have been changed.
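The set-then-apply pattern described above can be sketched in plain C++ as follows. This is only an illustration, not the actual Mixxx code: FilterModelSketch, the three setter names, the clause strings, and the call counter are all hypothetical stand-ins, and the real model would run a database select() where the sketch increments a counter.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for SelectorLibraryTableModel; only the
// "setters do not query, applyFilters does" split comes from the post.
class FilterModelSketch {
  public:
    // Setters only record the requested state.
    void setGenreFilter(bool on) { m_genre = on; }
    void setBpmFilter(bool on) { m_bpm = on; }
    void setKeyFilter(bool on) { m_key = on; }

    // Rebuilds the filter clause and performs one (simulated) query.
    void applyFilters() {
        m_clause.clear();
        addTerm(m_genre, "genre_match");
        addTerm(m_bpm, "bpm_match");
        addTerm(m_key, "key_match");
        ++m_selectCalls;  // stands in for the real select()
    }

    int selectCalls() const { return m_selectCalls; }
    const std::string& clause() const { return m_clause; }

  private:
    void addTerm(bool enabled, const char* term) {
        if (!enabled) return;
        if (!m_clause.empty()) m_clause += " AND ";
        m_clause += term;
    }

    bool m_genre = false;
    bool m_bpm = false;
    bool m_key = false;
    int m_selectCalls = 0;
    std::string m_clause;
};

// UI-style slot: one user action, one query.
void filterByGenre(FilterModelSketch& model, bool on) {
    model.setGenreFilter(on);
    model.applyFilters();
}

// Programmatic load: several changes, still one query.
void loadStoredFilterSettings(FilterModelSketch& model,
                              bool genre, bool bpm, bool key) {
    model.setGenreFilter(genre);
    model.setBpmFilter(bpm);
    model.setKeyFilter(key);
    model.applyFilters();
}

// Usage check: restoring three stored settings costs a single query.
void demoSingleQuery() {
    FilterModelSketch model;
    loadStoredFilterSettings(model, true, false, true);
    assert(model.selectCalls() == 1);
    assert(model.clause() == "genre_match AND key_match");
}
```

The trade-off of this design is that callers must remember to call applyFilters themselves; in exchange, a batch of programmatic changes no longer costs one query per control.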

This change fixes the issue where, after the first track was played, filters were not applied unless they were manually unchecked and rechecked. It also cuts down on unnecessary database queries.

There is still an issue wherein, once a track has played, it always appears in the filtered set whether or not it matches the filters. Since I drop the temporary table (that stores the preview and score columns) whenever a new seed track is set, I'm not sure how or why the old track is being kept; this requires more investigation.

Friday, August 9, 2013

Week 8: Isolating similarity calculations

This week, I placed the similarity calculations in a separate class so that they could more easily be called from outside the SelectorLibraryTableModel. Initially, this change seemed to cause an unusually large performance hit, but I eventually realized that, due to merging with master, the tracks were being dirtied in TrackDAO::getTrackFromDb and thus each one was being re-saved upon access. Resolving this caused the average time for calculating similarity against 1458 tracks to drop back from 2800 ms to about 1100 ms.

Further improvement should still be possible, and in my search for places to optimize I tried using callgrind, but it was not particularly revealing (the biggest bottleneck seemed to be the database queries). I will try with the Google profiling tools to see if I can uncover anything else.

Right now, adding a new similarity function requires changes in the following places:

  • dlgprefselector.ui: add a new slider
  • dlgprefselector.cpp: hook up the slider signals/slots, and add a function to show the description
  • library/selector/selector_preferences.h: add a preference key for said slider
  • library/selector/selectorsimilarity.cpp: implement the actual comparison in the foreach loop of calculateSimilarities

I'll continue looking for ways to speed up the similarity calculation so that it is more responsive, and see if it might be worthwhile to further abstract out comparison functions from SelectorSimilarity to facilitate the addition of new ones.

I also plan to add a function to SelectorSimilarity that fetches the top (or top N) matching tracks according to the current preferences; this could then be fed into the Auto-DJ queue or a MIDI mapping with ease. This requires essentially the same filters as are used in SelectorLibraryTableModel, so now is a good time to think about cleaning up the generation of queries for key/BPM filter matching so that the same functions can be used in both classes.
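A "top N" lookup of this kind could look roughly like the sketch below, assuming the similarity scores have already been computed for the filtered pool. ScoredTrack and topMatches are hypothetical names made up for illustration; the planned SelectorSimilarity function may well end up with a different shape.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pairing of a library row id with its similarity score.
struct ScoredTrack {
    int trackId;
    double score;  // similarity to the seed track, expected in [0, 1]
};

// Returns up to n tracks with the highest scores, best first.
// The pool is taken by value so the caller's list is left untouched.
std::vector<ScoredTrack> topMatches(std::vector<ScoredTrack> pool,
                                    std::size_t n) {
    for (const ScoredTrack& t : pool) {
        assert(t.score >= 0.0 && t.score <= 1.0);
    }
    n = std::min(n, pool.size());
    const auto middle = pool.begin() + static_cast<std::ptrdiff_t>(n);
    // Only the first n positions need to be ordered.
    std::partial_sort(pool.begin(), middle, pool.end(),
                      [](const ScoredTrack& a, const ScoredTrack& b) {
                          return a.score > b.score;
                      });
    pool.erase(middle, pool.end());
    return pool;
}
```

Using std::partial_sort avoids ordering the whole pool when only a handful of follow-ups are wanted, which matters more as the library grows.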

Friday, August 2, 2013

Week 7: Midterm overview

This post is an extended version of my e-mail to mixxx-devel.

As part of the Google Summer of Code, I have been working on a branch to facilitate the automated selection of follow-up tracks. Since the project has reached the halfway point, I thought it would be good to get some feedback on the direction I've taken and to find out how this feature could be most useful.

The track suggestions are available through a "Selector" item on the library sidebar. This view is based on Keith's earlier work, allowing one to filter based on matching key, genre, or BPM; it also adds another layer, predicated on the calculation of several similarity functions. At present, these functions compare the timbre, beat spectra, and Last.fm/MusicBrainz tags of the tracks, assigning a score from 0 (no match) to 1 (exact match); these are more expensive than the key/BPM match, and so are only calculated on demand.

One can specify the "seed track" (basis for comparison) in three ways: dragging a track to the Selector icon, right-clicking on a track and selecting the "Get follow-up track" context menu item, or just playing a track. The filters (checkboxes across the top) are all-or-nothing, allowing you to quickly pare down the library to a set of possible matches in key/BPM; then, the similarity functions (accessed by clicking "Calculate similarity") estimate the degree of closeness of those potential followups. Eventually I intend to have the similarity scores calculated automatically, perhaps when the pool of potential follow-ups is reduced below a critical value such as 100 tracks.

The "Selector" preference pane allows you to adjust the weight of the different similarity functions. I have found that the Last.fm tags are not very complete or helpful, so I have generally used 50% timbre and 50% beat spectrum as the weights.
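One plausible way to combine the per-function scores with these weights is a normalized weighted average, sketched below. This is an assumption for illustration: the post does not say exactly how the branch combines the scores, and SimilarityScores, SimilarityWeights, and combinedScore are hypothetical names.

```cpp
#include <cassert>

// Hypothetical per-function scores for one candidate, each in [0, 1].
struct SimilarityScores {
    double timbre;
    double beatSpectrum;
    double tags;
};

// Hypothetical weights, as they might be read from the preference sliders.
struct SimilarityWeights {
    double timbre;
    double beatSpectrum;
    double tags;
};

// Weighted average of the individual scores. Dividing by the total
// weight keeps the result in [0, 1] even if the sliders do not sum to 1.
double combinedScore(const SimilarityScores& s, const SimilarityWeights& w) {
    assert(w.timbre >= 0.0 && w.beatSpectrum >= 0.0 && w.tags >= 0.0);
    const double total = w.timbre + w.beatSpectrum + w.tags;
    if (total <= 0.0) {
        return 0.0;  // no function enabled, so nothing to score
    }
    return (w.timbre * s.timbre +
            w.beatSpectrum * s.beatSpectrum +
            w.tags * s.tags) / total;
}
```

With the 50% timbre / 50% beat spectrum setting, a candidate scoring 0.8 on timbre and 0.4 on beat spectrum would come out at about 0.6 under this scheme, regardless of its tag score.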

The similarity calculations currently achieve 81.7% accuracy on a "genre classification" task, compared with 82.8% in the automated playlist generator Mirage. However, I'm not sure that this test is a good assessment of how appropriate the suggested follow-ups are. Thus I would welcome any subjective assessments of how reasonable you find the top-scoring tracks to be, and of whether adjusting the weights helps to improve matters.

High priorities for me are to establish an API for new similarity functions (basically, they must provide a scoreTrack function that takes two TrackPointers as input and gives a score from 0 to 1) and to abstract the scoring functions out of library/selector/SelectorLibraryTableModel.cpp so that they can be used to add to the Auto-DJ queue or elsewhere in Mixxx.
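The API described above might be sketched as an abstract base class like the one below. Everything beyond the scoreTrack name and its "two TrackPointers in, score from 0 to 1 out" contract is assumed for illustration: the Track struct and the std::shared_ptr alias are stand-ins for the real Mixxx types, and BpmCloseness is a toy comparison, not one of the functions in the branch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <memory>

// Hypothetical, minimal track type; the real track object carries far more.
struct Track {
    double bpm;
};

// Stand-in for Mixxx's TrackPointer, here just a shared pointer.
using TrackPointer = std::shared_ptr<const Track>;

// Interface every similarity function would implement.
class SimilarityFunction {
  public:
    virtual ~SimilarityFunction() = default;

    // Returns a score from 0 (no match) to 1 (exact match).
    virtual double scoreTrack(const TrackPointer& seed,
                              const TrackPointer& candidate) const = 0;
};

// Toy implementation: the closer the tempos, the higher the score.
class BpmCloseness : public SimilarityFunction {
  public:
    double scoreTrack(const TrackPointer& seed,
                      const TrackPointer& candidate) const override {
        assert(seed && candidate);
        if (seed->bpm <= 0.0) {
            return 0.0;  // no usable tempo on the seed track
        }
        const double relativeGap =
                std::abs(seed->bpm - candidate->bpm) / seed->bpm;
        // Clamp so that very distant tempos score 0 rather than going negative.
        return std::max(0.0, std::min(1.0, 1.0 - relativeGap));
    }
};
```

With an interface along these lines, the Selector view, the Auto-DJ queue, or any other caller could hold a list of SimilarityFunction objects and score candidates without knowing which comparisons are in use.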

The branch is up at <https://github.com/chrisjr/mixxx/tree/features_key_selector>. Note that the database schema will be updated if you run it, so please don't use this on your primary library without a backup!

Comments on any aspect of this effort, including the UI, types of similarity function included, and the code itself, are all welcome.