Thus, a new approach is required. Exaile does this by querying artists similar to the current one, then selecting tracks at random from these other artists. This is a simple but promising approach, and one that can be adapted for Mixxx.
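As a rough illustration of the idea (not Exaile's or Mixxx's actual code), here is a minimal Python sketch. The function name `pick_next_track` and the two dictionaries are hypothetical: `similar_artists` stands in for whatever a similar-artist lookup returns, and `library` maps each artist to the tracks available locally.

```python
import random


def pick_next_track(current_artist, similar_artists, library, rng=random):
    """Pick a random track by any artist listed as similar to current_artist.

    similar_artists: dict of artist name -> list of similar artist names
                     (hypothetical stand-in for a similar-artist query).
    library:         dict of artist name -> list of track titles on disk.
    Returns None when no similar artist has tracks in the library.
    """
    candidates = [
        track
        for artist in similar_artists.get(current_artist, [])
        for track in library.get(artist, [])
    ]
    if not candidates:
        return None
    return rng.choice(candidates)


if __name__ == "__main__":
    library = {"Artist A": ["Track 1", "Track 2"], "Artist B": ["Track 3"]}
    similar_artists = {"Artist A": ["Artist B", "Artist C"]}
    print(pick_next_track("Artist A", similar_artists, library))
```

Returning `None` makes the empty case explicit: if none of the similar artists are in the local library, there is nothing to pick from and the caller needs a fallback.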
For now, I've changed the Last.fm fetcher to obtain similar artists as "tags," so songs can be compared on that basis. With my tiny sample library this does not work well, since none of the similar artists are also in my library, but that may just be a fluke of the small sample. I'll try to incorporate more of my library to see whether it can be made to work.
As for testing the timbral similarity quantitatively, the test using the ISMIR '04 data is described on p. 55 of Dominik Schnitzer's thesis about Mirage. As he states:
"1. All pieces in a music collection are assigned an adequate genre label [ed note: these are given as a CSV file in the ISMIR data].I spent some time looking at the Google Test framework, but I'm not sure if this process can be completely automated using it. At the very least I can write up clear instructions/a shell script to do the test and automate at least part of it in the TimbreUtils class.
2. The files in the collection are analyzed and the full similarity matrix is computed for all pieces in the collection.
3. Then a genre classification confusion matrix is computed so that each song is assigned the genre of its most similar song.
4. In the confusion matrix the predicted genre of a song is plotted against its actual membership.
5. The confusion matrix diagonal shows the classification accuracies for each genre, which is the number of correctly classified songs divided by the total number of songs in the class."
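The five quoted steps can be sketched in a few lines of Python. This is only an illustration under assumptions, not Mirage's or Mixxx's code: the function names are made up, `genres` is assumed to hold one label per song (for example, read from the ISMIR CSV), and `distance` is assumed to be a precomputed square matrix in which a smaller value means two songs are more similar.

```python
from collections import Counter, defaultdict


def nearest_neighbor_confusion(genres, distance):
    """Build confusion[actual][predicted] using each song's nearest neighbor.

    genres:   list of genre labels, one per song (assumed input format).
    distance: square list-of-lists; distance[i][j] is the dissimilarity
              between songs i and j (smaller means more similar).
    """
    n = len(genres)
    confusion = defaultdict(Counter)
    for i in range(n):
        # Most similar *other* song; the song itself is excluded.
        nearest = min((j for j in range(n) if j != i),
                      key=lambda j: distance[i][j])
        confusion[genres[i]][genres[nearest]] += 1
    return confusion


def per_genre_accuracy(confusion):
    """Diagonal of the confusion matrix divided by each genre's song count."""
    return {actual: row[actual] / sum(row.values())
            for actual, row in confusion.items()}


if __name__ == "__main__":
    genres = ["rock", "rock", "jazz", "jazz"]
    distance = [
        [0, 1, 5, 6],
        [1, 0, 4, 2],
        [5, 4, 0, 3],
        [6, 2, 3, 0],
    ]
    confusion = nearest_neighbor_confusion(genres, distance)
    print(per_genre_accuracy(confusion))  # {'rock': 1.0, 'jazz': 0.5}
```

In the toy data the last jazz song sits closest to a rock song, so jazz scores 0.5 while rock scores 1.0. The expensive part of the real test is computing the similarity matrix itself, which this sketch takes as given.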
Next week is the midterm, so I'll want to hammer out the rest of this artist similarity function and write out instructions for the selector UI so that others can give it a spin.