Monday, April 16, 2007

Clustering Music Clips

Data from

http://www.public.iastate.edu/~dicook/stat503/music-plusnew-sub-full.csv

1 Description [From Di Cook's website]
This data was collected by Dr Cook from her own CDs. Using a Mac, she read each track into the music editing software Amadeus II, then snipped and saved the first 40 seconds as a WAV file. (WAV is an audio format developed by Microsoft, commonly used on Windows, though it is becoming less popular.) These files were read into R using the package tuneR, which converts the audio into numeric data. All of the CDs contained left and right channels, and variables were calculated on both channels. The resulting data has 62 rows (cases) and 7 columns (variables).

• LVar, LAve, LMax: variance, average, and maximum of the frequencies of the left channel.
• LFEner: an indicator of the amplitude or loudness of the sound.
• LFreq: median of the locations of the 15 highest peaks in the periodogram.

There are 11 tracks by Abba, 11 by the Beatles and 10 by the Eels, which would be considered Rock, and 13 tracks by Vivaldi, 6 by Mozart and 8 by Beethoven, considered Classical. There are also 3 tracks by Enya, considered New Wave. The main question we want to answer is:

Can we group the tracks into a small number of clusters according to their similarity on audio characteristics?
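
One way to approach this question is k-means clustering on standardized variables. The following is only a minimal sketch, not the method used in the course: the toy numbers are made up for illustration and are not taken from the music data, and the function names and the farthest-point choice of starting centers are my own assumptions, picked so that the result is deterministic.

```python
import math

def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def farthest_point_init(rows, k):
    """Start from the first row, then repeatedly add the row that is
    farthest from its nearest already-chosen center (an assumed heuristic)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(far))
    return centers

def kmeans(rows, k, n_iter=100):
    """Lloyd's algorithm: return one cluster label per row."""
    centers = farthest_point_init(rows, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(row, centers[j]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Hypothetical two-variable toy data, NOT values from the music file.
toy = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
       [8.0, 50.0], [8.3, 52.0], [7.9, 49.0]]
labels = kmeans(standardize(toy), k=2)
print(labels)
```

Standardizing first matters here because the variables are on very different scales, and an unscaled Euclidean distance would be dominated by whichever variable has the largest variance.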

This information might be used to arrange tracks on a digital music player. Other questions of interest might be:
• Do the rock tracks have different characteristics than classical tracks?
• How does Enya compare to rock and classical tracks?
• Are there differences between the tracks of different artists?
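
Once cluster labels are available, questions like these can be explored by cross-tabulating the clusters against the known type or artist of each track. The sketch below assumes a hypothetical helper `crosstab`, and the labels in it are invented for illustration; they are not results from the music data.

```python
from collections import Counter

def crosstab(clusters, groups):
    """Count how many tracks of each known group fall in each cluster."""
    return Counter(zip(clusters, groups))

# Hypothetical cluster labels and types for six tracks, for illustration only.
clusters = [0, 0, 1, 1, 1, 0]
types = ["Rock", "Rock", "Classical", "Classical", "New Wave", "Rock"]

table = crosstab(clusters, types)
for (cluster, group), n in sorted(table.items()):
    print(f"cluster {cluster}  {group:<10} {n}")
```

If the clusters line up with the known types, most of the counts for each type will sit in a single cluster; a type spread evenly over the clusters suggests the audio variables do not separate it.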