10th January 2010, 23:54   #21
Major Dude
Join Date: Mar 2008
Location: Erlangen
Posts: 859
Some experiments on beat detection

Finally got some time to play with the new MD features as of 5.57 beta. Found no problems so far, except for the nightmare syntax, which forces me to nest confusing exec blocks to buggery, where I just wanted to insert a single additional statement... anyway.

I thought that now that we can store a million variables instead of just 32, we can buffer the past few seconds of the volume and exploit it for better beat detection, using autocorrelation or a Fourier transform. The attached preset demonstrates these experiments. What it does (in the frame equations):

1. Create an average-free (zero-mean) volume variable (vol). I used bass+treb, which probably contain most of the rhythm information, and subtracted the lowpass filtered value of that sum.
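A minimal Python sketch of step 1, not the preset code itself. The one-pole lowpass and the smoothing factor alpha are assumptions for illustration; the preset may use a different filter.

```python
def make_vol_tracker(alpha=0.05):
    """Return a function that maps (bass, treb) to an average-free vol value.

    alpha is a hypothetical smoothing factor for a one-pole lowpass filter.
    """
    state = {"lowpass": None}

    def update(bass, treb):
        raw = bass + treb
        if state["lowpass"] is None:
            state["lowpass"] = raw  # start the filter at the first sample
        # One-pole lowpass: slowly follows the running average of raw.
        state["lowpass"] += alpha * (raw - state["lowpass"])
        return raw - state["lowpass"]  # the remainder fluctuates around zero

    return update


update_vol = make_vol_tracker()
steady = update_vol(1.0, 1.0)  # constant input gives vol = 0
spike = update_vol(2.0, 2.0)   # a sudden jump shows up as a positive vol
```

A smaller alpha makes the average follow more slowly, so more of the slow volume changes remain in vol.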

2. Push it into a ring buffer. I chose a buffer size of 120, sufficient to store 4 seconds of vol data (at 30 fps). On each frame, the whole buffer content is moved one place, as in a shift register, i.e.
buf(120) will be lost
buf(119) will be moved to place 120
buf(118) will be moved to place 119
The new vol value is written to buf(1). Now we have a history of the last 4 seconds of vol data. This is shown in the moving pink curve (newest vol value at the left-hand border).
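The shift described in step 2 could look like this in Python. This is only a sketch: Python lists are 0-based, so buf[0] stands for buf(1) and buf[119] for buf(120), and the function name push is made up for the example.

```python
BUF_SIZE = 120  # 4 seconds of history at 30 fps


def push(buf, vol):
    """Shift every entry one place towards the end and store vol at the front."""
    for i in range(len(buf) - 1, 0, -1):
        buf[i] = buf[i - 1]  # buf(119) -> buf(120), buf(118) -> buf(119), ...
    buf[0] = vol             # newest value goes to buf(1)


buf = [0.0] * BUF_SIZE
for frame in range(1, 4):
    push(buf, float(frame))
# buf now starts with 3.0, 2.0, 1.0 (newest first)
```

Outside of preset code, collections.deque(maxlen=120) with appendleft gives the same behaviour without copying the whole buffer on each frame.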

3. Perform an autocorrelation on this buffer. Nothing mysterious, and actually quite simple to calculate. Correlation is generally a measure of the similarity between two blocks of data. In our case, we calculate the correlations between the newest block of vol data, say the past 2 seconds, and older data blocks, in increments of one buffer place (1/30 sec):

corr(0)  = corr(data(1..60), data(1..60))
corr(1)  = corr(data(1..60), data(2..61))
corr(2)  = corr(data(1..60), data(3..62))
...
corr(59) = corr(data(1..60), data(60..119))

Of course corr(0) will yield the highest value because it correlates a block of data with itself, so we can give it a miss. As the distance between the blocks becomes larger, the correlation will decrease, except if there is a repetitive pattern in the vol data. The autocorrelation reveals this pattern. For instance, Gigi D'Agostino's "Another way" has a tempo of 129 BPM (beats per minute), i.e. one beat every 60/129 = 0.465 seconds, which at 30 fps corresponds to about 14 frames, so we may expect the first correlation maximum at corr(14).
The green curve shows the autocorrelation, starting with lag 0 at the left-hand border. The scale is 0.5 seconds per division. When playing "Another way", you'll note the first maximum at approximately 0.46 seconds as expected, the next at 2*0.46 seconds, etc. Note that the scaling is only correct if the preset runs at exactly 30 fps!
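To make step 3 concrete, here is a small Python sketch. It assumes corr is a plain sum of products over the 60-sample block (the preset may normalise differently), and it uses a synthetic pulse train instead of real vol data. The helper names are invented for the example.

```python
FPS = 30
BLOCK = 60    # newest 2 seconds of data
MAX_LAG = 60  # lags 0..59


def autocorr(buf, block=BLOCK, max_lag=MAX_LAG):
    """corr[k] = sum of buf[i] * buf[i + k] over the newest `block` samples."""
    return [sum(buf[i] * buf[i + k] for i in range(block)) for k in range(max_lag)]


def strongest_lag(corr, min_lag=2):
    """Lag with the largest correlation, skipping the lags next to zero.

    On ties the smallest lag wins, i.e. the first maximum.
    """
    return max(range(min_lag, len(corr)), key=lambda k: corr[k])


# Expected lag for 129 BPM at 30 fps: 60 / 129 * 30 = 13.95, i.e. 14 frames.
expected_lag = round(60 / 129 * FPS)

# Synthetic vol history: one pulse every 14 frames.
buf = [1.0 if i % 14 == 0 else 0.0 for i in range(120)]
lag = strongest_lag(autocorr(buf))
bpm = 60 * FPS / lag  # lag in frames -> beats per minute, valid only at FPS
```

With the pulse train above the strongest lag comes out at 14 frames, which converts back to roughly 128.6 BPM. The one-frame lag resolution is why the result is not exactly 129.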

4. The bottom two curves show the Fourier transform of:
- the autocorrelated vol data (white)
- the vol variable directly (brown)
I scaled it in BPM: One division = 50 BPM. For "Another way", the first maximum appears, as expected, around 130 BPM (again, only if run at 30 fps).
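The BPM scaling in step 4 follows from the bin spacing of the transform. The sketch below uses a naive DFT in plain Python and assumes a 120-sample input at 30 fps; which buffer length the preset actually transforms is not stated here, so n is a parameter.

```python
import cmath
import math

FPS = 30


def dft_magnitudes(samples):
    """Magnitude of each DFT bin from 0 up to the Nyquist bin (naive O(N^2))."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def bin_to_bpm(k, n, fps=FPS):
    """Bin k of an n-point transform, sampled at fps frames per second, in BPM."""
    return 60.0 * k * fps / n


# Synthetic vol history: a sine that completes 9 cycles in 120 frames.
n = 120
samples = [math.sin(2 * math.pi * 9 * i / n) for i in range(n)]
mags = dft_magnitudes(samples)
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])  # skip the DC bin
peak_bpm = bin_to_bpm(peak_bin, n)
```

With 120 samples at 30 fps the bins are 15 BPM apart (bin 8 = 120 BPM, bin 9 = 135 BPM), so under these assumptions a 129 BPM track would land between two bins rather than on a single sharp line.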

As could be expected, Fourier or autocorrelation analyses work quite well with reasonably rhythmic pieces, but fail where the rhythm is not clearly expressed in volume. This is the same in principle as for a primitive bass-based beat detection.
Additionally, the current simple implementation depends entirely on a constant frame rate, and goes wrong when the fps changes. This problem can probably be overcome by more sophisticated code, but in general I think these schemes will hardly provide any significant benefit for beat detection.
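One possible way around the frame rate dependence, purely as an untested idea and not something the preset does: store a timestamp with each vol sample and interpolate onto a fixed 30 Hz grid before analysing. A Python sketch, with all names hypothetical:

```python
def resample_uniform(times, values, rate=30.0, count=120):
    """Linearly interpolate (time, value) samples onto a uniform time grid.

    times must be increasing and hold at least two entries. The grid ends at
    the newest timestamp and steps backwards by 1/rate, so index 0 of the
    result is the newest sample, as in the history buffer.
    """
    out = []
    j = len(times) - 1
    newest = times[-1]
    for n in range(count):
        t = newest - n / rate
        # Walk back until times[j - 1] <= t <= times[j] (or we hit the start).
        while j > 1 and times[j - 1] > t:
            j -= 1
        if t <= times[0]:
            out.append(values[0])  # not enough history: hold the oldest value
            continue
        t0, t1 = times[j - 1], times[j]
        w = (t - t0) / (t1 - t0)
        out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out


# Example: samples captured at 20 fps, resampled to a 30 Hz grid.
times = [i / 20 for i in range(200)]
values = [2.0 * t for t in times]
uniform = resample_uniform(times, values)
```

The resampled buffer could then be fed to the same autocorrelation or Fourier code, and the lag and BPM scaling would no longer depend on the actual fps.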
Nitorami