Wednesday, March 01, 2006

Something Random

My sister was complaining about her mp3 player the other day. She didn't like the randomness of her random function, as it seems to play a few songs too many times and a few songs not at all. I have the same mp3 player, and I think I notice it, too. I think the problem is a couple of things: the size of its playlist buffer and the Gaussian normalization of the distribution. It seems like if you have a list of 100 things and one is picked randomly, each thing has a 1 in 100 chance of being picked. So why does such a simple concept result in an mp3 player that can't pick songs correctly?

Of course it'd be a simple thing for a player to keep track of the last few songs played and not to play them, but it may also depend on the operating system's method of keeping track of files as to how big the list can be. Assuming it can't keep track indefinitely, at some point the buffer is going to overflow and there's a chance a song you've already heard's going to be played. It may be song you've heard before a lot, too. In fact, perhaps the size of the buffer is fixed, so you don't want it to be too big anyway because maybe some person doesn't put that many songs on their mp3 player and it'd just screw up the random function anyway? Or maybe with that in mind, the size of the buffer is set as a certain fraction of the number of songs in the overall list?

Now, if you put it on the "play randomly, but only once" mode, this may be entirely different. If I were to implement it, it'd probably work by flagging each file as it's played and picking one that's not flagged, randomly, for each new song. In short, it wouldn't really depend on a playlist buffer at all, and so, isn't limited by that as far as randomness is concerned. There's a term for this in probability class, and even an equation. But I forget.

It's been a long time since I've taken any statistics classes, but one thing I vaguely remember is convolution. Actually, 'vaguely' isn't the right word. 'Poorly' is a better one. One interesting/pertinent thing about convolution, though, is that given any type of distribution (such as our evenly distributed 1 in 100 one above), if you convolve (the verb form of convolution) it with itself enough, it becomes a Guassian distribution: normalized. What this means is that there are going to be some on one end that are played few or no times, a lot toward the middle that are played an average number of times, and some toward the other end that are played a whole lot. A playlist buffer might counteract it initially, but eventually, it becomes useless as far as desired randomness is concerned. I believe this concept is "randomness with reseeding", which forms the basis for the normalization of any distribution convolved with itself enough.

Sometime before my "hard core" statistics class, I took a physics class on optics. The professor showed us this one demonstration where photons were separated using grates, each grate having an equal chance of being the one the photon went through. The professor said that in the original experiment, one photon, passing through the grate, would strike some photon-sensitive backing opposite the grate, making a mark to show where it landed. So with that in mind, what did we, as the classs, think the resulting backing would look like over time as more and more photons hit it?

As it turns out, there was a big clump in the center, and corresponding but smaller clumps to either side of the big one, and so on, further out, in a line. The professor said it was interesting, but didn't really elaborate on why it looked that way. It wasn't until I took my statistics class that I put two and two together. Well, that is to say, I'm pretty sure that's why it looked like that. Just like I'm pretty sure that that's why mp3 player "random repeat" functions can sometimes suck.

2 comments:

Deals On Wheels said...

I can totally relate to this. Last week I had to go on a business trip to Austin for a conference. One of my coworkers came with me. I plugged in my mp3, and we proceeded to listen to whatever playlist I had selected (more mellow music for the sake of my coworker). On this specific playlist I had a couple of songs by JOURNEY. Just two or three in a playlist of over 200. Chances are that I won't hear the same song twice during a four hour long road trip, right? Wrong. Those JOURNEY songs just kept resurfacing (this was particularly frustrating because my coworker apparently isn't the biggest JOURNEY fan). By the end of the trip it had turned into somewhat of a joke, "OMW! It is JOURNEY again! Can you believe it?!"

Actually, I guess it was kind of funny, but I totally get your point. Random can't be random if it turns into a joke...

gabe said...

JOURNEY!

I guess maybe that's why people are so big into have 20-30 gig mp3 players. Given a sufficient pool to choose from, it just becomes less of an issue.