SpazBlog: Cool proof-of-concept complete

Wednesday, April 12, 2006

Cool proof-of-concept complete

I solved the problem from my previous message.

I've just got a really neat proof-of-concept program working. Here's my problem. I like to stream large (150 to 400 MB) AVI files over HTTP. There are currently four ways to do this:

1) Use DivX Player, which can progressively download. It lets you start playing the file immediately, and downloads as it plays. The problem is that it has a bad interface, is VERY buggy, and for some reason makes the hard drive grind like mad.

2) Use VLC. It can also progressively download AVI files and start playing right away, but it treats the AVI file like a stream, and uses a buffer. The downside here is that you can't seek. It is also super buggy, because if you pause for too long it won't be able to resume. You'd think that it would let you seek by just restarting a new HTTP connection at a different point in the file, but no. Also, you don't get a copy on your drive when you're done watching.

3) Use Media Player Classic. This is the player I use. It downloads the entire file to a temp file before playing, which makes it useless for streaming.

4) Download the whole AVI file with a browser before playing.

None of these options are good! DivX Player is the closest, but it is so horrible that I don't want to use it. The huge CPU usage and massive hard drive load alone make it a poor choice. It also mangles the filename. The first char is always replaced by a lowercase b, and it doesn't know how to convert stuff like %20 into a valid char, so it just drops them.
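For what it's worth, turning something like %20 back into a real character is not hard. Here is a minimal Python sketch of deriving a sane local filename from a URL; the function name and the example URL are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit, unquote

def local_name_from_url(url):
    # Take the last path segment first, then decode percent-escapes,
    # so an encoded slash (%2F) cannot smuggle in a directory separator.
    last_segment = posixpath.basename(urlsplit(url).path)
    return unquote(last_segment)

print(local_name_from_url("http://example.com/videos/My%20Show%20-%2001.avi"))
# prints: My Show - 01.avi
```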

For the longest time, I haven't understood why nobody did AVI streaming over HTTP properly. First, some background. AVI files have a header at the beginning of the file, then the main data, then the index at the end. It is this index at the end that causes problems, since strictly speaking you need the index to play the file. In reality, many modern media players can play partial files by figuring things out as they go along. And yet, for HTTP, they don't do this. Virtually every media player that supports HTTP acts like Media Player Classic; they download the whole file first.

Now, HTTP supports random access. You can specify the range you want to read in an HTTP request. Since you need to seek near the end of a file when a media player opens it, why don't the players just use the "Range:" header to grab the index?
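To make the "Range:" idea concrete, here is a small Python sketch that builds (but does not send) a request for just the tail of a file. The URL and the sizes are placeholder values, and `tail_range_request` is a name invented for this example.

```python
import urllib.request

def tail_range_request(url, total_size, tail_bytes):
    # Byte ranges are inclusive and zero-based, so the last byte is total_size - 1.
    start = max(0, total_size - tail_bytes)
    return urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{total_size - 1}"}
    )

# Placeholder numbers: a 170,000,000-byte file, asking for the last 1,700,000 bytes.
req = tail_range_request("http://example.com/video.avi", 170_000_000, 1_700_000)
print(req.get_header("Range"))
# prints: bytes=168300000-169999999
```

HTTP also allows a suffix form, "Range: bytes=-1700000", which asks for the last 1,700,000 bytes without knowing the total size up front. A server that honors the range answers with status 206 (Partial Content) instead of 200.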

This stupidity has haunted me for years. I just couldn't figure out why only the DivX people did this (I assume they do).

So, I had an idea. If a player wants to open an AVI file, it doesn't scan the whole thing. It just seeks to the end, reads the index, and then seeks to the beginning and begins playing. So, I decided to create a fake AVI file.

Now, I don't have detailed knowledge of how AVI files work. I just know the basic file structure. I don't have any idea how long the index is supposed to be.

Here is how my proof of concept hack works:

1) Make a HEAD request to the server to get the filesize.

2) Open the file and allocate the whole thing in advance. Just create an empty file that is the full size of the final file.

3) Do a GET request for the last 1% of the file. This should be enough to contain the index. In my 170MB test file, the index was slightly over 1MB in size (it was obvious what the index looked like, since it had a different data structure from the rest of the file and was easily spotted by looking at the raw data). So 1% seems to be big enough. Write out the last 1% into the right part of the existing empty file.

4) Once the last part of the file is downloaded, make a new HTTP request for the whole file (or the first 99% at least). Start writing to the beginning of the file.
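The four steps above can be sketched roughly like this in Python. This is an illustration written under assumptions, not the actual proof-of-concept program: all function names are invented here, the server is assumed to send Content-Length and to honor Range requests, and there is no error recovery or retry logic.

```python
import os
import urllib.request

def remote_size(url):
    # Step 1: a HEAD request returns headers only; Content-Length is the file size.
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])

def preallocate(path, size):
    # Step 2: create an empty file that is already the full final size.
    with open(path, "wb") as f:
        f.truncate(size)

def tail_start(size, tail_percent=1):
    # Offset where the last ~1% begins (1% is the guess from step 3).
    return max(0, size - max(1, size * tail_percent // 100))

def fetch_range_into(url, path, start, end, chunk=64 * 1024):
    # Download bytes start..end (inclusive) and write them at the same offset.
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp, open(path, "r+b") as f:
        if resp.status != 206:
            raise RuntimeError("server did not honor the Range header")
        f.seek(start)
        while True:
            block = resp.read(chunk)
            if not block:
                break
            f.write(block)
            f.flush()  # push data out so a player reading the file can see it

def tail_first_download(url, path, tail_percent=1):
    size = remote_size(url)
    preallocate(path, size)
    split = tail_start(size, tail_percent)
    fetch_range_into(url, path, split, size - 1)   # step 3: the end, with the index
    if split > 0:
        fetch_range_into(url, path, 0, split - 1)  # step 4: everything before it
```

Calling something like tail_first_download("http://example.com/video.avi", "video.avi") (placeholder URL) would run all four steps in order. The file is opened in "r+b" mode for writing so the preallocated size is never truncated.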

So, what do we do with this? Well, wait for the hack to download a bit of the beginning of the file (I waited for 10MB in my first test), and open it in Media Player Classic. Start playing.

Why does this work? Well, MPC doesn't care about the middle of the file. It reads the index from the end, which I've already put there. Then it goes to the beginning and starts playing. And while it is playing, the middle of the file is still downloading. So, while only part of this fake AVI file is downloaded, it is always downloading.

As long as the download speed is faster than the media's bitrate, the media player never catches up to the part of the file being written.
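As a back-of-the-envelope check, here is a tiny Python sketch of that condition. It assumes a constant download rate and a constant media bitrate (real files vary), ignores the small tail that is already on disk, and uses made-up numbers; `min_head_start_bytes` is a name invented for this example.

```python
def min_head_start_bytes(file_size, duration_s, download_rate):
    """Bytes that should already be on disk before pressing play so that
    playback never reaches the write position.

    file_size and the result are in bytes, duration_s in seconds,
    download_rate in bytes per second.
    """
    # Over the whole runtime the download adds download_rate * duration_s bytes;
    # whatever is still missing at the end must be covered by the head start.
    return max(0, file_size - download_rate * duration_s)

# Made-up example: a 350,000,000-byte file that runs 2,700 seconds (45 minutes)
# has an average bitrate of roughly 129,630 bytes per second.
print(min_head_start_bytes(350_000_000, 2700, 200_000))  # prints: 0
print(min_head_start_bytes(350_000_000, 2700, 100_000))  # prints: 80000000
```

When the download rate is above the average bitrate, no head start is needed in principle, which matches the claim above. When it is slower, the shortfall has to be downloaded before starting.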

What is the end result? Using my hack, I can watch an AVI file after only downloading a few megs of it. SUCCESS!

To me, this is really cool. It is such a simple and poorly written hack (it's a proof-of-concept hack, after all), but it just works. I can use it in practice. With this hack, ANY media player can be used to play a progressive download.

What is the downside of this hack? There is no indication in the media player of how much has downloaded so far. You can seek in the file just fine, as long as you don't seek to a part of the file that hasn't been downloaded yet. But even that works OK. If you try to seek to an invalid part in MPC, it just shows a static frame, and then you can seek back to a part that has been downloaded.

But I don't care! I've got my hack, and I'm now going to use it to stream an AVI of a show that I haven't seen before that is sitting on my HTTP server in Texas :)

# posted by Guspaz @ 11:26 p.m.

Comments:

I haven't the faintest idea of who you are.. but nevertheless I wanted to drop by and say: I'm happy for ya, man! Good job and major pat on the back for you :)

-I can relate (to your inability to understand their stupidity)

# posted by Unknown : 9:27 p.m., June 01, 2007
