SpazBlog: April 2006

Wednesday, April 12, 2006

Cool proof-of-concept complete

I solved the problem from my previous message.

I've just got a really neat proof-of-concept program working. Here's my problem. I like to stream large (150 to 400 MB) AVI files over HTTP. There are currently four ways to do this:

1) Use DivX Player, which can progressively download. It lets you start playing the file immediately, and downloads as it plays. The problem is that it has a bad interface, is VERY buggy, and for some reason makes the hard drive grind like mad.

2) Use VLC. It can also progressively download AVI files and start playing right away, but it treats the AVI file like a stream, and uses a buffer. The downside here is that you can't seek. It is also super buggy, because if you pause for too long it won't be able to resume. You'd think that it would let you seek by just restarting a new HTTP connection at a different point in the file, but no. Also, you don't get a copy on your drive when you're done watching.

3) Use Media Player Classic. This is the player I normally use. It downloads the entire file to a temp file before playing, which makes it useless for streaming.

4) Download the whole AVI file with a browser before playing.

None of these options are good! DivX Player is the closest, but it is so horrible that I don't want to use it. The huge CPU usage and massive hard drive load alone make it a poor choice. It also mangles the filename. The first char is always replaced by a lowercase b, and it doesn't know how to convert stuff like %20 into a valid char, so it just drops them.

For the longest time, I haven't understood why nobody did AVI streaming over HTTP properly. First, some background. AVI files have a header at the beginning of the file, then the main data, then the index at the end. It is this index at the end that causes problems, since strictly speaking you need the index to play the file. In reality, many modern media players can play partial files by figuring things out as they go along. And yet, for HTTP, they don't do this. Virtually every media player that supports HTTP acts like Media Player Classic: they download the whole file first.

Now, HTTP supports random access. You can specify the range you want to read in an HTTP request. Since you need to seek near the end of a file when a media player opens it, why don't the players just use the "Range:" header to grab the index?
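For illustration, here is a minimal Python sketch of what those Range header values look like. The helper names byte_range and suffix_range are made up for this example; the "bytes=start-end" and "bytes=-N" forms themselves are standard HTTP syntax. The suffix form asks for the last N bytes without knowing the file size, which is exactly what grabbing an index at the end of a file needs.

```python
def byte_range(start, end=None):
    """Range header value for bytes start..end (inclusive), or start..EOF."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    if end is None:
        return f"bytes={start}-"
    return f"bytes={start}-{end}"


def suffix_range(length):
    """Range header value for the last `length` bytes of the resource."""
    if length <= 0:
        raise ValueError("suffix length must be positive")
    return f"bytes=-{length}"


# Headers a client could send to fetch only the last 1 MiB of a file.
headers = {"Range": suffix_range(1024 * 1024)}
```

A server that honours the header answers with status 206 (Partial Content); one that ignores it answers 200 and sends the whole file, so a real client should check which one it got.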

This stupidity has haunted me for years. I just couldn't figure out why only the DivX people did this (I assume they do).

So, I had an idea. If a player wants to open an AVI file, it doesn't scan the whole thing. It just seeks to the end, reads the index, and then seeks to the beginning and begins playing. So, I decided to create a fake AVI file.

Now, I don't have detailed knowledge of how AVI files work. I just know the basic file structure. I don't have any idea how long the index is supposed to be.
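For what it's worth, the index length can usually be read out of the file itself. In a classic AVI the index is a RIFF chunk tagged "idx1": a four-byte tag, a four-byte little-endian size, then 16-byte entries. The sketch below is a heuristic, not a real AVI parser: the function name find_idx1 is made up, it simply looks for the last "idx1" tag in a block of bytes taken from the end of the file, and it does not handle OpenDML-style files that use other index chunks.

```python
import struct


def find_idx1(tail):
    """Locate a legacy 'idx1' index chunk inside the tail bytes of an AVI.

    Returns (offset_in_tail, payload_size, entry_count), or None if no
    plausible chunk header is found. Heuristic: the tag bytes could in
    principle also occur by chance inside ordinary data.
    """
    pos = tail.rfind(b"idx1")
    if pos < 0 or pos + 8 > len(tail):
        return None
    # RIFF chunk sizes are 32-bit little-endian and follow the tag directly.
    (size,) = struct.unpack_from("<I", tail, pos + 4)
    # Each legacy index entry is 16 bytes (id, flags, offset, length).
    return pos, size, size // 16
```

If the tag is not found in the tail that was fetched, the tail was probably too short (or the file uses a different index layout), and a larger range would have to be requested.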

Here is how my proof of concept hack works:

1) Make a HEAD request to the server to get the filesize.

2) Open the file and allocate the whole thing in advance. Just create an empty file that is the full size of the final file.

3) Do a GET request for the last 1% of the file. This should be enough to contain the index. In my 170MB test file, the index was slightly over 1MB in size (it was obvious what the index looked like, since it had a different data structure from the rest of the file and was easily spotted by looking at the raw data). So 1% seems to be big enough. Write out the last 1% into the right part of the existing empty file.

4) Once the last part of the file is downloaded, make a new HTTP request for the whole file (or the first 99% at least). Start writing to the beginning of the file.
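The four steps above can be sketched in Python roughly as follows. This is not the code of the original hack, just a minimal illustration of the same idea using the standard library. The function names, the 64 KB chunk size, and the example URL in the closing comment are all assumptions made for this sketch.

```python
import urllib.request


def content_length(url):
    """Step 1: HEAD request to learn the file size."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])


def fetch_range(url, start, end, chunk_size=64 * 1024):
    """Yield the bytes start..end (inclusive) of url in chunks."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:
            raise RuntimeError("server ignored the Range header")
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk


def download_tail_first(path, size, fetch, tail_fraction=0.01):
    """Steps 2 to 4: preallocate, fetch the tail, then fetch the rest.

    `fetch(start, end)` must yield the bytes of that inclusive range.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    tail_len = min(size, max(1, int(size * tail_fraction)))
    tail_start = size - tail_len
    with open(path, "wb") as f:
        f.truncate(size)  # step 2: file has its final size from the start
        # step 3 (tail, where the index lives), then step 4 (everything else)
        for start, end in ((tail_start, size - 1), (0, tail_start - 1)):
            if end < start:
                continue  # tiny file: the tail already covered everything
            f.seek(start)
            for chunk in fetch(start, end):
                f.write(chunk)
                f.flush()  # make the data visible to a player reading the file


# Usage with a hypothetical URL:
#   url = "http://example.com/show.avi"
#   download_tail_first("show.avi", content_length(url),
#                       lambda s, e: fetch_range(url, s, e))
```

Passing the range fetcher in as a function keeps the file-writing logic separate from the network code, so it can be tried out against in-memory data before pointing it at a real server.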

So, what do we do with this? Well, wait for the hack to download a bit of the beginning of the file (I waited for 10MB in my first test), and open it in Media Player Classic. Start playing.

Why does this work? Well, MPC doesn't care about the middle of the file. It reads the index from the end, which I've already put there. Then it goes to the beginning and starts playing. And while it is playing, the middle of the file is still downloading. So, while only part of this fake AVI file is downloaded, it is always downloading.

As long as the download speed is faster than the media's bitrate, the media player never catches up to the part of the file being written.
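When the download is slower than the bitrate, the same reasoning gives a rough figure for how much has to be downloaded before pressing play. The sketch below assumes a constant bitrate and a steady download speed, and ignores the small tail fetched first; the function name required_head_start is made up for this example. Playback consumes size/duration bytes per second, so the download stays ahead for the whole running time if the head start covers whatever the connection cannot deliver during that time.

```python
import math


def required_head_start(size_bytes, duration_s, download_rate):
    """Bytes to download before starting playback so it never stalls.

    Assumes constant bitrate (size_bytes / duration_s) and a constant
    download_rate in bytes per second.
    """
    if size_bytes <= 0 or duration_s <= 0 or download_rate < 0:
        raise ValueError("size and duration must be positive, rate non-negative")
    # During the running time the connection delivers download_rate * duration_s
    # bytes; anything beyond that has to be on disk before playback starts.
    shortfall = size_bytes - download_rate * duration_s
    return max(0, math.ceil(shortfall))
```

If the download rate is at least the bitrate, the result is zero and only a small safety margin is needed. Real AVI files have a variable bitrate, so treat the number as an estimate rather than a guarantee.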

What is the end result? Using my hack, I can watch an AVI file after only downloading a few megs of it. SUCCESS!

To me, this is really cool. It is such a simple and poorly written hack (it's a proof-of-concept hack, after all), but it just works. I can use it in practice. With this hack, ANY media player can be used to play a progressive download.

What is the downside of this hack? There is no indication in the media player of how much has downloaded so far. You can seek in the file just fine, as long as you don't seek to a part of the file that hasn't been downloaded yet. But even that works OK. If you try to seek to an invalid part in MPC, it just shows a static frame, and then you can seek back to a part that has been downloaded.

But I don't care! I've got my hack, and I'm now going to use it to stream an AVI of a show that I haven't seen before that is sitting on my HTTP server in Texas :)

# posted by Guspaz @ 11:26 p.m. 1 comments

Small world

So, I was looking at a random source file in Mozilla Firefox (I had the stupid idea of seeing how Firefox opens files to solve my problem, but I'll never find the right source file). And I noticed this:

* Contributor(s):
* Travis Bogard
* Pierre Phaneuf
* Peter Annema
* Dan Rosen

Any name there look familiar? ;)

# posted by Guspaz @ 7:50 p.m. 0 comments

Tuesday, April 11, 2006

Programming trouble

I know that most of the programmers who read planit are of the linux variety, but there are a lot of REALLY smart people here, and I'm hoping that at least one of them has experience with windows programming :)

I'm trying to write to a file, but let other applications read it at the same time. The general idea is to create a file and start writing to it (downloading data as you go), and still have other programs able to open the file to read from the part that has already been downloaded.

What I'm trying to do so far is open the file in write-only mode, and set the fileshare permission to read. I've also tried opening in readwrite, and setting fileshare permission to readwrite, and all combinations of those two. I just can't get this to work.

I know this can be done, because I can read firefox's downloads-in-progress just fine. Does anybody have an idea as to how this can be done?
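To make the goal concrete, here is a minimal Python sketch of the pattern: one handle keeps appending and flushing, while a second, independent handle reads whatever has been written so far. It only illustrates the intended behaviour and is not a fix for the Windows problem; the helper names are made up. On Windows the sharing itself is negotiated through the dwShareMode argument of CreateFile (FILE_SHARE_READ, FILE_SHARE_WRITE), and the writer's and the reader's open calls both have to be compatible with each other.

```python
import os
import tempfile


def append_and_flush(writer, chunk):
    """Write a chunk and flush it so other handles can see it."""
    writer.write(chunk)
    writer.flush()


def read_available(path, offset):
    """Open the file independently and read everything from offset onward."""
    with open(path, "rb") as reader:
        reader.seek(offset)
        return reader.read()


# Simulate a download in progress with a second handle reading behind it.
path = os.path.join(tempfile.mkdtemp(), "partial.bin")
with open(path, "wb") as writer:
    append_and_flush(writer, b"first chunk ")
    seen = read_available(path, 0)
    append_and_flush(writer, b"second chunk")
    seen += read_available(path, len(seen))
```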

# posted by Guspaz @ 11:47 p.m. 0 comments

Thursday, April 06, 2006

Adam's Axiom

The bus you miss is always early, and the bus you catch is always late.

# posted by Guspaz @ 9:16 a.m. 0 comments

Monday, April 03, 2006

Err?

So, I wanted to grab a SWF file off a web server. Save-page-as from Firefox wasn't really working perfectly. Out of habit, I opened up a console and used wget to grab the file. The file started downloading. All was good.

Then I realized that my laptop was running Windows XP. Wait a second, wget? On Windows XP? In PATH? I must have installed it a long time ago and forgotten about it. Thank goodness for cross-platform GNU tools. wget is awesome.

# posted by Guspaz @ 7:44 p.m. 0 comments
