I switch back and forth between Linux and Windows 7 pretty regularly these days. But I have had this problem with Linux: I can’t run Firefox 3.6 nightlies (which are nicer than the 3.5 release), Weave and Flash all at the same time. I can run Firefox 3.5 and have Flash, but then Weave doesn’t work. Or I can run 3.6 with Weave, but then Flash crashes the browser. It turns out that for me Weave is way more important than Flash, but I still miss being able to watch the occasional Youtube video.
If you’ve ever run Linux for any period of time, you’ve had this kind of experience.
Anyway, I decided to try and make it so that I could easily play Youtube videos without having to use Flash. (Flash – in many ways – is the weak link in the chain. In this case it’s because I can’t fix/hack it, although I’m happy to not have it because my browser is a lot more reliable.)

Theora + Youtube = Love
I wrote a greasemonkey script called Theoratube that connects to the Firefogg extension. It’s based very heavily on the really great Youtube without Flash Auto user script that lets you embed videos as a plug-in. But in my case I decided to use native Theora and HTML5 video because it’s more reliable, has controls and doesn’t require any additional software to start working.
How does it work? It pulls down the video, uses Firefogg to transcode it, and then stuffs it back into the browser via a private URL. It’s slow because it has to pull + encode the entire video, but it works surprisingly well for something that is as hacky as it is.
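As a rough sketch of why the wait is so long, here is the same three-step flow written out in Python. This is not the actual Greasemonkey code or the Firefogg API: `prepare_video` and the `download`, `transcode` and `publish` callables are hypothetical stand-ins for "fetch the file", "encode it to Theora" and "hand the browser a private URL".

```python
from typing import Callable


def prepare_video(
    video_id: str,
    download: Callable[[str], bytes],
    transcode: Callable[[bytes], bytes],
    publish: Callable[[bytes], str],
) -> str:
    """Run the three steps strictly in order and return a playable URL."""
    source = download(video_id)   # blocks until the whole file has arrived
    theora = transcode(source)    # blocks until the whole file is encoded
    return publish(theora)        # only now is there anything to play


# Stand-in steps (assumptions for illustration only).
url = prepare_video(
    "VIDEO_ID",
    download=lambda vid: b"source-bytes",
    transcode=lambda data: b"theora-bytes",
    publish=lambda data: "local://video.ogv",
)
```

Each step waits for the previous one to finish completely, so the time before playback is the full download time plus the full encode time.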
The worst part is the delay in between loading it and the first time you can play the video. It needs a few changes to make it more usable:
1. It needs to download the video, start transcoding it and stream it into the browser as soon as possible. This shouldn’t really be a problem and will probably be a pretty good experience. Youtube bursts its traffic to start and then throttles it way back, and at least on my machine the encoder can keep up pretty easily and transcode live. This means taking it out of greasemonkey and putting it into a proper extension to get everything inline properly.
2. The current Firefogg stuff works pretty well but it isn’t really designed for this use case. For example, once you start downloading a file there’s no way to cancel it. Same with encoding. It also needs an event feedback system instead of polling, which is what the greasemonkey script has to do right now. But if you’re moving it into an extension anyway (see item 1), this is probably pretty easy to do: just download in the extension directly and then include Firefogg for the encoding part.
3. Firefogg can only encode directly on a file instead of being part of a stream. Not sure if ffmpeg2theora can handle this across platforms or not – I suspect so, but it’s something that needs to be fixed.
4. It really needs a big online cache to store data once you’ve downloaded it and encoded it to a free format. Think of it as collaborative transcoding in the cloud. Download, transcode, re-upload so that others can benefit as well. You’ve already got the Youtube ID and the format it came from, which would make it pretty easy to key off of and do a quick lookup before trying to re-download the video and re-encode it.*
5. It needs to match particular quality formats to bitrates for the encoder. Right now it just encodes everything at top quality, since the result only lives locally and that setting loses the least.
6. The copyright issues here are…interesting. There’s some content on Youtube for which the uploader actually has the copyright on that particular work. (I mean, we’re talking about Youtube. Your source for 15 remixes of Drunkest Guy Ever.) And this is mostly about accessibility, not downloading. So we’ll see if anyone gets upset.
7. It doesn’t handle Youtube videos embedded on other web pages. This will be a little trickier, but certainly not impossible. It just needs some more greasemonkey to help there. It will probably have to transform the object embeds into something else.
8. Seeking might be a little challenging, but not impossible. There’s a parameter to the get_video call that lets you specify where to start pulling the video. It’s probably an offset in seconds, but it needs some investigation/work.
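The streaming idea from items 1 and 3 can be sketched with a Python generator. This is only an illustration of the shape of the pipeline, under some big assumptions: `stream_transcode` and `encode_chunk` are hypothetical names, and a real Theora encoder keeps state across chunks rather than encoding each one independently. Whether ffmpeg2theora can actually be fed this way is still the open question from item 3.

```python
from typing import Callable, Iterable, Iterator


def stream_transcode(
    chunks: Iterable[bytes],
    encode_chunk: Callable[[bytes], bytes],
) -> Iterator[bytes]:
    """Yield encoded output as soon as each downloaded chunk is available."""
    for chunk in chunks:
        encoded = encode_chunk(chunk)
        if encoded:  # an encoder may buffer and emit nothing for some chunks
            yield encoded


def fake_download() -> Iterator[bytes]:
    """Stand-in for a network download that arrives piece by piece."""
    for part in (b"aa", b"bb", b"cc"):
        yield part


# bytes.upper stands in for the encoder; the first output is ready
# after only the first chunk has been pulled from the download.
first_output = next(stream_transcode(fake_download(), bytes.upper))
```

The point of the generator is that nothing downstream waits for the whole file: playback could start once the first encoded pieces exist, instead of after the full download and the full encode.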
So this is certainly 0.1 software. But it suggests an interesting tactic for making content more accessible in open formats, even without the participation of the original party. Sometimes the workaround isn’t what you expected. (Note: I would like this for t61 as well.)
* No fooling, I wonder if Google would be willing to host such an archive? People are willing to transcode to work around their lack of support for open formats – maybe they could at least provide a valid archive for them to work together? Or maybe the Internet Archive?
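For what it’s worth, the lookup such an archive would need (item 4) is small. Here is a minimal sketch, assuming the cache is keyed on the Youtube ID plus the source format; `get_or_transcode` and `fetch_and_encode` are hypothetical names, the ID and format strings are placeholders, and a plain dict stands in for whatever online store would actually hold the files.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str]  # (Youtube video ID, source format)


def get_or_transcode(
    cache: Dict[CacheKey, bytes],
    video_id: str,
    source_format: str,
    fetch_and_encode: Callable[[str, str], bytes],
) -> bytes:
    """Return the cached Theora file, transcoding only on a cache miss."""
    key = (video_id, source_format)
    if key not in cache:
        # Miss: do the slow download + encode once and store the result.
        cache[key] = fetch_and_encode(video_id, source_format)
    return cache[key]


# Placeholder ID/format and a stand-in for the slow download + encode step.
shared_cache: Dict[CacheKey, bytes] = {}
video = get_or_transcode(
    shared_cache, "VIDEO_ID", "format-a", lambda vid, fmt: b"theora-bytes"
)
```

The second person to ask for the same ID and format gets the stored copy and never has to touch the original download.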