bringing theora to youtube (the hard way)

I switch back and forth between Linux and Windows 7 pretty regularly these days.  But I have had this problem with Linux. I can’t run Firefox 3.6 nightlies (which are nicer than the 3.5 release) and Weave and Flash all at the same time. I can run Firefox 3.5 and I can have Flash, but Weave doesn’t work. Or I can run 3.6 with Weave but then Flash crashes the browser. It turns out that for me Weave is way more important than Flash, but I still miss being able to watch the occasional Youtube video.

If you’ve ever run Linux for any period of time, you’ve had this kind of experience.

Anyway, I decided to try and make it so that I could easily play Youtube videos without having to use Flash. (Flash – in many ways – is the weak link in the chain.  In this case it’s because I can’t fix/hack it, although I’m happy to not have it because my browser is a lot more reliable.)

Theora + Youtube = Love

I wrote a greasemonkey script called Theoratube that connects to the Firefogg extension. It’s based very heavily on the really great Youtube without Flash Auto user script that lets you embed videos as a plug-in. But in my case I decided to use native Theora and HTML5 video because it’s more reliable, has controls and doesn’t require any additional software to start working.

How does it work?  It pulls down the video, uses Firefogg to transcode it, and then stuffs it back into the browser via a private URL.  It’s slow because it has to pull + encode the entire video, but it works surprisingly well for something that is as hacky as it is.
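The three steps above can be sketched roughly as follows. This is a minimal sketch and not the actual Theoratube code: the `steps` object and every function on it (`download`, `transcode`, `toPrivateUrl`, `attachToPlayer`) are hypothetical stand-ins, and none of them is Firefogg's real API.

```javascript
// Hypothetical pipeline sketch. The caller supplies the step functions;
// none of these names come from Firefogg or from the real user script.
async function playAsTheora(videoId, steps) {
  // 1. Pull down the entire source video (this is the slow part).
  const sourceFile = await steps.download(videoId);
  // 2. Transcode the downloaded file to Theora.
  const oggFile = await steps.transcode(sourceFile);
  // 3. Expose the encoded file to the page through a private URL.
  const url = steps.toPrivateUrl(oggFile);
  // 4. Hand the URL to the player, e.g. by setting the src of a <video> element.
  return steps.attachToPlayer(url);
}
```

Because each step waits for the previous one to finish, nothing can play until the whole video has been downloaded and encoded, which is where the startup delay comes from.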

The worst part is the delay in between loading it and the first time you can play the video.  It needs a few changes to make it more usable:

1. It needs to download the video, start transcoding it and stream it into the browser as soon as possible.  This shouldn’t really be a problem and will probably be a pretty good experience.  Youtube bursts its traffic to start and then throttles it way back, and at least on my machine the encoder can keep up pretty easily to transcode live.  This means taking it out of greasemonkey and putting it into a proper extension to get everything inline properly.

2. The current Firefogg stuff works pretty well but it isn’t really designed for this use case.  For example, once you start a download of a file there’s no way to cancel it.  Same with encoding.  Also it needs an event feedback system instead of polling, which is what the greasemonkey script has to do right now.  But if you’re moving it to an extension above anyway this is probably pretty easy to do – just download in the extension directly and then include Firefogg for the encoding part.

3. Firefogg can only encode directly on a file instead of being part of a stream.  Not sure if ffmpeg2theora can handle this across platforms or not – I suspect so, but it’s something that needs to be fixed.

4. It really needs a big online cache to store data once you’ve downloaded it and encoded it to a free format.  Think of it as collaborative transcoding in the cloud.  Download, transcode, re-upload so that others can benefit as well.  You’ve already got the Youtube ID and the format it came from – that would make it pretty easy to key off of and do a quick lookup before trying to re-download the video and re-encode it.*

5. It needs to match particular quality formats to bitrates for the encoder.  Right now it just encodes everything at top quality since it’s just local and it’s the least loss you can buy.

6. The copyright issues here are…interesting.  There’s some content on Youtube for which the uploader actually has the copyright on that particular work.  (I mean, we’re talking about Youtube.  Your source for 15 remixes of Drunkest Guy Ever.)  And this is mostly about accessibility, not downloading.  So we’ll see if anyone gets upset.

7. It doesn’t handle youtube embedded videos on the web.  This will be a little trickier, but certainly not impossible.  Just need some more greasemonkey to help there.  Probably have to transform the object embeds into something else.

8. Seeking might be a little challenging, but not impossible.  There’s a parameter to the get_video call that lets you specify where to start pulling the video.  Probably offset in the number of seconds.  But needs some investigation/work.
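Items 4 and 5 above could look something like this. It is only a sketch under stated assumptions: the in-memory `Map` stands in for the online cache, the quality labels and bitrate numbers are made-up placeholders, and `transcode` is a hypothetical callback rather than anything Firefogg provides.

```javascript
// In-memory stand-in for the shared online cache (assumption: a real one
// would be a remote service that others can read from and upload to).
const cache = new Map();

// Key on the two things that are already known: the Youtube ID and the
// format the video came from.
function cacheKey(videoId, sourceFormat) {
  return `${videoId}:${sourceFormat}`;
}

// Placeholder encoder presets. The labels and numbers are illustrative
// only, not measured or recommended values.
const PRESETS = {
  low: { videoBitrateKbps: 300 },
  medium: { videoBitrateKbps: 700 },
  high: { videoBitrateKbps: 1500 },
};

// Unknown qualities fall back to the top preset, which mirrors the current
// "encode everything at top quality" behaviour.
function presetFor(quality) {
  return Object.prototype.hasOwnProperty.call(PRESETS, quality)
    ? PRESETS[quality]
    : PRESETS.high;
}

// Do a quick lookup first; only transcode (and store the result) on a miss.
function getOrTranscode(videoId, sourceFormat, transcode) {
  const key = cacheKey(videoId, sourceFormat);
  if (cache.has(key)) {
    return { url: cache.get(key), cached: true };
  }
  const url = transcode(videoId, sourceFormat);
  cache.set(key, url);
  return { url, cached: false };
}
```

The second request for the same ID and format never reaches the encoder, which is the whole point of sharing the cache between users.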

So this is certainly 0.1 software.  But it raises an interesting tactic in making content more accessible in open formats – even without the participation of the original party.  Sometimes the workaround isn’t what you expected.  (Note: I would like this for t61 as well.)

* No fooling, I wonder if Google would be willing to host such an archive?  People are willing to transcode to work around their lack of support for open formats – maybe they could at least provide a valid archive for them to work together?  Or maybe the Internet Archive?

This entry was posted in OGG, Video, Youtube.

37 Responses to bringing theora to youtube (the hard way)

  1. Pingback: Ryan Paul (segphault) 's status on Monday, 02-Nov-09 14:54:22 UTC - Identi.ca

  2. Michael says:

    Having it hosted on Internet Archive would raise the copyright question, are you allowed to transcode and copy it on another website ?

  3. Joe says:

    Why can’t you just play the native H.264 version of YouTube videos? (See ClickToFlash for MacOS X)

  4. Pingback: Glyn Moody (glynmoody) 's status on Monday, 02-Nov-09 15:09:48 UTC - Identi.ca

  5. Ben says:

    It seems like this comment thread deserves a mention of tinyvid.tv, a sort of YouTube Theora transcode archive.

  6. ulrik says:

    Admirable method you have invented! But I also think: should it not be possible to convert it into a container format, perhaps OGG, that only references a video resource on Youtube that we don’t need to transcode?

  7. Mark says:

    Instead of a real storage for the transcoded files, couldn’t it be a simple peer2peer network, where the most popular videos are available “in the cloud”? You could even extend it to decoding chunks together (if the video is big and multiple people are currently using Theoratube on the same one at the same time).

  8. ac says:

    One useful scenario not discussed here:

    Flash has almost NO video acceleration (GPU) support, so performance is heavily dependent on your CPU speed. This is a serious flaw that Adobe has been dragging its feet on fixing… and in the process, Flash is putting brakes on state of the art…

    There is a new generation of laptops and “green”, low-energy all-in-one PC’s like the Acer Aspire Revo, and these systems couple a relatively “weak” CPU (Intel Atom 230) with a plenty-powerful nvidia video chip.

    These systems have video chipsets which handle Blu-Ray just fine… but give it a YouTube video, and it chokes because Flash must run on the CPU…

  9. Ian McKellar says:

    Surely that transcoding step is just subjecting yourself to pain, waiting and loss of fidelity. Totem’s plugin (with a reasonable default set of packages) should happily play anything served off YouTube’s servers. You might argue for the Freedom of Theora, but Firefogg is obviously doing some patent-dodgy decoding there.

  10. Pingback: Don Christie (normnz) 's status on Monday, 02-Nov-09 19:50:18 UTC - Identi.ca

  11. @Joe – Firefox doesn’t support H.264. It’s a proprietary format and isn’t compatible with the standards that the web is based on. So we haven’t implemented it.

    @Ben – Yeah, that’s one use case for tinyvid. But tinyvid is useful for a lot of other stuff as well.

    @Ian – I’ve never gotten the totem plugin to work with H.264. And I’ve tried.

  12. Pingback: Gastón Sequeira (gastonsequeira) 's status on Monday, 02-Nov-09 22:43:34 UTC - Identi.ca

  13. Pingback: Samat Jain (tamasrepus) 's status on Monday, 02-Nov-09 23:25:42 UTC - Identi.ca

  14. mejogid says:

    Great hack! If you’re seriously looking for a more practical solution, have you checked out gnash and swfdec? Unless it’s a general issue with plugins you’re suffering they’re probably a fair bit more practical and better performing. However there’s something pretty cool about doing this natively in the browser – I’d definitely like to see HTML5 rendering of youtube videos.

  15. James Henstridge says:

    So for this to work, you need a VP6 or H.264 decoder, a theora encoder, and a theora decoder. If you already have the patent encumbered decoders, why bother with the theora step? It will chew up CPU cycles and degrade the resulting quality. That doesn’t really look like a win to me.

    Also, I wouldn’t describe H.264 as a proprietary format. It has a specification that can be used to judge the conformance of an implementation and multiple interoperable implementations. Even if the format were proprietary, that wouldn’t be an explanation as to why Firefox couldn’t implement the format: the problem is patents. It would probably be better to describe the format as “patent encumbered”.

  16. Anonymous says:

    Cool hack to make youtube work in-browser. But for a more practical way to watch youtube videos without Flash, how about youtube-dl, get-flash-videos, keepvid, or clipnabber?

    A more general question: Firefox currently implements Ogg, Theora, and Vorbis natively, but does Mozilla have any plans to try integrating something like gstreamer to fit in more naturally with platforms? (Note that gstreamer does not just run on Linux/POSIX; it works on Windows and OS X too.)

  17. Bod says:

    @James Henstridge

    I’ve seen a few folk make the same claim about H.264 not being “proprietary”, though you go further and state that it’s not actually a problem if it is “proprietary”, it is the patents that are the problem.

    Patents -> Intellectual Property -> Property -> Proprietor -> Proprietary

    Proprietary has had that meaning since before there was such a thing as software (think “proprietary medicine”). I don’t see the benefit of redefining it now.

  18. Pingback: Theoratube convierte los videos de Youtube a Theora « INATUX

  19. James Henstridge says:

    @Bod: I don’t think that is a very useful definition of a proprietary format.

    When people talk about proprietary formats, they are usually referring to ones where the format is defined by a particular implementation rather than a document describing the standard. A second implementation’s conformance would be judged based on how well it interoperates with the primary implementation.

    Implementing proprietary formats is not necessarily a problem for free software: we might not want to encourage people to store their data in such a format, but interoperability can be quite useful. For example, most people would consider it a good thing for free software office apps to support Microsoft’s proprietary formats.

    For similar reasons it would be nice to support formats like H.264, but that is not possible due to the patent situation. So if you are trying to explain this to someone, why not just come out and tell them that it is patent encumbered?

  20. How about asking YouTube (Google) developers to add support for requesting a Theora format of the video stream? There is already a parameter there to request different encodings of the same video (most notably used for HD and non-HD versions and for iPhone versions), why not add another for Theora?

  21. Pingback: Links 03/11/2009: KDE 4.3.3, Mandriva 2010 Released | Boycott Novell

  22. Anonymous says:

    @Aigars: Because they’d have to re-encode every YouTube video in Theora, and store the results. And they seem resistant to doing so, despite evidence that it might actually give better results than H.264 for a given bitrate.

  23. Pingback: Blizzard: bringing theora to youtube (the hard way) | Full-Linux.com

  24. I have my own personal definition for “proprietary.” It’s a simple test – do I have to ask the permission from an organization or person before I can use the technology? For H.264 the answer is yes, even if it’s considered “open” by a lot of people. It’s a huge drag on innovation and creativity.

  25. Pingback: Roy (linuxcanuck) 's status on Wednesday, 04-Nov-09 20:32:03 UTC - Identi.ca

  26. Pingback: Theoratube convierte los videos de Youtube a Theora « Software Libres, Mangas y animes ….. son Los lazos que nos unen a los que visitan esta bitacora

  27. Pingback: Linkschleuder (4) – Die Welt ist gar nicht so.

  28. YAFU says:

    Firefogg is not compatible with Firefox 3.5.5 :(

    It would be nice if you could make it so that all Flash video can be viewed using HTML5 functions, not just video from youtube.
    Thank you.

  29. Pingback: Open Video Alliance: Open standards, open source, open content

  30. Matěj Cepl says:

    I am too lazy to rewrite your script as jetpack … could we get at least an update for the current YouTube … it seems your script breaks it as of now.

    Error: document.getElementById("player-toggle-switch") is null
    Source File: file:///home/matej/.mozilla/firefox/t5k6klu2.default/gm_scripts/html5_and_theora_youtube/html5_and_theora_youtube.user.js
    Line: 234

  31. Updated. Sadly they moved the resize control into the flash player so you can’t change it on the fly.

  32. Matěj Cepl says:

    THANKS!!! I know it is silly hack, but I somehow got used to it (and it has nice effects on my YouTube addictions ;))

  33. Piet says:

    For non-geeks like me, how do you incorporate the theoratube script into firefox? Copy it to *.default folder?

  34. Matěj Cepl says:

    @Piet install Greasemonkey (https://addons.mozilla.org/en-US/firefox/addon/748) and after restart of Firefox click on the *.user.js file.

  35. Piet says:

    @Matěj thanks a lot, I see that the encoding process takes 100% CPU anyways. What is the gain???

  36. Matěj Cepl says:

    @Piet no flash? And no it isn’t meant (I guess) as a serious alternative to flash. Just an example of what’s possible. It takes 100% when playing or when re-encoding?
