“If your browser can show it, you can download it” doesn’t just apply to text and images, but also to videos. A lot of video players won’t let you download the video, and hosting sites will sometimes try very hard to make it nearly impossible. It’s really a symptom of treating websites as closed, distributed packages that nobody should be allowed to poke at or even store. If there were an interactive “streaming” protocol that worked well and ensured that the client only ever gets fleeting information, I bet you’d see lots of companies jump on it to keep as much control on their side as possible. Luckily, that’s (currently) not the case, and even if it’s sometimes a bit tricky to get a video downloaded, it’s often much easier than one might think.
The main reasons why it’s generally quite easy to download a video are:
- There are only a handful of common video streaming protocols, so the main mechanics are the same for nearly all sites.
- People have written tools for these protocols and plugins for specific sites.
In my bachelor thesis, I went quite deep into how video streaming works, but the gist of it is that there are more or less two protocols: HLS (HTTP Live Streaming) and DASH (Dynamic Adaptive Streaming over HTTP). As I don’t want to make this a deep-dive post on streaming, I’ll just leave you with these breadcrumbs.
Naturally, tools were created that could capture these streams directly. One of those tools is youtube-dl, which got quite a bit of attention when it was DMCA-ed from GitHub for a while. The temporary takedown then led to forks popping up, one of which, yt-dlp, has become very popular as well. As the names suggest, they were originally created to download videos from YouTube, but because the protocols are the same, extensions were written for other websites, and they also support downloading from more generic streaming URLs.
Basic Use Case
If there’s a dedicated extractor for a given site, which you can rely on for most popular sites given that there are over a thousand extractors, you can just point yt-dlp to a video URL. It will select the best quality for you, download the video and audio streams and mux them (i.e. merge the video and audio streams) into a playable video.
yt-dlp https://www.youtube.com/watch?v=dQw4w9WgXcQ
Choosing the Format
I won’t go over the hundreds of different options, but picking a specific format is probably one of the more common things to do and thus good to know. There are two flags: one to list all the available formats, and one to select a specific format.
yt-dlp -F https://vimeo.com/783455878
-F (as in format) lists the available formats for a video.
With -f <format> you can then select the desired format:
yt-dlp -f hls-fastly_skyfire_sep-5366 https://vimeo.com/783455878
If you want to select a specific video and a specific audio format, you can do so by specifying both, separated by a + sign:
yt-dlp -f hls-fastly_skyfire_sep-5366+dash-fastly_skyfire_sep-audio-b3698ea4 https://vimeo.com/783455878
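Format IDs like the ones above differ from video to video. As a minimal sketch of a more generic approach, the format selector also accepts expressions such as bestvideo, bestaudio and filters like [height<=720], as described in the yt-dlp README. The helper name format_selector is my own invention for illustration, not part of yt-dlp:

```shell
# Hypothetical helper (not part of yt-dlp): builds a format selector that
# prefers separate video and audio streams capped at a maximum height and
# falls back to the best combined format under the same cap.
format_selector() {
  max_height="$1"
  printf 'bestvideo[height<=%s]+bestaudio/best[height<=%s]' "$max_height" "$max_height"
}

# Show the selector that would be passed to -f.
format_selector 720
echo

# Actual download (needs network access, so it is left commented out):
# yt-dlp -f "$(format_selector 720)" "https://vimeo.com/783455878"
```

The quotes around the selector matter, since the shell would otherwise try to interpret the brackets and the < sign.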
Fetching Subtitles
yt-dlp can not only download video and audio, but also grab all the available subtitles and either save them in separate files or embed them directly in the video.
Use --list-subs to get an overview of available subtitles. --sub-format <format> lets you pick a specific format like srt or vtt, while --sub-langs <lang> chooses the wanted languages as a comma-separated list.
To embed all available subtitles into the video you can use the following command:
yt-dlp https://vimeo.com/783455878 --embed-subs --sub-langs all
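For the “separate files” variant, here is a rough sketch. It assumes the --write-subs and --skip-download flags, which to my knowledge exist in current yt-dlp releases but aren’t covered above, and join_langs is a hypothetical helper of my own that only builds the comma-separated list:

```shell
# Hypothetical helper (not part of yt-dlp): joins language codes into the
# comma-separated list that --sub-langs expects.
join_langs() {
  # "$*" joins the arguments with the first character of IFS; the subshell
  # keeps the IFS change local to this function.
  (IFS=,; printf '%s' "$*")
}

# Show the list that would be passed to --sub-langs.
join_langs en de
echo

# Download only the subtitles as separate files (needs network access, so it
# is left commented out):
# yt-dlp --write-subs --sub-langs "$(join_langs en de)" --sub-format srt --skip-download "https://vimeo.com/783455878"
```

Which languages and formats are actually available depends on the video, so checking --list-subs first is still a good idea.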
Unsupported Websites
Of course YouTube and Vimeo will have their dedicated extractors, but what about random tutorial site X? Here’s where it can get a bit tricky. yt-dlp has a “generic information extractor”, but I don’t know what exactly it’s looking for. One thing that has worked for me on nearly all sites is to open the developer tools in your browser, check the network tab and find the web call that downloads a manifest.
Depending on the protocol in use, the manifest can be of a different type; the most common one is a good old *.m3u8 playlist. Open the network tab of your browser’s developer tools, refresh the page and look through the downloaded files until you spot a *.m3u8 file. Oftentimes, they’re also helpfully called “manifest” or similar. Copy the complete URL of that downloaded file and ask yt-dlp to fetch the video:
yt-dlp https://<some url>/manifest.m3u8?<some tokens>
These playlists often contain different streaming qualities as well, so if you want a specific format, you can also use the -F and -f flags.
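To make the “different streaming qualities” a bit more concrete, here’s a small sketch of what an HLS master playlist roughly looks like and how its variants can be listed. The playlist content is made up for illustration and real manifests usually carry more attributes. The function names sample_manifest and list_variants are hypothetical as well:

```shell
# Made-up sample of an HLS master playlist; real manifests will differ.
sample_manifest() {
cat <<'EOF'
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
EOF
}

# Print "<resolution> <uri>" for every variant stream read from stdin.
list_variants() {
  awk '
    /^#EXT-X-STREAM-INF:/ {
      # Pull the WIDTHxHEIGHT value out of the attribute list, if present.
      res = "unknown"
      if (match($0, /RESOLUTION=[0-9]+x[0-9]+/)) {
        res = substr($0, RSTART + 11, RLENGTH - 11)
      }
      want_uri = 1
      next
    }
    # The first non-comment, non-empty line after the tag is the variant URI.
    want_uri && !/^#/ && NF { print res, $0; want_uri = 0 }
  '
}

sample_manifest | list_variants
```

For the sample above this prints one line per variant, 640x360 and 1280x720. In practice yt-dlp -F does this parsing for you; the sketch is only meant to show what’s inside the file you spotted in the network tab.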
403 Forbidden
If the video isn’t publicly accessible but hides behind some authentication, things become a lot trickier. Check out the Authentication section of the yt-dlp README and/or continue reading, where I give an example that uses a Bearer token for the authentication part.
Microsoft Stream
For all the people living in the corporate world: you may eventually be sent links to videos hosted on Microsoft Stream, which can be seen as the internal “YouTube” for companies. In case you want to grab a video there for which downloading wasn’t explicitly enabled, here are a few steps to achieve it:
- Open the network tab in your browser’s developer tools
- Find the /api/videos web call
  - For the request, extract the authorization header value, which should start with Bearer
- Find the /manifest(<format>) web call
  - There might be multiple options; I recommend replacing manifest(format=mpd-time-csf) with manifest(format=m3u8-aapl-v4), or picking the latter over the former
With the information extracted, you can now call yt-dlp:
yt-dlp --add-header "authorization: Bearer <TOKEN>" "https://<url>/manifest(format=m3u8-aapl-v4)"
Note:
- The Bearer Token is valid only for a certain period of time
- The Bearer Token gives access to your account, so under no circumstances give it out to anyone; it would be like handing them your password and 2FA as well
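The manifest swap from the steps above can be sketched as a small helper. The function name to_hls_manifest, the STREAM_TOKEN variable and the example.invalid URL are placeholders of my own, not something Microsoft Stream or yt-dlp provides:

```shell
# Hypothetical helper: swaps the DASH manifest format in a copied URL for the
# HLS one. The parentheses are literal characters in a basic sed expression.
to_hls_manifest() {
  printf '%s\n' "$1" | sed 's/manifest(format=mpd-time-csf)/manifest(format=m3u8-aapl-v4)/'
}

# Placeholder URL for illustration only.
to_hls_manifest 'https://example.invalid/video/manifest(format=mpd-time-csf)'

# Actual download (needs network access and a valid token, so it is left
# commented out). STREAM_TOKEN is an assumed environment variable holding the
# value copied from the authorization header, without the "Bearer " prefix:
# yt-dlp --add-header "authorization: Bearer $STREAM_TOKEN" "$(to_hls_manifest "$manifest_url")"
```

Reading the token from an environment variable rather than typing it inline is just a design choice of this sketch. How much that helps depends on how your shell records history.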
Closing Thoughts
Now, it needs to be said that just because you were able to download a video doesn’t mean you have the right to redistribute it. I’m not a lawyer, but I’d say there’s nothing wrong with downloading a video that you have access to in your browser anyway, even if companies don’t like it and maybe have a clause against it somewhere in their Terms of Service. Still, you’re approaching a grey zone. Just keep it for personal use, such as a watch-it-later/offline option or archiving, and don’t violate other people’s copyright.