Using "youtube-dl" to manage YouTube subscriptions
I’ve recently read the announcement of a very nice self-hosted YouTube subscription manager. I haven’t used YouTube’s built-in subscriptions for a while now, and haven’t missed it at all. When I saw the announcement, I considered writing about the solution I’ve built on top of youtube-dl.
In many ways, I agree with André Staltz’s view on data ownership and privacy:
I started with the basic premise that “I want to be in control of my data”. Sometimes that meant choosing when to interact with an internet giant and how much I feel like revealing to them. Most of times it meant not interacting with them at all. I don’t want to let them be in full control of how much they can know about me. I don’t want to be in autopilot mode. (…) Which leads us to YouTube. While I was able to find alternatives to Gmail (Fastmail), Calendar (Fastmail), Translate (Yandex Translate), etc. YouTube remains as the most indispensable Google-owned web service. It is really really hard to avoid consuming YouTube content. It was probably the smartest startup acquisition ever. My privacy-oriented alternative is to watch YouTube videos through Tor, which is technically feasible but not polite to use the Tor bandwidth for these purposes. I’m still scratching my head with this issue.
Even though I don’t use most alternative services he mentions, I do watch videos from YouTube. But I also feel uncomfortable logging in to YouTube with a Google account, watching videos, creating playlists and similar things.
Using the mobile app is worse: you can’t even block ads in there. You have less control over what you share with YouTube and Google.
youtube-dl is a command-line tool for downloading videos from YouTube and many other sites. Given the URL of a video, it downloads that single video, but it also has some interesting flags that we can use:
- --output: use a custom template to create the name of the downloaded file;
- --download-archive: use a text file for recording and remembering which videos were already downloaded;
- --prefer-free-formats: prefer free video formats, like webm, ogv and Matroska mkv;
- --playlist-end: how many videos to download from a “playlist” (a channel, a user or an actual playlist);
- --write-description: write the video description to a .description file, useful for accessing links and extra content.
Putting it all together:
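A minimal sketch of such a combined invocation, wrapped in a function so it is easy to reuse. The channel URL is a placeholder, and the `YOUTUBE_DL` variable and the `fetch_latest` name are assumptions added here so the command can be swapped out for a dry run; the flags are the ones listed above.

```shell
#!/bin/sh
# Sketch only: CHANNEL_URL is a placeholder, not a real channel.
CHANNEL_URL="https://www.youtube.com/channel/CHANNEL_ID"

# fetch_latest URL COUNT: download the first COUNT entries listed for URL,
# skipping any video already recorded in the archive file.
# YOUTUBE_DL is a hypothetical override, e.g. YOUTUBE_DL=echo for a dry run.
fetch_latest() {
  "${YOUTUBE_DL:-youtube-dl}" "$1" \
    --download-archive youtube-dl-seen.conf \
    --prefer-free-formats \
    --playlist-end "$2" \
    --write-description \
    --output "%(uploader)s/%(upload_date)s - %(title)s.%(ext)s"
}

# Usage: fetch_latest "$CHANNEL_URL" 20
```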
This will download the latest 20 videos from the selected channel, and write down the video IDs in the youtube-dl-seen.conf file. Running it again immediately afterwards won’t have any effect.
If the channel posts one more video, running the same command again will download only the newest video, since the other 19 were already downloaded.
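The second run is a no-op because every downloaded video leaves a line in the archive file. The helper below is a small sketch for checking that file by hand; it assumes the archive keeps one `<extractor> <video id>` entry per line (for example `youtube dQw4w9WgXcQ`), and the name `already_seen` is hypothetical.

```shell
# already_seen ARCHIVE VIDEO_ID: succeed if VIDEO_ID is recorded in ARCHIVE.
# Assumes one "<extractor> <id>" entry per line, e.g. "youtube dQw4w9WgXcQ".
already_seen() {
  # -x matches whole lines only, -F treats the pattern as a fixed string
  [ -f "$1" ] && grep -qxF "youtube $2" "$1"
}

# Usage:
#   already_seen youtube-dl-seen.conf dQw4w9WgXcQ && echo "already downloaded"
```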
With this basic setup you have a minimal subscription system at work, and you can create some functions to help you manage that:
#!/bin/bash
# "export -f" is a bash feature, so this file needs bash rather than plain sh.
export DEFAULT_PLAYLIST_END=15

# download URL COUNT: fetch at most COUNT videos from URL, skipping the ones
# already recorded in the archive file.
download() {
  youtube-dl "$1" \
    --download-archive "$HOME/Nextcloud/cache/youtube-dl-seen.conf" \
    --prefer-free-formats \
    --playlist-end "$2" \
    --write-description \
    --output "$HOME/Downloads/yt-dl/%(uploader)s/%(upload_date)s - %(title)s.%(ext)s"
}
export -f download

# Each wrapper takes an ID and an optional COUNT (default: DEFAULT_PLAYLIST_END).
download_user() {
  download "https://www.youtube.com/user/$1" "${2-$DEFAULT_PLAYLIST_END}"
}
export -f download_user

download_channel() {
  download "https://www.youtube.com/channel/$1" "${2-$DEFAULT_PLAYLIST_END}"
}
export -f download_channel

download_playlist() {
  download "https://www.youtube.com/playlist?list=$1" "${2-$DEFAULT_PLAYLIST_END}"
}
export -f download_playlist
With these functions, you can now write a subscription-fetching script to download the latest videos from your favorite channels:
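A sketch of what such a script could look like. Every user name, channel ID and playlist ID below is a placeholder, the function name `fetch_subscriptions` is hypothetical, and the sketch assumes `download_user`, `download_channel` and `download_playlist` are already defined in the running shell.

```shell
#!/bin/sh
# Sketch of a subscription list. All names and IDs are placeholders.
# Assumes download_user, download_channel and download_playlist are defined,
# e.g. by sourcing the file that holds the functions above.
fetch_subscriptions() {
  download_user     "SOME_USER_NAME"
  download_channel  "SOME_CHANNEL_ID"
  download_playlist "SOME_PLAYLIST_ID" 5   # override the default count
}

# Usage: fetch_subscriptions
```

Keeping one line per subscription makes additions and removals easy to read in a diff.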
Now, whenever you want to watch the latest videos, just run the above script and you’ll get all of them on your local machine.
-
Offline
My internet speed is somewhat reasonable [1], but it is really unstable. Either at work or at home, it’s not uncommon to lose internet access for 2 minutes 3 to 5 times a day, and to stay completely offline for a couple of hours once a week.
Working through the hassle of keeping a playlist on disk has paid off many, many times. Sometimes I don’t even notice when the connection drops for some minutes, because I’m watching a video and working on some document, all on my local computer.
There’s also none of the automatic quality adjustment that YouTube’s web player does: I always pick the highest quality and it doesn’t change during the video. For some types of content, like a podcast with some tiny visual resources, this doesn’t change much. For other types of content, like a keynote presentation with text written on the slides, watching at 144p isn’t really an option.
If the internet connection drops during the video download, youtube-dl will resume from where it stopped.
This is an offline-first benefit that I really like, and it works well for me.
-
Sync the “seen” file
I already have a running instance of Nextcloud, so just dumping the youtube-dl-seen.conf file inside Nextcloud was a no-brainer.
You could try putting it in a dedicated git repository, and wrap the script with an autocommit after every run. If you ever had a merge conflict, you’d simply accept all changes and then run:
$ sort -u -o youtube-dl-seen.conf youtube-dl-seen.conf
to drop the duplicate lines and tidy up the file. (Redirecting the output of uniq back into the same file would truncate it before it is read, and uniq only removes adjacent duplicates.)
-
Doesn’t work on mobile
The device I use every day is my laptop, not my phone. It works well for me this way.
Also, it’s harder to add ad-blockers to mobile phones, and most mobile software still depends on Google’s and Apple’s blessing.
If you wish, you can sync the videos to the SD card periodically, but that’s a bit of extra manual work.
-
Better privacy
We don’t even have to configure the ad-blocker to keep ads and trackers away!
YouTube still has your IP address, so using a VPN is always a good idea. However, a timing analysis would be able to identify you (considering the current implementation).
-
No need to self-host
There’s no host that needs maintenance. Everything runs locally.
As long as you keep youtube-dl itself up to date and sync your “seen” file, there’s little extra work to do.
-
Track your subscriptions with git
After creating a subscriptions.sh executable that downloads all the videos, you can add it to git and use it to track metadata about your subscriptions.
-
Maximum playlist size is your disk size
This is a good thing for getting a realistic view of your actual “watch later” list. However, I’ve run out of disk space many times, and now I need to be more aware of how much is left.
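One way to stay aware of the remaining space is to check it before fetching anything. This is a minimal sketch: `free_kib` and `enough_space` are hypothetical helper names, and the 5 GiB threshold in the usage line is an arbitrary example.

```shell
# free_kib DIR: print the free space, in KiB, of the filesystem holding DIR.
# Relies on the portable output format of "df -P" (Available is column 4).
free_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# enough_space DIR MIN_KIB: succeed if at least MIN_KIB KiB are free.
enough_space() {
  [ "$(free_kib "$1")" -ge "$2" ]
}

# Usage (5 GiB expressed in KiB):
#   enough_space "$HOME/Downloads/yt-dl" 5242880 || echo "low on disk space" >&2
```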
We can only avoid all the bad parts of YouTube with youtube-dl as long as YouTube keeps the videos public and programmatically accessible. If YouTube ever blocks that, we’d lose the ability to consume content this way, and we’d also lose confidence in YouTube as a healthy repository of videos on the internet.
Since you’re running everything locally, here are some possibilities to be explored:
You can wrap the download_playlist function (let’s call the wrapper inc_download) and, instead of passing a fixed number to the --playlist-end parameter, store the $n in a folder (something like $HOME/.yt-db/$PLAYLIST_ID) and increment it by $step every time you run inc_download.
This way you can incrementally download videos from a huge playlist without filling your disk with gigabytes of content all at once.
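A possible shape for that wrapper, as a sketch under a few assumptions: the counter lives in one file per playlist under `$HOME/.yt-db`, the `YT_DB` override and the default step of 5 are choices made here for illustration, and `download_playlist` is expected to be defined already.

```shell
# inc_download PLAYLIST_ID [STEP]: raise the stored --playlist-end value by
# STEP (default 5, an arbitrary choice) and download up to the new limit.
# Assumes download_playlist PLAYLIST_ID COUNT is already defined.
inc_download() {
  inc_db="${YT_DB:-$HOME/.yt-db}"   # YT_DB is a hypothetical override
  inc_step="${2:-5}"
  mkdir -p "$inc_db"

  # Start from 0 the first time a playlist is seen.
  inc_n=0
  if [ -f "$inc_db/$1" ]; then
    inc_n=$(cat "$inc_db/$1")
  fi
  inc_n=$((inc_n + inc_step))

  # Persist the new counter only if the download succeeded.
  download_playlist "$1" "$inc_n" && echo "$inc_n" > "$inc_db/$1"
}

# Usage: inc_download SOME_PLAYLIST_ID 10
```

Because the “seen” file already skips known videos, each run only fetches the entries between the old and the new limit.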
The download_playlist function could be aware of the specific machine that it is running on and apply specific policies depending on the machine: always download everything; only download videos that aren’t present anywhere else; etc.
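As a rough sketch of the idea, the policy could be picked from the host name. The host names and policy names below are made up for illustration, and what each policy actually does is left open.

```shell
# policy_for HOST: print the download policy for a machine.
# Host names and policy names are hypothetical examples.
policy_for() {
  case "$1" in
    desktop) echo "everything" ;;         # always download everything
    laptop)  echo "missing-elsewhere" ;;  # only what no other machine has
    *)       echo "none" ;;               # unknown machines download nothing
  esac
}

# Usage: policy=$(policy_for "$(hostname)")
```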
youtube-dl is a great tool to keep at hand. It covers a really large range of video websites and works robustly.
Feel free to copy and modify this code, and send me suggestions for improvements or related content.
2019-05-22: Fix spelling.
[1] Considering how expensive it is and the many ways it could be better, but also how much it has improved over the last years, I say it’s reasonable.