Photo by Caitlin Jacokes on Unsplash
How to write a simple CLI application to download, convert and save videos as audio
Keeping it really simple!
Have you ever wondered how to create a very simple CLI application that downloads videos from YouTube, converts them to audio, and saves them to a designated folder on your computer? In this post, I explain one quick and easy way of doing this in Python without getting too geeky.
For the most part, we will stick to what the Python standard library offers. The only external libraries in this venture are Pytube and Typer.
Setup
To begin, create a folder to house the project. Then create a virtual environment and install the requirements, which in this example are just the Pytube and Typer libraries (published on PyPI as pytube and typer). With that, we are good to go.
Implementation
For the program setup, I find it important to structure the code in a simple and reusable way, so that it is easy to maintain, move around, and read. With that in mind, we start with a config file that holds the basic configuration of our CLI tool.
We define the config file and write the code below into it.
import os
from dataclasses import dataclass
from logging import Logger
from typing import Any, Dict, Type, TypeVar

from config.logger import Logger as CustomLogger

ConfigClass = TypeVar('ConfigClass', bound='Config')

LOG_FILE = "logfile.log"

# Create a new log file for every new experiment
if os.path.exists(LOG_FILE):
    os.remove(LOG_FILE)

setup = {
    "VERSION": 1,
    "logger": CustomLogger(logger_name="Video-Downloader", log_file=LOG_FILE).get_logger(),
}


@dataclass
class Config:
    VERSION: int
    logger: Logger

    @classmethod
    def from_config(cls: Type[ConfigClass], raw_config: Dict[str, Any]) -> ConfigClass:
        return cls(**raw_config)


def configure() -> Config:
    return Config.from_config(setup)
Here we simply declare the data types of our configuration, which in this case is just the version of the application and a custom logger (which can be found here).
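The custom logger itself is not shown in this post. As a rough idea only, a wrapper with the same constructor arguments and `get_logger()` method used in the config could look like the sketch below. The class body, the log format, and the choice of a file handler plus a console handler are my assumptions, not the project's actual code.

```python
import logging


class Logger:
    """Hypothetical stand-in for config/logger.py; not the project's real code."""

    def __init__(self, logger_name: str, log_file: str, level: int = logging.INFO):
        self._logger = logging.getLogger(logger_name)
        self._logger.setLevel(level)
        # Avoid attaching duplicate handlers if the same name is configured twice
        if not self._logger.handlers:
            formatter = logging.Formatter(
                "%(asctime)s | %(name)s | %(levelname)s | %(message)s"
            )
            # Assumption: log to a file and to the console at the same time
            for handler in (logging.FileHandler(log_file), logging.StreamHandler()):
                handler.setFormatter(formatter)
                self._logger.addHandler(handler)

    def get_logger(self) -> logging.Logger:
        return self._logger
```

With something like this in place, `Logger(logger_name="Video-Downloader", log_file="logfile.log").get_logger()` returns a standard `logging.Logger`, which is what the `logger: Logger` field of the config expects.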
Next, we define the driver of our application. This is the main part of the application: it does the actual work based on the parameters it is given. It looks something like the snippet below.
from typing import Optional

from .settings import Config
from sources import download


def run(config: Config, command: str, link: str, output_path: str,
        file_type: str, path: Optional[str] = None) -> None:
    if command == "youtube":
        download(config=config, output_path=output_path,
                 link=link, file_type=file_type, path=path)
    # Here you can add additional video sources should you have them
    else:
        config.logger.error("No valid command selected")
So, we import the config we defined above along with a download function that does the actual download. For now, YouTube is our only video source, but other sources can be added as further branches.
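If the list of sources grows, one option is to replace the if/else chain with a lookup table. The sketch below is only an illustration of that idea: `SOURCES` and `register_source` are hypothetical names that do not exist in the project, and each handler is assumed to accept the same keyword arguments as the download function.

```python
from typing import Callable, Dict, Optional

# Maps a command name to the function that handles it (hypothetical registry)
SourceHandler = Callable[..., None]
SOURCES: Dict[str, SourceHandler] = {}


def register_source(name: str, handler: SourceHandler) -> None:
    """Make a download function available under the given command name."""
    SOURCES[name.lower()] = handler


def run(config, command: str, link: str, output_path: str,
        file_type: str, path: Optional[str] = None) -> bool:
    """Dispatch to the registered handler; return False if none matches."""
    handler = SOURCES.get(command.lower())
    if handler is None:
        config.logger.error("No valid command selected")
        return False
    handler(config=config, output_path=output_path,
            link=link, file_type=file_type, path=path)
    return True
```

Adding a new source then comes down to one `register_source("some-name", some_download_function)` call, and the driver itself stays unchanged.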
The download function imported above looks like this:
from pathlib import Path
from typing import Optional

from pytube import YouTube

from config import Config
from utils import convert_video_to_audio_ffmpeg


def download(config: Config, output_path: str, link: str, file_type: str,
             path: Optional[str] = None) -> None:
    # Fall back to ~/Desktop/downloads when no target folder is given
    path = path or f"{Path.home()}/Desktop/downloads"
    filename = f"{output_path}.{file_type}"
    if file_type == "mp4":
        config.logger.info("Downloading file in mp4 format ...")
        yt = YouTube(f"http://youtube.com/watch?v={link}")
        # Pick the highest-resolution progressive (audio + video) mp4 stream
        stream = (yt.streams.filter(progressive=True, file_extension="mp4")
                  .order_by("resolution").desc().first())
        stream.download(output_path=path, filename=filename)
        config.logger.info("Video download completed")
        config.logger.info("Converting video to audio ...")
        convert_video_to_audio_ffmpeg(path, filename, output_path)
        config.logger.info("Audio conversion completed!")
    elif file_type == "3gpp":
        config.logger.info("Downloading file in 3gpp format")
        # "~" is not expanded automatically, so build the Desktop path explicitly
        YouTube(f"https://youtu.be/{link}").streams.first().download(
            output_path=str(Path.home() / "Desktop"), filename=filename)
    else:
        message = f"Wrong file extension format provided. Provided value: {file_type}"
        config.logger.error(message)
        raise ValueError(message)
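The `convert_video_to_audio_ffmpeg` helper is imported from `utils` but not shown here. A minimal sketch of what such a helper might do is given below. It assumes that ffmpeg is installed and on the PATH, that the arguments are (folder, video file name, audio name without extension) as the call above suggests, and that the target format is mp3. `build_ffmpeg_command` is a hypothetical helper I introduce only to keep the command easy to inspect.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(directory: str, video_name: str, audio_name: str,
                         audio_ext: str = "mp3") -> List[str]:
    """Build the ffmpeg argument list without running it."""
    source = Path(directory) / video_name
    target = Path(directory) / f"{audio_name}.{audio_ext}"
    # -y overwrites an existing target, -i names the input, -vn drops the video stream
    return ["ffmpeg", "-y", "-i", str(source), "-vn", str(target)]


def convert_video_to_audio_ffmpeg(directory: str, video_name: str, audio_name: str) -> Path:
    """Run ffmpeg and return the path of the audio file it was asked to write."""
    command = build_ffmpeg_command(directory, video_name, audio_name)
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(command, check=True, capture_output=True)
    return Path(command[-1])
```

Passing the arguments as a list rather than a single shell string means file names with spaces need no extra quoting.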
Now we need to define the basic commands of our CLI application. We start with the callbacks that validate each option, and then define the options that trigger those callbacks.
import random
import re
import string
from dataclasses import dataclass

import typer


@dataclass
class Constants:
    app = typer.Typer()
    stage_tool_tip = typer.style("DEV or PROD", fg=typer.colors.BRIGHT_GREEN, bold=True, italic=True)
    provider_tool_tip = typer.style("YOUTUBE or INSTAGRAM OR FACEBOOK", fg=typer.colors.BRIGHT_GREEN, bold=True, italic=True)
    file_type_tool_tip = typer.style("3gpp or mp4. Default value is mp4", fg=typer.colors.BRIGHT_GREEN, bold=True, italic=True)
    url_pattern = "^https?:\\/\\/(?:www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b(?:[-a-zA-Z0-9()@:%_\\+.~#?&\\/=]*)$"

    def get_random_string(self, length: int = 10) -> str:
        combination = string.ascii_lowercase + string.ascii_uppercase + string.digits
        return ''.join(random.choice(combination) for _ in range(length))

    def provider_callback(self, value: str) -> str:
        if not value:
            raise typer.BadParameter(f"A provider was not provided. Value should be either {self.provider_tool_tip}!")
        # Normalise the case before validating, so "YOUTUBE" is accepted too
        value = value.lower()
        if value not in ["youtube", "facebook", "instagram"]:
            raise typer.BadParameter(f"Invalid value for the provider. Value should be either of {self.provider_tool_tip}!")
        return value

    def link_callback(self, value: str) -> str:
        if not value:
            raise typer.BadParameter("A link was not provided!")
        if not re.match(self.url_pattern, value):
            raise typer.BadParameter("Invalid url provided")
        # Capture the video ID whether or not other query parameters follow it
        m = re.search(r'[?&]v=([^&#]+)', value)
        if not m:
            raise typer.BadParameter("The wrong url format was provided")
        return m.group(1)

    def file_type_callback(self, value: str) -> str:
        if not value:
            return "mp4"
        if value not in ["3gpp", "mp4"]:
            raise typer.BadParameter(f"Invalid file type provided. Value should be either of {self.file_type_tool_tip}!")
        return value

    def output_callback(self, value: str) -> str:
        if not value:
            return self.get_random_string()
        return value


@dataclass
class Downloader(Constants):
    def get_provider_options(self):
        return typer.Option(
            None,
            "--provider", "-p",
            help="The provider or the video source",
            callback=self.provider_callback
        )

    def get_link_options(self):
        return typer.Option(
            None,
            "--link", "-l",
            help="The link to the video source",
            callback=self.link_callback
        )

    def get_file_type_options(self):
        return typer.Option(
            None,
            "--ext", "-x",
            help="The preferred file extension to be downloaded",
            callback=self.file_type_callback
        )

    def get_output_options(self):
        return typer.Option(
            None,
            "--output", "-o",
            help="The output name of the file to be saved",
            callback=self.output_callback
        )
Within the callbacks, we validate the values passed on the command line, taking care of:
the provider, which in the code sample can be YouTube, Facebook, or Instagram
the link to the video
the file extension
and the name under which the downloaded file will be saved.
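The link callback has to reduce a full URL to the video ID. A regular expression is one way to do that; the standard library's URL parser is another. The sketch below shows the second approach. `extract_video_id` is a hypothetical helper that is not part of the project, and it only covers the `watch?v=` and `youtu.be/` link forms, which is my assumption about the cases that matter here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment
        video_id = parsed.path.strip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Regular watch links carry the ID in the "v" query parameter
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

Inside a callback, a `None` result would be the cue to raise `typer.BadParameter`, just as the callbacks above do for other invalid input.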
And that's it; the application is done.
The application can then be invoked like this:
python3 main.py youtube -l <youtube-video-link> -o awesome-song
Or, even more simply:
python3 main.py youtube -l <youtube-video-link>
Since no output name is given, the last command generates a random name for the file when saving it.
Conclusion
We have explored a very simple way of creating a CLI application that takes a YouTube video URL, downloads the video, converts it to audio, and saves the result.
We have also structured the project so that it is easy to extend: supporting another video source mostly means adding a new download function and a new branch in the driver, with very few other modifications.
The entire project can be found here.