dash-mpd-cli

A commandline application for downloading media content from a DASH MPD file, as used for on-demand replay of TV content and video streaming services like YouTube.

DASH (dynamic adaptive streaming over HTTP), also called MPEG-DASH, is a technology used for media streaming over the web, commonly used for video on demand (VOD) and “replay/catch-up TV” services. The Media Presentation Description (MPD) is an XML document that lists the resources (manifest or “playlist”) forming a streaming service. A DASH client uses the manifest to determine which assets to request in order to perform adaptive streaming of the content. DASH MPD manifests can be used with content using different codecs (including H264, HEVC, AV1, AAC, VP9, MP4A, MP3) and containers (MP4, WebM, Matroska, AVI). There is a good explanation of adaptive bitrate video streaming at howvideo.works.

This commandline application allows you to download content (audio or video) described by an MPD manifest. This involves selecting the alternative with the most appropriate encoding (in terms of bitrate, codec, etc.), fetching segments of the content using HTTP or HTTPS requests and muxing audio and video segments together. There is also support for downloading subtitles (mostly WebVTT, TTML, SRT, tx3g and SMIL formats, with some support for wvtt format).

It runs on most common platforms, including Linux, Microsoft Windows and MacOS.

This application builds on the dash-mpd crate.

Features

The following features are supported:

  • Multi-period content. The media in the different streams will be saved in a single media container if the formats are compatible (same resolution, codecs, bitrate and so on) and the --no-period-concatenation commandline option is not provided, and otherwise in separate media containers.

  • The application can download content available over HTTP, HTTPS and HTTP/2. Network bandwidth can be throttled (see the --limit-rate commandline argument).

  • Support for SOCKS and HTTP proxies, via the --proxy commandline argument. The following environment variables can also be used to specify the proxy at a system level: HTTP_PROXY or http_proxy for HTTP connections, HTTPS_PROXY or https_proxy for HTTPS connections, and ALL_PROXY or all_proxy for all connection types. The system proxy can be disabled using the --no-proxy commandline argument.

  • Support for HTTP Basic authentication (see the --auth-username and --auth-password commandline arguments) and for Bearer authentication (see the --auth-bearer commandline argument). This authentication information is sent both to the server which hosts the DASH manifest, and to the server that hosts the media segments (the latter often being a CDN).

  • Subtitles: download support for WebVTT, TTML, SRT, tx3g and SMIL streams, as well as some support for the wvtt format. We support both subtitles published as a complete file and segmented subtitles made available in media fragments.

  • The application can read cookies from the Firefox, Chromium, Chrome, ChromeBeta, Safari and Edge browsers on Linux, Windows and MacOS, thanks to the bench_scraper crate. See the --cookies-from-browser commandline argument. Browsers that support multiple profiles will have all their profiles scraped for cookies.

  • XLink elements (only with actuate=onLoad semantics), including resolve-to-zero.

  • All forms of segment index info: SegmentBase@indexRange, SegmentTimeline, SegmentTemplate@duration, SegmentTemplate@index, SegmentList.

  • Media containers of types supported by mkvmerge, ffmpeg, VLC or MP4Box (this includes ISO-BMFF / CMAF / MP4, Matroska, WebM, MPEG-2 TS, AVI), and all the codecs supported by these applications.

  • Support for decrypting media streams that use ContentProtection (DRM). This requires either the mp4decrypt or shaka-packager commandline application to be installed. mp4decrypt is available from the Bento4 suite (binaries are available for common platforms), and shaka-packager binaries are available from Google for common platforms (see the Releases section on their GitHub page). See the --key commandline argument to specify a decryption key (can be used several times if different keys are used for different media streams). See the --decryption-application commandline argument to specify which decryption application to use. Shaka packager is able to decrypt more types of media streams (including in particular WebM containers and more encryption formats), whereas mp4decrypt mostly works with MPEG Common Encryption.

  • In practice, all features used by real streaming services and on-demand TV. Our test suite includes test streams published by industry groups such as HbbTV and the DASH Industry Forum, and comprises a wide variety of DASH streams using different publishing software, including GPAC (used by Netflix and other services), Amazon MediaTailor, Google’s Shaka packager, Microsoft’s Azure Media Services, and Unified Streaming. Test content is served by different CDNs including Akamai and various telecom providers.

  • dash-mpd-cli is written in the Rust programming language, meaning that it’s high performance and protected from a variety of vulnerabilities that can affect more traditional software.

The following are not supported:

  • Live streams (dynamic MPD manifests), as used for live streaming/OTT TV, are not really supported. This is because we don’t implement the clock-related throttling that is needed to only download media segments when they become available. However, some media sources publish “pseudo-live” streams where all media segments are in fact available; they simply don’t update the manifest once the live broadcast is complete. We are able to download these streams using the --enable-live-streams commandline argument. You might also have some success with a live stream in combination with the --sleep-requests commandline argument. The VLC application is a better choice for watching live streams.

  • XLink elements with actuate=onRequest semantics.

  • HLS streaming (m3u8 manifests).

Licence

dash-mpd-cli is free software distributed according to the terms of the MIT licence.

Usage

This is a commandline application, meaning it runs in a terminal (there is no graphical user interface).

Quickstart

To download from a manifest to a file called MyVideo.mp4:

dash-mpd-cli -v --quality best https://example.com/manifest.mpd -o MyVideo.mp4

To download including Finnish subtitles (which should be written to a file named MyVideo.srt or MyVideo.vtt, depending on the type of subtitles):

dash-mpd-cli -v --quality best --prefer-language fi --write-subs https://example.com/manifest.mpd -o MyVideo.mp4

To know what subtitles and subtitle languages are available, first run (this does not download any content):

dash-mpd-cli -v -v --simulate https://example.com/manifest.mpd

To save the output to a Matroska container using mkvmerge as a muxer:

dash-mpd-cli --muxer-preference mkv:mkvmerge https://example.com/manifest.mpd -o MyVideo.mkv

To decrypt DRM on the media streams (assuming there are different keys for the audio and the video streams):

dash-mpd-cli --key "43215678123412341234123412341237:12341234123412341234123412341237" \
  --key 43215678123412341234123412341236:12341234123412341234123412341236 \
   https://example.com/manifest.mpd -o MyVideo.mp4
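
Because a malformed key is easy to miss in a long commandline, it can help to check the KID:KEY format described for the --key option before starting a download. The following is a minimal sketch in POSIX shell; the is_valid_key helper is hypothetical (it is not part of dash-mpd-cli), and accepting uppercase hexadecimal digits is an assumption.

```shell
# Hypothetical helper, not part of dash-mpd-cli.
# Succeeds if $1 looks like KID:KEY, where KID is a decimal track id or
# 32 hexadecimal characters, and KEY is 32 hexadecimal characters.
is_valid_key() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]+|[0-9A-Fa-f]{32}):[0-9A-Fa-f]{32}$'
}

for k in \
    "1:100b6c20940f779a4589152b57d2dacb" \
    "eb676abbcb345e96bbcf616630f1a3da:100b6c20940f779a4589152b57d2dacb" \
    "not-a-key"
do
    if is_valid_key "$k"; then
        echo "ok:  $k"
    else
        echo "bad: $k"
    fi
done
```

This only checks the shape of the argument; it cannot tell whether the key actually decrypts the stream.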

To use ffmpeg that is installed in a non-standard location which is not in your PATH:

dash-mpd-cli --ffmpeg-location e:/ffmpeg/ffmpeg.exe https://example.com/manifest.mpd -o MyVideo.mp4

To send necessary cookies to the web server from Firefox (where you have logged in to the private website):

dash-mpd-cli --cookies-from-browser Firefox https://example.com/manifest.mpd -o MyVideo.mp4
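
To attempt a download from a “pseudo-live” stream, the --enable-live-streams, --force-duration and --sleep-requests options can be combined. The sketch below wraps them in a small shell function; the capture_live name, the DASH_MPD_CLI override variable and the one-second sleep are illustrative assumptions, not something the application provides.

```shell
# Hypothetical helper: capture the first N seconds of a pseudo-live stream.
# DASH_MPD_CLI is an illustrative override (set it to "echo" for a dry run
# that prints the arguments instead of downloading).
capture_live() {
    url=$1 seconds=$2 output=$3
    "${DASH_MPD_CLI:-dash-mpd-cli}" --enable-live-streams \
        --force-duration "$seconds" --sleep-requests 1 \
        "$url" -o "$output"
}

# Example (performs a real download):
#   capture_live https://example.com/live.mpd 30 live.mp4
```

As noted in the list of unsupported features, this is unlikely to work well on a genuinely live stream.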

If you want to interrupt a download, type Ctrl-C (this works at least on Linux, Windows, MacOS and termux on Android).

Commandline options

Usage: dash-mpd-cli [OPTIONS] MPD-URL

Options:

-U, --user-agent <user-agent>

The value of the user-agent header in HTTP requests. The default is dash-mpd-cli/<version>. If you want to look more like traffic from a web browser, choose a user agent in current use.

--proxy <URL>

The URL of a Socks or HTTP proxy (e.g. https://example.net/ or socks5://example.net/) to use for all network requests.

--no-proxy

Disable use of Socks or HTTP proxy even if the related environment variables are set.

--auth-username <USER>

Username to use for authentication with the server(s) hosting the DASH manifest and the media segments (only relevant for HTTP Basic authentication).

--auth-password <PASSWORD>

Password to use for authentication with the server(s) hosting the DASH manifest and the media segments (only relevant for HTTP Basic authentication).

--auth-bearer <TOKEN>

Token to use for authentication with the server(s) hosting the DASH manifest and the media segments, when HTTP Bearer authentication is required.

--timeout <SECONDS>

Timeout for each network request (from the start to the end of the request), in seconds.

--sleep-requests <SECONDS>

Number of seconds to sleep between network requests (default 0).

--enable-live-streams

Attempt to download from a live media stream (dynamic MPD manifest). Downloading from a genuinely live stream won’t work well, because we don’t implement the clock-related throttling needed to only download media segments when they become available. However, some media sources publish pseudo-live streams where all media segments are in fact available, which we will be able to download. You might also have some success in combination with the --sleep-requests argument.

--force-duration <SECONDS>

Specify a number of seconds (possibly floating point) to download from the media stream. This may be necessary to download from a live stream, where the duration is often not specified in the DASH manifest. It may also be used to download only the first part of a static stream.

-r, --limit-rate <RATE>

Maximum network bandwidth in octets per second (default no limit). For example, 200K, 1M.

--max-error-count <COUNT>

Maximum number of non-transient network errors that should be ignored before a download is aborted (default is 10).

--source-address <source-address>

Source IP address to use for network requests, either IPv4 or IPv6. Network requests will be made using the version of this IP address (e.g. using an IPv6 source-address will select IPv6 network traffic).

--add-root-certificate <CERT>

Add a root certificate (in PEM format) to be used when verifying TLS network connections. This option can be used multiple times.

--client-identity-certificate <CERT>

Client private key and certificate (in PEM format) to be used when authenticating TLS network connections.

--prefer-video-width <WIDTH>

When multiple video streams are available, choose that with horizontal resolution closest to WIDTH.

--prefer-video-height <HEIGHT>

When multiple video streams are available, choose that with vertical resolution closest to HEIGHT.

--quality <quality>

Prefer best quality (and highest bandwidth) representation, or lowest quality. Possible values: best, intermediate, worst.

--prefer-language <LANG>

Preferred language when multiple audio streams with different languages are available. Must be in RFC 5646 format (e.g. fr or en-AU). If a preference is not specified and multiple audio streams are present, the first one listed in the DASH manifest will be downloaded.

--drop-elements <XPATH>

XML elements that match this XPath expression will be removed from the MPD manifest before the download starts. See examples in the user manual. You can use this option multiple times. This option is currently experimental.

The functionality is currently implemented using the external xsltproc commandline application, which implements version 1.0 of the XPath specification.

--xslt-stylesheet <STYLESHEET>

XSLT stylesheet with rewrite rules to be applied to the manifest before downloading media content. You can use this option multiple times. This option is currently experimental.

Stylesheets are applied using the xsltproc commandline application, which implements version 1.0 of the XSLT specification.

--video-only

If media stream has separate audio and video streams, only download the video stream.

--audio-only

If media stream has separate audio and video streams, only download the audio stream.

--simulate

Download the manifest and print diagnostic information, but do not download audio, video or subtitle content, and write nothing to disk.

--write-subs

Download and save subtitle file, if subtitles are available.

--keep-video <VIDEO-PATH>

Keep video stream in file specified by VIDEO-PATH.

--keep-audio <AUDIO-PATH>

Keep audio stream (if audio is available as a separate media stream) in file specified by AUDIO-PATH.

--no-period-concatenation

Never attempt to concatenate media from different Periods. If multiple periods are present, one output file per Period will be saved, with names derived from the requested output filename (adding -p2 for the second period, -p3 for the third period, and so on).

--muxer-preference <CONTAINER:ORDERING>

When muxing into CONTAINER, try muxing applications in order ORDERING. You can use this option multiple times. Examples: mp4:mp4box,vlc and avi:ffmpeg.

--key <KID:KEY>

Use KID:KEY to decrypt encrypted media streams. KID should be either a track id in decimal (e.g. 1), or a 128-bit keyid (32 hexadecimal characters). KEY should be 32 hexadecimal characters. Example: --key eb676abbcb345e96bbcf616630f1a3da:100b6c20940f779a4589152b57d2dacb. You can use this option multiple times.

Please note that obtaining decryption keys is beyond the scope of this application.

--decryption-application <APP>

Application to use to decrypt encrypted media streams (either mp4decrypt or shaka).

--save-fragments <FRAGMENTS-DIR>

Save media fragments to this directory (will be created if it does not exist).

--ignore-content-type

Don’t check the content-type of media fragments (may be required for some poorly configured servers).

--add-header <NAME:VALUE>

Add a custom HTTP header and its value, separated by a colon ‘:’. You can use this option multiple times.

-H, --header <HEADER>

Add a custom HTTP header, in cURL-compatible format. You can use this option multiple times. Example: -H 'X-Custom: ized'.

--referer <URL>

Specify the content of the Referer HTTP header.

-q, --quiet

Disable printing of diagnostics information to the terminal.

-v, --verbose

Level of verbosity (can be used several times).

--no-progress

Disable the progress bar.

--no-xattr

Don’t record metainformation as extended attributes in the output file.

--no-version-check

Disable the check for availability of a more recent version on startup.

--ffmpeg-location <PATH>

Path to the ffmpeg binary (necessary if not located in your PATH).

--vlc-location <PATH>

Path to the VLC binary (necessary if not located in your PATH).

--mkvmerge-location <PATH>

Path to the mkvmerge binary (necessary if not located in your PATH).

--mp4box-location <PATH>

Path to the MP4Box binary (necessary if not located in your PATH).

--mp4decrypt-location <PATH>

Path to the mp4decrypt binary (necessary if not located in your PATH).

--shaka-packager-location <PATH>

Path to the shaka-packager binary (necessary if not located in your PATH).

-o, --output <PATH>

Save media content to this file.

--cookies-from-browser <BROWSER>

Load cookies from BROWSER (possible values, depending on your operating system, include Firefox, Chrome, ChromeBeta, Chromium).

--list-cookie-sources

Show valid values for the BROWSER argument to --cookies-from-browser on this computer, then exit.

-h, --help

Print help (see a summary with -h)

-V, --version

Print version and exit.

Relevant environment variables

You can set certain environment variables to modify the behaviour of the application:

  • The semi-standardized HTTP_PROXY and http_proxy environment variables allow you to specify a proxy to be used for HTTP connections, in the format http://proxy.my.com:8080. The HTTPS_PROXY and https_proxy variables operate likewise for HTTPS connections, and ALL_PROXY or all_proxy are used for both HTTP and HTTPS. The NO_PROXY or no_proxy environment variable allows you to specify IP addresses or domains that should not be proxied, in a format like NO_PROXY=google.com, 192.168.1.0/24. See the reqwest docs for the full details.

  • On Linux and MacOS, the TMPDIR environment variable will determine where temporary files used while downloading are saved. These temporary files should be cleaned up by the application, unless you interrupt execution using Ctrl-C.

  • On Microsoft Windows, the TMP and TEMP environment variables will determine where temporary files are saved (see the documentation of the GetTempPathA function in the Win32 API, or the Rust documentation for std::env::temp_dir).

  • The RUST_LOG environment variable can be used to obtain extra debugging logging (see the documentation for the env_logger crate). The tracing-subscriber crate is used to collect and display logs.

    For example, you can ask for voluminous logging using

RUST_LOG=trace dash-mpd-cli -o foo.mp4 https://example.com/manifest.mpd

or voluminous logging only from the dash-mpd crate (excluding details regarding the network connections) with

RUST_LOG=dash_mpd=trace dash-mpd-cli -o foo.mp4 https://example.com/manifest.mpd

If you are running in a container:

podman run --env RUST_LOG=trace \
   -v .:/content \
   ghcr.io/emarsden/dash-mpd-cli \
   https://example.com/manifest.mpd -o foo.mp4
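
The proxy variables described in the list above can be set for a whole shell session, as in the sketch below. The proxy address and the exclusion list are the placeholder values used in this section, not real servers.

```shell
# Proxy for HTTPS connections (placeholder address in the documented format).
export HTTPS_PROXY="http://proxy.my.com:8080"
# Hosts and networks that should bypass the proxy.
export NO_PROXY="google.com, 192.168.1.0/24"

# Later invocations in this shell pick up these settings, for example:
#   dash-mpd-cli https://example.com/manifest.mpd -o foo.mp4
# Pass --no-proxy to ignore them for a single run.
```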

Recording metadata

If your filesystem supports extended attributes, the application will save the following metainformation in the output file:

  • user.xdg.origin.url: the URL of the MPD manifest
  • user.dublincore.title: the title, if specified in the manifest metainformation
  • user.dublincore.source: the source, if specified in the manifest metainformation
  • user.dublincore.rights: copyright information, if specified in the manifest metainformation

You can examine these attributes using xattr -l (you may need to install your distribution’s xattr package). Disable this feature using the --no-xattr commandline argument.

Run safely sandboxed in a Docker container

The application, together with the external helper applications that it uses for muxing media streams, for extracting/converting subtitle streams, and for decrypting content infected with DRM, is available as a prebuilt container, which is probably the easiest and safest way to run it. The container can be run on any host that can run Linux/AMD64 containers (using Podman or Docker on Linux, Microsoft Windows and MacOS, possibly your NAS device). It’s available in the GitHub Container Registry ghcr.io and is automatically built from the sources using GitHub’s useful continuous integration services.

It’s packaged as a multiarch container using the lightweight Alpine Linux distribution. The following helper applications are included in the container:

  • ffmpeg from Alpine Linux

  • mkvmerge from Alpine Linux

  • mp4decrypt from the Bento4 suite, from Alpine Linux

  • MP4Box from the GPAC suite, compiled from source

  • xsltproc from the libxslt package, from Alpine Linux

  • Shaka packager, from Google’s Docker image or from GitHub, or built from source on certain platforms

The container is currently available for the following platforms:

  • linux/amd64
  • linux/arm64/v8
  • linux/arm/v7
  • linux/riscv64

Advantages of running in a container

Why run the application in a container, instead of natively on your machine?

  • Good internet hygiene. It’s much safer, because the container is sandboxed: it can’t modify your host machine, except for writing downloaded media to the directory you specify. This is a very good idea when running random software you downloaded from the internet!

  • No need to install the various helper applications (ffmpeg, mkvmerge, mp4decrypt, MP4Box), which are already present in the container.

  • Automatically run the latest version of dash-mpd-cli and the various helper applications (the container runtime can pull the latest version for you automatically).

  • Podman and Docker also allow you to set various limits on the resources allocated to the container (number of CPUs, memory); see their respective documentation.

Unlike running software in a virtual machine, there is only a negligible performance penalty to running in a container on Linux. There are two exceptions: if you’re running the container on an aarch64 (“Apple Silicon”) Mac, Podman will set up a virtual machine for you, and on Windows, Podman will set up a low-overhead WSL2 virtual machine for you.

Tip

I recommend installing Podman because it’s fully free software, whereas Docker is partly commercial. Podman is also able to run containers “rootless”, without special privileges, which is good for security, and doesn’t require a background daemon. Podman has a docker-compatible commandline interface.

Running the container

If you’re running on Microsoft Windows or MacOS, you will need to start the virtual machine that’s used to run the container:

Start up the container runtime (only Windows/MacOS)

podman machine start

(Replace podman by docker if you prefer that option.)

This step is not necessary on Linux.

You can then fetch the container image (currently around 220 MB) from the GitHub container registry ghcr.io and save it to your local disk for later use:

Fetch the container image

podman pull ghcr.io/emarsden/dash-mpd-cli

Then to download some content from an MPD manifest:

Run dash-mpd-cli in the container

podman run --rm --tty -v .:/content ghcr.io/emarsden/dash-mpd-cli https://example.com/manifest.mpd

This should save the media to a file named something like example.com_manifest.mp4 💪 (you can change this name by adding -o foo.mp4). It will remove the container (but not the downloaded image) from your local storage once the download is finished (--rm) and will use a terminal to show a progress bar (--tty).

If you want your local copy of the container image to be updated if a newer one is available from the registry, add --pull=newer:

podman run --rm --tty --pull=newer \
  -v .:/content \
  ghcr.io/emarsden/dash-mpd-cli \
  -v <MPD-URL> -o foo.mp4

You can later delete the image from your local disk if you no longer need it, using podman image rm with the image id shown by podman images, as illustrated below:

Delete the container image from your local disk

% podman images
REPOSITORY                       TAG         IMAGE ID      CREATED         SIZE
ghcr.io/emarsden/dash-mpd-cli    latest      ae6971bf21ae  4 days ago      216 MB
...
% podman image rm ae6971bf21ae

Mounting a directory into the container

By default, your local disk is neither readable nor writable by the application running in the container (this is a major security advantage!). Since you want to write the downloaded media onto your local disk, you need to mount (bind) a directory into the container, using podman’s -v commandline option.

In the commandline shown above, your current working directory (.) will be mounted in the container as /content, which is always the working directory in the container. This means that an output file specified without a full path, such as foo.mp4, will be saved to your current working directory on the host machine. If you specify a full path for the output file, for example -o /tmp/foo.mp4, this will output to the temporary directory in the container, which you won’t have access to once the download has finished.

This sandboxing restriction also applies to any files you need to pass into the container, such as an XSLT stylesheet for rewriting the manifest. If you’re running podman from your Videos directory, a stylesheet has to be in Videos or a subdirectory, or the container won’t be able to see it. Therefore, you should provide a relative name rather than an absolute name to the container. If the stylesheet is in the rewrites directory, for example:

podman run --rm --tty --pull=newer \
  -v .:/content \
  ghcr.io/emarsden/dash-mpd-cli \
  --xslt-stylesheet rewrites/my-rewrites.xslt \
  -v <MPD-URL> -o foo.mp4
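
The path rule described above can be sketched as a small shell function that maps a path relative to the mounted directory to its location inside the container. The container_path helper is hypothetical and only illustrates the reasoning; it assumes the current directory is mounted at /content as in the commands above, and it conservatively rejects any path containing a “..” component.

```shell
# Hypothetical helper: where does a host path end up inside the container?
# Absolute paths and paths that use ".." are rejected, since they may fall
# outside the directory mounted at /content.
container_path() {
    case "$1" in
        /*|..|../*|*/../*|*/..)
            echo "outside the mounted directory: $1" >&2
            return 1 ;;
        *)
            printf '/content/%s\n' "$1" ;;
    esac
}

container_path rewrites/my-rewrites.xslt   # prints /content/rewrites/my-rewrites.xslt
```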

Increased security with gVisor

On Linux/AMD64, it’s also possible to run the container using the gVisor container runtime runsc, which uses specially-designed sandboxing techniques to improve security (strong isolation, protection against privilege escalation). This requires installation of runsc and running as root (runsc doesn’t currently support rootless operation).

sudo apt install runsc
sudo podman --runtime=runsc run --rm --tty -v .:/content ghcr.io/emarsden/dash-mpd-cli -v <MPD-URL> -o foo.mp4

Installation

The recommended way of running dash-mpd-cli is sandboxed in a Podman container. If you prefer to install the software and its dependencies on your computer in the traditional way, you can download a prebuilt binary or build from source yourself.

Binary releases are available on GitHub for GNU/Linux on AMD64 (statically linked against Musl Libc to avoid glibc versioning problems), Microsoft Windows on AMD64 and MacOS on aarch64 (“Apple Silicon”) and AMD64. These are built automatically on the GitHub continuous integration infrastructure.

You can also build from source using an installed Rust development environment:

cargo install dash-mpd-cli

This installs the binary to your installation root’s bin directory, which is typically $HOME/.cargo/bin.

External dependencies

You should also install the following dependencies:

  • the mkvmerge commandline utility from the MkvToolnix suite, if you download to the Matroska container format (.mkv filename extension). mkvmerge is used as a subprocess for muxing (combining) audio and video streams. See the --mkvmerge-location commandline argument if it’s not installed in a standard location (not on your PATH).

  • ffmpeg for muxing audio and video streams and for concatenating streams from a multi-period manifest. The ffprobe binary (distributed with ffmpeg) is required alongside the ffmpeg binary. See the --ffmpeg-location commandline argument if this is installed in a non-standard location.

  • vlc as an alternative application for muxing audio and video streams (sometimes VLC is able to mux certain streams that ffmpeg doesn’t support). See the --vlc-location commandline argument if this is installed in a non-standard location. Also see the --muxer-preference commandline argument to specify which muxing application to prefer for different container types.

  • the MP4Box commandline utility from the GPAC project, to help with subtitles in wvtt format. If it’s installed, MP4Box will be used to convert the wvtt stream to the more widely recognized SRT format. MP4Box can also be used for muxing audio and video streams to an MP4 container, as a fallback if ffmpeg and vlc are not available. See the --mp4box-location commandline argument if this is installed in a non-standard location.

  • the mp4decrypt commandline application from the Bento4 suite, if you need to fetch encrypted content. Binaries are available for common platforms. See the --mp4decrypt-location commandline argument if this is installed in a non-standard location.

  • for some types of streams that the mp4decrypt application is not able to decrypt (for example content in WebM containers), you should install the Shaka packager application developed by Google. See the --decryption-application commandline option to specify the choice of decryption application, and the --shaka-packager-location commandline argument if it is installed in a non-standard location.

  • the xsltproc commandline utility packaged with libxslt, which is used for the MPD rewriting functionality (the --drop-elements and --xslt-stylesheet commandline options).
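
A quick way to see which of the helper applications listed above are already on your PATH is sketched below. The check_helpers function is a hypothetical convenience, and the binary names are assumptions based on the names used in this document (the Shaka packager binary in particular may be installed under a different name on your system).

```shell
# Hypothetical helper: report which of the named commands are on the PATH.
# Returns 0 if all were found, 1 otherwise.
check_helpers() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_helpers ffmpeg ffprobe mkvmerge vlc MP4Box mp4decrypt shaka-packager xsltproc \
    || echo "Some helpers are missing; install them or use the --*-location options."
```

Not every helper is needed for every download, so a missing entry is only a problem if you use the corresponding feature.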

Supported platforms

This crate is tested on the following platforms:

  • Our container images are tested using Podman on Linux and Windows

  • Linux on AMD64 (x86-64) and Aarch64 architectures

  • MacOS on AMD64 and Aarch64 architectures

  • Microsoft Windows 10 and Windows 11 on AMD64

  • Android 12 on Aarch64 via termux (you’ll need to install the rust, binutils and ffmpeg packages, and optionally the mkvtoolnix, vlc and gpac packages). You’ll need to disable the cookies feature by building with --no-default-features.

  • FreeBSD/AMD64 and OpenBSD/AMD64. You’ll need to disable the cookies feature. Some of the external applications we depend on (e.g. mp4decrypt, Shaka packager) are poorly supported on OpenBSD.

It should also work on more obscure platforms, such as ppc64le and RISC-V, as long as you can install a recent Rust toolchain and have ffmpeg working.

Rewriting the MPD manifest

For advanced users, there is some experimental support for rewriting the MPD manifest before downloading media segments from it. This allows you to:

  • print additional diagnostics concerning the manifest, which aren’t printed by dash-mpd-cli even with a high verbose level.

  • delete some Periods that the user is not interested in (based for example on their duration, or the origin of the media segments). This can be used to remove advertising segments inserted using dynamic ad insertion (DAI) or server-side ad insertion (SSAI) techniques.

  • delete from the manifest Representations that use undesired codecs. This is a way of making the choice of representation fall back to another Representation, which presumably uses an acceptable codec.

  • delete audio Representations whose language the user is not interested in (though in this case dash-mpd-cli has a builtin mechanism with --prefer-language to select the desired Representation).

  • delete all subtitle languages and formats except for the one the user is interested in (again, as a complement to the builtin --prefer-language functionality).

  • drop an audio AdaptationSet if the user only is interested in video (though this functionality is already builtin with --video-only).

  • modify the BaseURL to include another CDN.

You can use this functionality by:

  • supplying an XPath expression matching XML elements that you don’t want to download, with the --drop-elements commandline option;

  • supplying a full XSLT stylesheet that will be applied to the manifest to allow more complex rewriting rules. XSLT is a language that is specifically designed for XML filtering/rewriting; it’s standards-based though not particularly intuitive.

This functionality is currently implemented by calling out to the xsltproc commandline application, which supports XPath v1.0 and XSLT v1.0.

Examples of XPath filtering

Drop AdaptationSets with alternate audio

Suppose your DASH manifest contains an audio track with audio description, which has an attribute label=alternate. This track is being selected for download instead of the one you want. You can filter out the audio description track using the --drop-elements commandline argument with the following XPath expression:

--drop-elements "//mpd:AdaptationSet[@label='alternate']"

If instead the audio track is marked with a Label element that contains the text audiodescr, you can use the following:

--drop-elements "//mpd:AdaptationSet[.//mpd:Label[contains(text(), 'audiodescr')]]"

If instead the AdaptationSet you want to ignore is identified by a Role child element, something like

<Role schemeIdUri="urn:mpeg:dash:role:2011" value="alternate"/>

you can use the following

--drop-elements "//mpd:AdaptationSet[.//mpd:Role[@value='alternate']]"

Another possible application of the XML filtering capability is to avoid overloading the web servers of companies that serve dynamic ad insertion (DAI) content, as illustrated below.

Drop dynamically inserted advertising content

Some DASH manifests include content (generally some Period elements) which is inserted based on your prior viewing habits, the time of day, your geographic location, and so on. You may wish to filter these out based on the URL of the server, with for example

--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'https://dai.google.com')]]"
--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'mediatailor.eu-west-1.amazonaws.com')]]"
--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'unified-streaming.com')]]"

or based on other features such as the Period duration or keywords in the BaseURL:

--drop-elements "//mpd:Period[@duration='PT5.000S']"
--drop-elements "//mpd:Period[.//mpd:BaseURL[contains(text(),'/creative/')]]"
--drop-elements "//mpd:Period[.//mpd:BaseURL[contains(text(),'Ad_Bumper')]]"
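Note that these examples use two different forms of predicate: mpd:BaseURL only tests BaseURL elements that are direct children of the Period, whereas .//mpd:BaseURL also tests BaseURL elements nested deeper (for instance inside an AdaptationSet). The sketch below illustrates the difference using Python's standard library. It only emulates the matching logic, and the manifest, hostnames and id values are invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical manifest: hostnames and ids are invented.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="content">
    <BaseURL>https://cdn.example.com/show/</BaseURL>
  </Period>
  <Period id="ad-direct">
    <BaseURL>https://ads.example.net/creative/123/</BaseURL>
  </Period>
  <Period id="ad-nested">
    <AdaptationSet mimeType="video/mp4">
      <BaseURL>https://ads.example.net/creative/456/</BaseURL>
    </AdaptationSet>
  </Period>
</MPD>
"""


def periods_matching(root, needle, descendants):
    """Return ids of Periods with a BaseURL whose text contains `needle`.

    descendants=False mimics  //mpd:Period[mpd:BaseURL[contains(text(), ...)]]
    descendants=True  mimics  //mpd:Period[.//mpd:BaseURL[contains(text(), ...)]]
    """
    path = ".//mpd:BaseURL" if descendants else "mpd:BaseURL"
    matches = []
    for period in root.iterfind("mpd:Period", NS):
        if any(needle in (url.text or "") for url in period.iterfind(path, NS)):
            matches.append(period.get("id"))
    return matches


root = ET.fromstring(SAMPLE_MPD)
direct_only = periods_matching(root, "/creative/", descendants=False)
any_depth = periods_matching(root, "/creative/", descendants=True)
print("direct child BaseURL:", direct_only)
print("BaseURL at any depth:", any_depth)
```

If a filter does not seem to drop the Periods you expect, checking where the BaseURL element sits in the manifest is a reasonable first step.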

Examples of XSLT rewriting

The XSLT file (stylesheet) shown below will drop any AdaptationSets in the MPD manifest with a @mimeType matching audio/* (leaving only the AdaptationSets containing video).

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:mpd="urn:mpeg:dash:schema:mpd:2011">
  <xsl:output method="xml" indent="yes"/>

  <!-- Default action (unless a template below matches): copy -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!--
      Drop any audio/* AdaptationSets, leaving only the AdaptationSets with mimeType of video/mp4.
      
      Note that in principle we should be able to write the XPath expression we are matching on in
      this simpler form

         "//mpd:AdaptationSet[starts-with(@mimeType, 'audio/')]"

      but unfortunately this manifest is using an incorrectly capitalized xmlns declaration
      xmlns="urn:mpeg:DASH:schema:MPD:2011". XML is case sensitive; this namespace is not the same
      as the one specified in the DASH specifications (this error is infrequent but is present in
      the wild).
  -->
  <xsl:template match="//node()[local-name()='AdaptationSet' and starts-with(@mimeType,'audio/')]" />
</xsl:stylesheet>

Note that the rewriting instruction

  <!-- Default action (unless a template below matches): copy -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

acts as a default action that will copy verbatim to the output any attributes and nodes (elements, text, comments) that aren’t matched by another template in the stylesheet.

The rewriting instruction

<xsl:template match="//node()[local-name()='AdaptationSet' and starts-with(@mimeType,'audio/')]" />

selects (using the XPath expression defined in the template’s @match attribute) all AdaptationSet nodes whose @mimeType attribute starts with audio/. The template body is empty, so no output is produced for these elements, which means that they are not copied to the XML output.
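The reason for matching on local-name() rather than on mpd:AdaptationSet can be illustrated outside of XSLT. The sketch below, using only Python's standard library, parses a small invented manifest that carries the incorrectly capitalized namespace. It shows that a namespace-qualified lookup finds nothing, while a comparison on the local name still finds the elements. The manifest and its id values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI as written in the DASH specification.
CORRECT_NS = "urn:mpeg:dash:schema:mpd:2011"

# Hypothetical manifest using the incorrectly capitalized namespace.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:DASH:schema:MPD:2011">
  <Period>
    <AdaptationSet id="v" mimeType="video/mp4"/>
    <AdaptationSet id="a" mimeType="audio/mp4"/>
  </Period>
</MPD>
"""


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


root = ET.fromstring(SAMPLE_MPD)

# Namespace-qualified lookup: namespace URIs are compared case-sensitively,
# so nothing matches.
qualified = root.findall(".//mpd:AdaptationSet", {"mpd": CORRECT_NS})

# Namespace-agnostic lookup, analogous to local-name()='AdaptationSet'.
by_local_name = [el for el in root.iter() if local_name(el) == "AdaptationSet"]

# Analogous to the additional test starts-with(@mimeType, 'audio/').
audio_ids = [el.get("id") for el in by_local_name
             if el.get("mimeType", "").startswith("audio/")]

print("qualified matches:", len(qualified))
print("local-name matches:", len(by_local_name))
print("audio AdaptationSets:", audio_ids)
```

Matching on the local name is more tolerant of malformed manifests, at the cost of also matching same-named elements from unrelated namespaces.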

To apply an XSLT stylesheet, see the --xslt-stylesheet commandline argument. There are a few example stylesheets in the tests/fixtures directory.

To download content with a rewritten manifest (here running dash-mpd-cli in a container):

podman run --rm -v .:/content \
  ghcr.io/emarsden/dash-mpd-cli \
  --xslt-stylesheet my-rewrites.xslt \
  https://example.com/manifest.mpd -o foo.mp4

Future plans

Our current implementation of filtering using xsltproc is quite powerful and easy to install, but probably not the easiest to use. Possible alternatives which we might move to in a future version of dash-mpd-cli:

  • Saxon-HE, free Java software (MPL v2) which implements XPath v3.1 and XSLT v3.0

  • A generic filter interface implemented as a pipe

  • A command API that takes two filename arguments

  • A WebAssembly-based interface that could be implemented in any programming language that can generate WASM bytecode.

Why?

The dash-mpd-rs library at the core of this application was developed to allow the author to watch a news programme produced by a public media broadcaster whilst at the gym. The programme is published as a DASH stream on the broadcaster’s “replay” service, but network service at the gym is sometimes poor. First world problems!

Caution

The author is not the morality police nor a lawyer, but please note that redistributing media content that you have not produced may, depending on the publication licence, be a breach of intellectual property laws. Also, circumventing DRM may be prohibited in some countries.

Your feedback

Bug reports should be filed as issues on our GitHub project page.

Pull requests are also welcome!