Rewriting the MPD manifest
For advanced users, there is some experimental support for rewriting the MPD manifest before downloading media segments from it. This allows you to:
- print additional diagnostics concerning the manifest, which aren’t printed by dash-mpd-cli even with a high verbose level.
- delete some Periods that the user is not interested in (based for example on their duration, or the origin of the media segments). This can be used to remove advertising segments inserted using dynamic ad insertion (DAI) or server-side ad insertion (SSAI) techniques.
- delete from the manifest Representations that use undesired codecs. This is a way of making the choice of representation fall back to another Representation, which presumably uses an acceptable codec.
- delete audio Representations whose language the user is not interested in (though in this case dash-mpd-cli has a builtin mechanism with --prefer-language to select the desired Representation).
- delete all subtitle languages and formats except for the one the user is interested in (again, as a complement to the builtin --prefer-language functionality).
- drop an audio AdaptationSet if the user is only interested in video (though this functionality is already builtin with --video-only).
- modify the BaseURL to include another CDN.
You can use this functionality by:
- supplying an XPath expression matching XML elements that you don’t want to download, with the --drop-elements commandline option;
- supplying a full XSLT stylesheet that will be applied to the manifest to allow more complex rewriting rules. XSLT is a language that is specifically designed for XML filtering/rewriting; it’s standards-based though not particularly intuitive.
This functionality is currently implemented by calling out to the xsltproc commandline application, which supports XPath v1.0 and XSLT v1.0.
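To see how the two options relate: an XPath expression that drops elements should have the same effect as a stylesheet containing a copy-everything template plus an empty template matching that expression. The Python sketch below builds such a stylesheet from an XPath string. The helper name `stylesheet_for_drop` is hypothetical, and whether dash-mpd-cli generates a stylesheet this way internally is an assumption, not something stated here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
MPD_NS = "urn:mpeg:dash:schema:mpd:2011"


def stylesheet_for_drop(xpath: str) -> str:
    """Build an XSLT 1.0 stylesheet that copies everything except nodes matching xpath.

    Hypothetical helper for illustration only; not part of dash-mpd-cli.
    """
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}" xmlns:mpd="{MPD_NS}">\n'
        # Identity template: copy every node and attribute by default.
        '  <xsl:template match="@*|node()">\n'
        '    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>\n'
        '  </xsl:template>\n'
        # Empty template: matched nodes produce no output, so they are dropped.
        f'  <xsl:template match={quoteattr(xpath)} />\n'
        '</xsl:stylesheet>\n'
    )


sheet = stylesheet_for_drop("//mpd:AdaptationSet[@label='alternate']")
# Parsing the result confirms that the generated stylesheet is well-formed XML.
root = ET.fromstring(sheet.encode("utf-8"))
print(sheet)
```

A stylesheet produced like this could be saved to a file and passed with the --xslt-stylesheet option instead of using --drop-elements.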
Examples of XPath filtering
Suppose your DASH manifest contains an audio track with audio description, which has an attribute label=alternate. This track is being selected for download instead of the one you want. You can filter out the audio description track using the --drop-elements commandline argument with the following XPath expression:
--drop-elements "//mpd:AdaptationSet[@label='alternate']"
If instead the audio track is marked with a Label element that contains the text audiodescr, you can use the following:
--drop-elements "//mpd:AdaptationSet[.//mpd:Label[contains(text(), 'audiodescr')]]"
If instead the AdaptationSet you want to ignore is identified by a Role child element, something like
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="alternate"/>
you can use the following:
--drop-elements "//mpd:AdaptationSet[.//mpd:Role[@value='alternate']]"
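To check what an expression like the Role filter would remove before running a download, it can help to try the same selection on a small manifest. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath. The manifest is a made-up, heavily trimmed example and `drop_alternate_role` is a hypothetical helper; it mimics the effect of the filter and is not how dash-mpd-cli applies it.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical, heavily trimmed manifest used only for illustration.
MANIFEST = """\
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1" mimeType="audio/mp4" lang="en">
      <Role schemeIdUri="urn:mpeg:dash:role:2011" value="main"/>
    </AdaptationSet>
    <AdaptationSet id="2" mimeType="audio/mp4" lang="en">
      <Role schemeIdUri="urn:mpeg:dash:role:2011" value="alternate"/>
    </AdaptationSet>
  </Period>
</MPD>
"""


def drop_alternate_role(root: ET.Element) -> int:
    """Remove every AdaptationSet with a descendant Role whose value is 'alternate'."""
    dropped = 0
    for period in root.findall(".//mpd:Period", NS):
        # findall returns a list, so removing children while looping is safe.
        for aset in period.findall("mpd:AdaptationSet", NS):
            if aset.find(".//mpd:Role[@value='alternate']", NS) is not None:
                period.remove(aset)
                dropped += 1
    return dropped


root = ET.fromstring(MANIFEST)
dropped = drop_alternate_role(root)
remaining = [a.get("id") for a in root.findall(".//mpd:AdaptationSet", NS)]
print(dropped, remaining)
```

The mpd prefix is bound explicitly to the DASH namespace in NS, which plays the same role as the mpd: prefix in the expressions above.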
Another possible application of the XML filtering capability is to avoid overloading the web servers of companies that serve dynamic ad insertion (DAI) content, as illustrated below.
Some DASH manifests include content (generally some Period elements) which is inserted based on your prior viewing habits, the time of day, your geographic location, and so on. You may wish to filter these out based on the URL of the server, with for example
--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'https://dai.google.com')]]"
--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'mediatailor.eu-west-1.amazonaws.com')]]"
--drop-elements "//mpd:Period[mpd:BaseURL[contains(text(),'unified-streaming.com')]]"
or based on other features such as the Period duration or keywords in the BaseURL:
--drop-elements "//mpd:Period[@duration='PT5.000S']"
--drop-elements "//mpd:Period[.//mpd:BaseURL[contains(text(),'/creative/')]]"
--drop-elements "//mpd:Period[.//mpd:BaseURL[contains(text(),'Ad_Bumper')]]"
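The Period filters above combine two kinds of test: a substring match on the BaseURL text and an exact match on the duration attribute. The Python sketch below applies both to a made-up manifest so that the selection logic can be inspected. The hostnames, Period ids and the `drop_periods` helper are all hypothetical. Because ElementTree has no contains() function, the substring test is done in Python, and only BaseURL elements that are direct children of a Period are examined (like the mpd:BaseURL form, unlike the .//mpd:BaseURL form).

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical manifest: one content Period followed by two inserted ads.
MANIFEST = """\
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="content" duration="PT600S">
    <BaseURL>https://cdn.example.com/show/</BaseURL>
  </Period>
  <Period id="ad-1" duration="PT5.000S">
    <BaseURL>https://ads.example.net/creative/123/</BaseURL>
  </Period>
  <Period id="ad-2" duration="PT30S">
    <BaseURL>https://dai.google.com/segments/</BaseURL>
  </Period>
</MPD>
"""


def drop_periods(root, url_markers=(), durations=()):
    """Remove Periods whose BaseURL text contains a marker or whose @duration is listed.

    Returns the ids of the Periods that were removed.
    """
    dropped = []
    for period in root.findall("mpd:Period", NS):
        urls = [b.text or "" for b in period.findall("mpd:BaseURL", NS)]
        by_url = any(marker in url for url in urls for marker in url_markers)
        by_duration = period.get("duration") in durations
        if by_url or by_duration:
            root.remove(period)
            dropped.append(period.get("id"))
    return dropped


root = ET.fromstring(MANIFEST)
dropped = drop_periods(root, url_markers=("dai.google.com",), durations=("PT5.000S",))
kept = [p.get("id") for p in root.findall("mpd:Period", NS)]
print(dropped, kept)
```

Note that the duration comparison is a plain string match, as it is in XPath 1.0: a Period declared with duration="PT5S" would not match 'PT5.000S'.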
Examples of XSLT rewriting
The XSLT file (stylesheet) shown below will drop any AdaptationSets in the MPD manifest with a @mimeType matching audio/* (leaving only the AdaptationSets containing video).
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mpd="urn:mpeg:dash:schema:mpd:2011">
  <xsl:output method="xml" indent="yes"/>

  <!-- Default action (unless a template below matches): copy -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!--
    Drop any audio/* AdaptationSets, leaving only the AdaptationSets with mimeType of video/mp4.
    Note that in principle we should be able to write the XPath expression we are matching on in
    this simpler form
       "//mpd:AdaptationSet[starts-with(@mimeType, 'audio/')]"
    but unfortunately this manifest is using an incorrectly capitalized xmlns declaration
    xmlns="urn:mpeg:DASH:schema:MPD:2011". XML is case sensitive; this namespace is not the same
    as the one specified in the DASH specifications (this error is infrequent but is present in
    the wild).
  -->
  <xsl:template match="//node()[local-name()='AdaptationSet' and starts-with(@mimeType,'audio/')]" />
</xsl:stylesheet>
Note that the rewriting instruction
<!-- Default action (unless a template below matches): copy -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
acts as a default action that will copy verbatim to the output any XML elements that aren’t matched by another template in the stylesheet.
The rewriting instruction
<xsl:template match="//node()[local-name()='AdaptationSet' and starts-with(@mimeType,'audio/')]" />
is selecting (using the XPath expression defined in the template’s @match attribute) all AdaptationSet nodes whose @mimeType attribute starts with audio/. It doesn’t specify any action to run on these elements, which means that they are not copied to the XML output.
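The reason for matching on local-name() can be reproduced outside XSLT. In the Python sketch below, a lookup bound to the standard DASH namespace finds nothing in a manifest that declares the mis-capitalized namespace, whereas a match on the local element name still finds and removes the audio AdaptationSet. The manifest and the helpers `local_name` and `drop_audio` are hypothetical and serve only to illustrate the namespace behaviour.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"

# Hypothetical manifest declaring the mis-capitalized namespace.
MANIFEST = """\
<MPD xmlns="urn:mpeg:DASH:schema:MPD:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4"/>
    <AdaptationSet mimeType="audio/mp4"/>
  </Period>
</MPD>
"""


def local_name(elem: ET.Element) -> str:
    """Return the tag without its '{namespace}' prefix, like XPath's local-name()."""
    return elem.tag.rsplit("}", 1)[-1]


def drop_audio(root: ET.Element) -> int:
    """Remove AdaptationSets (in any namespace) whose mimeType starts with 'audio/'."""
    dropped = 0
    for parent in list(root.iter()):
        for child in list(parent):
            is_aset = local_name(child) == "AdaptationSet"
            if is_aset and child.get("mimeType", "").startswith("audio/"):
                parent.remove(child)
                dropped += 1
    return dropped


root = ET.fromstring(MANIFEST)
# A lookup bound to the standard namespace finds nothing in this manifest...
strict_matches = root.findall(".//mpd:AdaptationSet", {"mpd": DASH_NS})
# ...whereas matching on the local name still works.
dropped = drop_audio(root)
left = [e.get("mimeType") for e in root.iter() if local_name(e) == "AdaptationSet"]
print(len(strict_matches), dropped, left)
```

Everything that is not removed stays in the tree unchanged, which corresponds to the default copy template in the stylesheet.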
To apply an XSLT stylesheet, use the --xslt-stylesheet commandline argument. There are a few example stylesheets in the tests/fixtures directory.
To download content with a rewritten manifest (here running dash-mpd-cli in a container):
podman run --rm -v .:/content \
  ghcr.io/emarsden/dash-mpd-cli \
  --xslt-stylesheet my-rewrites.xslt \
  https://example.com/manifest.mpd -o foo.mp4
Future plans
Our current implementation of filtering using xsltproc is quite powerful and easy to install, but probably not the easiest to use. Possible alternatives which we might move to in a future version of dash-mpd-cli:
- Saxon-HE, free Java software (MPL v2) which implements XPath v3.1 and XSLT v3.0
- A generic filter interface implemented as a pipe
- A command API that takes two filename arguments
- A WebAssembly-based interface that could be implemented in any programming language that can generate WASM bytecode.