1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2024-12-11 16:15:46 +01:00
yt-dlp/yt_dlp
pukkandan be6202f12b
Subtitle extraction from streaming media manifests #247
Authored by fstirlitz
Modified from: https://github.com/ytdl-org/youtube-dl/pull/6144

Closes: #73
Fixes:
https://github.com/ytdl-org/youtube-dl/issues/6106
https://github.com/ytdl-org/youtube-dl/issues/14977
https://github.com/ytdl-org/youtube-dl/issues/21438
https://github.com/ytdl-org/youtube-dl/issues/23609
https://github.com/ytdl-org/youtube-dl/issues/28132

Might also fix (untested):
https://github.com/ytdl-org/youtube-dl/issues/15424
https://github.com/ytdl-org/youtube-dl/issues/18267
https://github.com/ytdl-org/youtube-dl/issues/23899
https://github.com/ytdl-org/youtube-dl/issues/24375
https://github.com/ytdl-org/youtube-dl/issues/24595
https://github.com/ytdl-org/youtube-dl/issues/27899

Related:
https://github.com/ytdl-org/youtube-dl/issues/22379
https://github.com/ytdl-org/youtube-dl/pull/24517
https://github.com/ytdl-org/youtube-dl/pull/24886
https://github.com/ytdl-org/youtube-dl/pull/27215

Notes:
* The functions `extractor.common._extract_..._formats` are still kept for compatibility
* Only some extractors have currently been moved to using `_extract_..._formats_and_subtitles`
* Direct subtitle manifests (without a master) are not supported and are wrongly identified as containing video formats
* AES support is untested
* The fragmented TTML subtitles extracted from DASH/ISM are valid, but are unsupported by `ffmpeg` and most video players
    * Their XML fragments can be dumped using `ffmpeg -i in.mp4 -f data -map 0 -c copy out.ttml`.
        Once the unnecessary headers are stripped out of this, it becomes a valid self-contained ttml file
    * The ttml subs downloaded from DASH manifests can also be directly opened with <https://github.com/SubtitleEdit>
* Fragmented WebVTT files extracted from DASH/ISM are also unsupported by most tools
    * Unlike the ttml files, the XML fragments of these cannot be dumped using `ffmpeg`
    * The webtt subs extracted from DASH can be parsed by <https://github.com/gpac/gpac>
    * But validity of the those extracted from ISM are untested
2021-04-28 19:02:43 +05:30
..
downloader [downloader/ism] Support muxing TTML subtitles 2021-04-28 17:21:45 +05:30
extractor Subtitle extraction from streaming media manifests #247 2021-04-28 19:02:43 +05:30
postprocessor [MetadataFromField] Improve regex and add tests 2021-04-21 11:12:04 +05:30
__init__.py Add option --skip-playlist-after-errors 2021-04-22 02:16:31 +05:30
__main__.py
aes.py
cache.py
compat.py [downloader/hls] Assemble single-file WebVTT subtitles from HLS segments 2021-04-28 17:21:14 +05:30
jsinterp.py
options.py [documentation] Fix typos 2021-04-22 16:54:44 +05:30
socks.py
swfinterp.py
update.py
utils.py [utils] Improve bug_report_message 2021-04-28 17:19:23 +05:30
version.py [version] update :ci skip all 2021-04-22 17:30:36 +05:30
webvtt.py [downloader/hls] Remove duplicate cues using a sliding window of candidates 2021-04-28 17:21:26 +05:30
YoutubeDL.py Fix case sensitivity of format selector 2021-04-26 10:56:56 +05:30