Description
Use case description
I'm working on improving Voice, an audio book player app. A common use case is to store an audio book as a single m4a file (an mp4 container without video) together with all of its metadata. This metadata includes chapters, which make it possible to navigate within the book. Quite often the .m4b extension is used to mark the file as a book.
There are two different standards for storing chapters, of which one is generally preferred. Often both are contained in the same file, since they are not mutually exclusive:
- QuickTime chapters format (preferred)
- Nero chapters format (simpler, but more prone to errors)
Here is a short and good overview of how these formats are structured and how they can be parsed. Unfortunately, androidx/media does not support either of these formats, and there isn't even a well-maintained, ready-to-use alternative mp4 library. This forces most app developers to write a custom parser to extract the chapters themselves, which leads to slower, redundant file parsing and less robust solutions, sometimes even to uncaught exceptions and app crashes.
A PR (#1851) already attempted to solve this problem, but it did not receive a response.
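To illustrate the kind of custom parsing apps currently have to do, here is a minimal sketch of reading the payload of a Nero `chpl` box. The byte layout used here (version, flags, four extra bytes for non-zero versions, a one-byte chapter count, then an 8-byte start time in 100 ns units and a length-prefixed title per chapter) is an assumption based on how the format is commonly described, not something defined by media3. `NeroChapterSketch`, `Chapter` and `parseChpl` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class NeroChapterSketch {

    /** Hypothetical holder for one parsed chapter; not a media3 type. */
    static final class Chapter {
        final long startTimeMs;
        final String title;

        Chapter(long startTimeMs, String title) {
            this.startTimeMs = startTimeMs;
            this.title = title;
        }
    }

    /**
     * Parses the payload of a chpl box, i.e. the bytes after the 8-byte size/type header.
     * Assumed layout: version (1), flags (3), 4 extra bytes if version != 0, chapter count (1),
     * then per chapter: start time (8 bytes, 100 ns units), title length (1), title (UTF-8).
     * Truncated input does not throw; the chapters read so far are returned.
     */
    static List<Chapter> parseChpl(byte[] payload) {
        List<Chapter> chapters = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(payload); // ByteBuffer is big-endian by default
        if (buf.remaining() < 5) {
            return chapters;
        }
        int version = buf.get() & 0xFF;
        buf.position(buf.position() + 3); // skip flags
        if (version != 0) {
            if (buf.remaining() < 5) {
                return chapters;
            }
            buf.getInt(); // extra field whose meaning is not assumed here
        }
        int count = buf.get() & 0xFF;
        for (int i = 0; i < count; i++) {
            if (buf.remaining() < 9) {
                break; // truncated entry header
            }
            long start100ns = buf.getLong();
            int titleLength = buf.get() & 0xFF;
            if (buf.remaining() < titleLength) {
                break; // truncated title
            }
            byte[] title = new byte[titleLength];
            buf.get(title);
            // 10,000 units of 100 ns make one millisecond.
            chapters.add(new Chapter(start100ns / 10_000, new String(title, StandardCharsets.UTF_8)));
        }
        return chapters;
    }
}
```

Even this small sketch needs several bounds checks to avoid exceptions on damaged files, which is where hand-written parsers in apps tend to go wrong.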
Proposed solution
Since media3 already contains a solution for MP3 / ID3v2 chapters, the proposed solution is to reuse the existing data structures to store a mapping of the mp4 chapters, similar to the other metadata and to the PR mentioned above.
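As a rough sketch of what such a mapping could look like: ID3-style chapter entries carry both a start and an end time, whereas an mp4 chapter list may only provide start times, so the end of each chapter has to be derived from the next chapter's start or from the total duration. The `ChapterEntry` class below is a hypothetical stand-in, not the actual media3 type, and the `"ch" + index` id scheme is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class ChapterMappingSketch {

    /** Hypothetical stand-in for an ID3-style chapter entry; not the media3 class. */
    static final class ChapterEntry {
        final String id;
        final int startTimeMs;
        final int endTimeMs;
        final String title;

        ChapterEntry(String id, int startTimeMs, int endTimeMs, String title) {
            this.id = id;
            this.startTimeMs = startTimeMs;
            this.endTimeMs = endTimeMs;
            this.title = title;
        }
    }

    /**
     * Converts chapter start times (ascending, in milliseconds) into entries with end times.
     * Each chapter ends where the next one starts; the last one ends at durationMs.
     */
    static List<ChapterEntry> toEntries(long[] startTimesMs, String[] titles, long durationMs) {
        if (startTimesMs.length != titles.length) {
            throw new IllegalArgumentException("startTimesMs and titles must have equal length");
        }
        List<ChapterEntry> entries = new ArrayList<>();
        for (int i = 0; i < startTimesMs.length; i++) {
            long end = (i + 1 < startTimesMs.length) ? startTimesMs[i + 1] : durationMs;
            // Math.toIntExact fails loudly instead of silently overflowing the int fields.
            entries.add(new ChapterEntry(
                    "ch" + i,
                    Math.toIntExact(startTimesMs[i]),
                    Math.toIntExact(end),
                    titles[i]));
        }
        return entries;
    }
}
```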
See these commits:
If both chapter formats (QuickTime and Nero) exist in parallel, the QuickTime variant should be preferred, even if the two contain different data. The reason is that damaged or incomplete files usually also contain a damaged set of Nero chapters that can't be parsed, while the QuickTime set usually stays intact thanks to its more robust data structure.
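The preference rule could be expressed roughly as follows. This is a sketch only: `ChapterSourceSketch` and `selectChapters` are hypothetical names, and it assumes each format has already been parsed into a list, with an empty or null list meaning "not present or not parseable".

```java
import java.util.Collections;
import java.util.List;

final class ChapterSourceSketch {

    /**
     * Returns the QuickTime chapters whenever any were parsed, otherwise falls back to the
     * Nero chapters. The two lists are never merged, even if their contents differ.
     */
    static <T> List<T> selectChapters(List<T> quickTimeChapters, List<T> neroChapters) {
        if (quickTimeChapters != null && !quickTimeChapters.isEmpty()) {
            return quickTimeChapters;
        }
        if (neroChapters != null) {
            return neroChapters;
        }
        return Collections.emptyList();
    }
}
```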
Alternatives considered
As an alternative, it might be less effort to first integrate only the Nero format, using #1851 as a base implementation, and to extend this solution later to also support QuickTime in a separate task.