Posted to dev@tika.apache.org by Nick Burch <ni...@alfresco.com> on 2011/09/18 17:43:22 UTC
Media container formats?
Hi All
As part of my Ogg stuff, I'm wondering how best to handle media
container formats such as Ogg and AVI.
For a file with only a single stream in it, e.g. an Ogg Vorbis audio
file, it seems sensible to treat that as a single (non-container) file.
For a file with multiple streams, such as a video with two soundtracks
and subtitles, what should we do? Try to identify the "main" stream
(often not actually marked), parse that as the file and handle the other
streams (e.g. audio) as embedded resources?
The specific use case I have at the moment is for Ogg Vorbis or Ogg FLAC
files where only the outer container is detected. I'm thinking that the
general Ogg parser should check for a single stream, and delegate to the
Vorbis or Flac parser as found. However, if it finds multiple streams,
what should it do?
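
To make the idea concrete, here is a rough sketch of the check in plain
Java. It is not Tika code: the class OggStreamSketch, its methods and the
returned strings are all made up for illustration. It assumes a simple,
unchained Ogg file in which every logical stream announces itself with a
beginning-of-stream (BOS) page at the start, and it does not verify page
checksums. The multi-stream branch is deliberately left undecided, since
that is the open question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Tika: counts the logical streams in an
// Ogg file and decides whether to delegate to a codec-specific parser.
public class OggStreamSketch {
    public enum StreamType { VORBIS, FLAC, UNKNOWN }

    /** Lists the logical streams announced by the leading BOS pages. */
    public static List<StreamType> listStreams(byte[] data) {
        List<StreamType> streams = new ArrayList<>();
        int pos = 0;
        while (pos + 27 <= data.length) {          // 27 = fixed page header size
            boolean capture = data[pos] == 'O' && data[pos + 1] == 'g'
                    && data[pos + 2] == 'g' && data[pos + 3] == 'S';
            int flags = data[pos + 5] & 0xFF;      // 0x02 marks a BOS page
            int segments = data[pos + 26] & 0xFF;  // length of the segment table
            // Stop at garbage, at the first non-BOS page, or at a truncated header
            if (!capture || (flags & 0x02) == 0
                    || pos + 27 + segments > data.length) {
                break;
            }
            int payloadLen = 0;
            for (int i = 0; i < segments; i++) {
                payloadLen += data[pos + 27 + i] & 0xFF;
            }
            int start = pos + 27 + segments;
            streams.add(classify(data, start,
                    Math.min(payloadLen, data.length - start)));
            pos = start + payloadLen;
        }
        return streams;
    }

    // Identify the codec from the magic bytes at the start of the first packet
    private static StreamType classify(byte[] d, int start, int len) {
        if (startsWith(d, start, len,
                new byte[] {0x01, 'v', 'o', 'r', 'b', 'i', 's'})) {
            return StreamType.VORBIS;
        }
        if (startsWith(d, start, len, new byte[] {0x7F, 'F', 'L', 'A', 'C'})) {
            return StreamType.FLAC;
        }
        return StreamType.UNKNOWN;
    }

    private static boolean startsWith(byte[] d, int start, int len, byte[] magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (d[start + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Single known stream: delegate. Multiple streams: the open question. */
    public static String decide(List<StreamType> streams) {
        if (streams.isEmpty()) {
            return "no streams found";
        }
        if (streams.size() == 1) {
            switch (streams.get(0)) {
                case VORBIS: return "delegate to Vorbis parser";
                case FLAC:   return "delegate to FLAC parser";
                default:     return "generic Ogg handling";
            }
        }
        return "undecided";
    }

    /** Test helper: builds a one-segment page (payload under 255 bytes, CRC left zero). */
    public static byte[] page(int flags, byte[] payload) {
        byte[] p = new byte[28 + payload.length];
        p[0] = 'O'; p[1] = 'g'; p[2] = 'g'; p[3] = 'S';
        p[5] = (byte) flags;
        p[26] = 1;                            // one entry in the segment table
        p[27] = (byte) payload.length;
        System.arraycopy(payload, 0, p, 28, payload.length);
        return p;
    }
}
```

For example, decide(listStreams(bytes)) on a plain Ogg Vorbis file would
come back as "delegate to Vorbis parser", while a file with both a Vorbis
and a FLAC stream would come back as "undecided".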
Nick