Posted to user@tika.apache.org by Peter Kronenberg <pe...@torch.ai> on 2021/03/01 20:43:27 UTC

MP4 parsing

The attached file is an MP4 video. When I run new Tika().detect() on it, it returns video/quicktime. But when I actually run it through the MP4Parser, the parser reports a very generic application/mp4.

Looking at the code, it seems that the generic type comes about because the majorBrand of my file is isom, which doesn't match any of the desired values, so the parser defaults to application/mp4. Now, I know absolutely nothing about mp4 encoding. But looking further, I see there's a list of compatibleBrands, which in my case includes mp41, and that would match the expected type of video/mp4.

I coded this up so that if the major brand does not match one of the desired values, it checks to see if any of the compatible brands match, and uses the first one it finds.
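
For illustration, the fallback described above could look roughly like the sketch below. This is not the actual change and it does not use Tika's classes: BrandFallback, BRAND_TO_TYPE and resolveType are hypothetical names, and the brand table is only an example (mp41 comes from the case above; the mp42 entry is an assumption added for illustration).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; not Tika's MP4Parser code.
class BrandFallback {
    // Generic type used when no brand is recognised.
    static final String GENERIC_TYPE = "application/mp4";

    // Illustrative brand-to-type table (assumed entries, not copied from Tika).
    static final Map<String, String> BRAND_TO_TYPE = new LinkedHashMap<>();
    static {
        BRAND_TO_TYPE.put("mp41", "video/mp4");
        BRAND_TO_TYPE.put("mp42", "video/mp4"); // assumption for illustration
    }

    // Try the major brand first, then each compatible brand in file order,
    // and fall back to the generic type if nothing matches.
    static String resolveType(String majorBrand, List<String> compatibleBrands) {
        String type = BRAND_TO_TYPE.get(majorBrand);
        if (type != null) {
            return type;
        }
        for (String brand : compatibleBrands) {
            type = BRAND_TO_TYPE.get(brand);
            if (type != null) {
                return type; // first matching compatible brand wins
            }
        }
        return GENERIC_TYPE;
    }

    public static void main(String[] args) {
        // Major brand isom is not in the table, so mp41 from the
        // compatible brands decides the type.
        System.out.println(resolveType("isom", List.of("isom", "iso2", "mp41")));
    }
}
```

With a major brand of isom and mp41 among the compatible brands, this resolves to video/mp4; with no recognised brand at all, it still falls back to application/mp4. Taking the first match means the result depends on the order in which the file lists its compatible brands.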

Is this a proper solution?

Peter Kronenberg  |  Senior AI Analytic ENGINEER