Posted to user@tika.apache.org by Tim Allison <ta...@apache.org> on 2019/06/04 16:47:49 UTC

Re: StreamingZipContainerDetector XLSX template workbook

Tucker,

This should be fixed now in branch_1x and master.  Let me know if
you'd like to try with a nightly build.  Many thanks for the report!

Cheers,

          Tim

On Wed, May 29, 2019 at 6:57 AM Tucker B <ba...@gmail.com> wrote:
>
> After upgrading to Tika 1.21 I have noticed several known XLSX files
> are detected by Tika as "application/x-tika-ooxml". I think I've
> narrowed it down to the new StreamingZipContainerDetector. After
> inspecting the "[Content_Types].xml" of these XLSX files there is no
> reference to any of the configured content types for XLSX in the
> OOXML_CONTENT_TYPES in StreamingZipContainerDetector. Specifically,
>
> "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"
> "application/vnd.ms-excel.sheet.macroEnabled.main+xml"
> "application/vnd.ms-excel.sheet.binary.macroEnabled.main"
>
> I do see a content type of
>
> "application/vnd.openxmlformats-officedocument.spreadsheetml.template.main+xml"
>
> in "[Content_Types].xml". Is the StreamingZipContainerDetector missing
> the XSSFRelation TEMPLATE_WORKBOOK in OOXML_CONTENT_TYPES?
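
For anyone who wants to repeat the check described above on their own
files, here is a minimal sketch. It is not Tika code and does not use
Tika's API; it only uses the Python standard library to list the content
types declared in "[Content_Types].xml". The helper names
(list_content_types, make_sample_package) and the in-memory sample
package are assumptions made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# The three XLSX main-part content types quoted in the report above.
XLSX_MAIN_TYPES = {
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml",
    "application/vnd.ms-excel.sheet.macroEnabled.main+xml",
    "application/vnd.ms-excel.sheet.binary.macroEnabled.main",
}

# The template workbook content type found in the affected files.
TEMPLATE_TYPE = (
    "application/vnd.openxmlformats-officedocument."
    "spreadsheetml.template.main+xml"
)


def list_content_types(source):
    """Return every ContentType value declared in [Content_Types].xml.

    `source` is a path or a file-like object holding an OOXML (zip) package.
    Both Default and Override elements carry a ContentType attribute, so we
    simply collect that attribute from every element.
    """
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    return {el.get("ContentType") for el in root.iter() if el.get("ContentType")}


def make_sample_package(main_type):
    """Hypothetical helper: build a tiny in-memory package for the demo.

    It contains only [Content_Types].xml, so it is not a valid workbook;
    it exists purely to exercise list_content_types().
    """
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="xml" ContentType="application/xml"/>'
        f'<Override PartName="/xl/workbook.xml" ContentType="{main_type}"/>'
        "</Types>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    found = list_content_types(make_sample_package(TEMPLATE_TYPE))
    print("template type present:", TEMPLATE_TYPE in found)
    print("listed XLSX main type present:", bool(found & XLSX_MAIN_TYPES))
```

To inspect a real file, pass its path to list_content_types() instead
of the sample package. If the template type shows up and none of the
three listed types do, the file matches the case described in the
report.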