Posted to dev@tika.apache.org by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2023/08/09 23:47:00 UTC

[jira] [Updated] (TIKA-4111) iWork keynote file with wrong extension not detected properly

     [ https://issues.apache.org/jira/browse/TIKA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luís Filipe Nassif updated TIKA-4111:
-------------------------------------
    Description: 
Attached keynote file, without the *.key extension, should be detected as "application/vnd.apple.keynote.13" but is being detected as "application/vnd.apple.unknown.13". 

The IWork13PackageParser.detect(ZipFile) method iterates over the zip entries and calls detectIfPossible(ZipEntry) repeatedly. Inside the latter method, the check for "Index/MasterSlide-*" or "Index/Slide-*" entries should match, since such entries are present in the attached file. But because zip entries can appear in arbitrary order, the "Index/Document.iwa" entry is encountered first, and "application/vnd.apple.unknown.13" is returned before the slide entries are examined. The fix is simply to return null when the "Index/Document.iwa" entry is found.

 

  was:
Attached keynote file, without the *.key extension, should be detected as application/vnd.apple.keynote.13 but is being detected as application/vnd.apple.unknown.13. 

The IWork13PackageParser.detect(ZipFile) method iterates over zip entries and calls detectIfPossible(ZipEntry) multiple times. Inside the last method, the check for "Index/Slide-*.iwa" files should work, they are found inside the attached file. But since those entries can appear in arbitrary order, the "Index/Document.iwa" entry is found before and then application/vnd.apple.unknown.13 is returned. The fix is simply returning null when "Index/Document.iwa" entry is found.

 


> iWork keynote file with wrong extension not detected properly
> -------------------------------------------------------------
>
>                 Key: TIKA-4111
>                 URL: https://issues.apache.org/jira/browse/TIKA-4111
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 2.8.0
>            Reporter: Luís Filipe Nassif
>            Assignee: Luís Filipe Nassif
>            Priority: Minor
>         Attachments: keynote
>
>
> Attached keynote file, without the *.key extension, should be detected as "application/vnd.apple.keynote.13" but is being detected as "application/vnd.apple.unknown.13". 
> The IWork13PackageParser.detect(ZipFile) method iterates over the zip entries and calls detectIfPossible(ZipEntry) repeatedly. Inside the latter method, the check for "Index/MasterSlide-*" or "Index/Slide-*" entries should match, since such entries are present in the attached file. But because zip entries can appear in arbitrary order, the "Index/Document.iwa" entry is encountered first, and "application/vnd.apple.unknown.13" is returned before the slide entries are examined. The fix is simply to return null when the "Index/Document.iwa" entry is found.
>  
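
The ordering problem described above can be sketched in a few lines of Java. This is an illustration only, not the actual Tika source: the class name, the use of plain strings instead of zip entries, the entry name "Index/Slide-1.iwa", and the fallback to the "unknown" type after the loop are all assumptions made for the example.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the detection logic; not the real IWork13PackageParser.
class IWorkDetectSketch {
    static final String KEYNOTE = "application/vnd.apple.keynote.13";
    static final String UNKNOWN = "application/vnd.apple.unknown.13";

    static boolean isSlideEntry(String name) {
        return name.startsWith("Index/MasterSlide-") || name.startsWith("Index/Slide-");
    }

    // Behaviour as described in the report: "Index/Document.iwa" ends the scan early.
    static String detectIfPossibleOld(String name) {
        if (isSlideEntry(name)) {
            return KEYNOTE;
        }
        if (name.equals("Index/Document.iwa")) {
            return UNKNOWN;
        }
        return null;
    }

    // Proposed fix: "Index/Document.iwa" is treated as inconclusive, so null is returned.
    static String detectIfPossibleFixed(String name) {
        return isSlideEntry(name) ? KEYNOTE : null;
    }

    // Scans entries in archive order and stops at the first non-null answer.
    // Falling back to UNKNOWN after the loop is an assumption of this sketch.
    static String detect(List<String> entryNames, Function<String, String> perEntry) {
        for (String name : entryNames) {
            String type = perEntry.apply(name);
            if (type != null) {
                return type;
            }
        }
        return UNKNOWN;
    }

    public static void main(String[] args) {
        // "Index/Document.iwa" happens to come before the slide entry.
        List<String> names = List.of("Index/Document.iwa", "Index/Slide-1.iwa");
        System.out.println(detect(names, IWorkDetectSketch::detectIfPossibleOld));
        System.out.println(detect(names, IWorkDetectSketch::detectIfPossibleFixed));
    }
}
```

With the entries in this order, the old per-entry check yields the "unknown" type, while the fixed one keeps scanning and reaches the slide entry.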


