Posted to dev@tika.apache.org by "Antoni Mylka (JIRA)" <ji...@apache.org> on 2010/08/19 19:30:17 UTC

[jira] Created: (TIKA-485) ContainerAwareDetector doesn't support truncated POI files

ContainerAwareDetector doesn't support truncated POI files
----------------------------------------------------------

                 Key: TIKA-485
                 URL: https://issues.apache.org/jira/browse/TIKA-485
             Project: Tika
          Issue Type: Improvement
            Reporter: Antoni Mylka
         Attachments: tika-truncated-excel-file.patch

If a file has a POI magic number but is broken (for example, truncated), the call to new POIFSFileSystem(new FileInputStream(stream.getFile())) throws an exception and the entire detection process fails. A simple try-catch around the call to POIFSContainerDetector.detect would allow the ContainerAwareDetector to return a meaningful result.
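
The proposed fallback could be sketched roughly as follows. This is a self-contained illustration and not the attached patch or Tika's actual code: the class name, the method names and the 512-byte check are hypothetical stand-ins for the real container parsing, and the media type strings returned here are assumptions chosen for the example.

```java
import java.io.IOException;
import java.util.Arrays;

class TruncatedOle2DetectionSketch {

    // OLE2 compound document signature, i.e. the "POI magic number".
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the data starts with the OLE2 signature.
    static boolean hasOle2Magic(byte[] data) {
        if (data.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (data[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical stand-in for opening the container and looking inside it.
    // Here anything shorter than one 512-byte header is treated as truncated;
    // the real check is whatever the container parser does.
    static String inspectContainer(byte[] data) throws IOException {
        if (data.length < 512) {
            throw new IOException("Truncated OLE2 container: " + data.length + " bytes");
        }
        return "application/vnd.ms-excel"; // assumed result for the example
    }

    // Container-aware detection that degrades gracefully on broken files.
    static String detect(byte[] data) {
        if (!hasOle2Magic(data)) {
            return "application/octet-stream";
        }
        try {
            return inspectContainer(data);
        } catch (IOException e) {
            // The container is broken, but the magic number still matched,
            // so report a generic type instead of failing the whole process.
            return "application/x-tika-msoffice"; // assumed generic OLE2 type
        }
    }

    public static void main(String[] args) {
        // Magic number followed by zero padding, cut off before a full header.
        byte[] truncated = Arrays.copyOf(OLE2_MAGIC, 100);
        System.out.println(detect(truncated));
    }
}
```

The point of the sketch is only the shape of the control flow: the exception from the container step is caught at the detection boundary, and the magic-number match is still used to give a less specific answer.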

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (TIKA-485) ContainerAwareDetector doesn't support truncated POI files

Posted by "Antoni Mylka (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoni Mylka updated TIKA-485:
------------------------------

    Attachment: tika-truncated-excel-file.patch

A patch with a test that exposes the issue.



[jira] Resolved: (TIKA-485) ContainerAwareDetector doesn't support truncated POI files

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-485.
-----------------------------

         Assignee: Nick Burch
    Fix Version/s: 0.8
       Resolution: Fixed

Thanks for the patch, applied with slight tweaks in r993113.
