Posted to dev@tika.apache.org by "Gerard van der Hoorn (JIRA)" <ji...@apache.org> on 2016/03/14 09:54:33 UTC

[jira] [Updated] (TIKA-1901) tika detect consumes stream when streams contains msoffice file

     [ https://issues.apache.org/jira/browse/TIKA-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gerard van der Hoorn updated TIKA-1901:
---------------------------------------
    Attachment: test.xls
                test.pdf
                test.doc
                TikaStreamConsumingIssue.java

Added a JUnit test with test files.

> tika detect consumes stream when streams contains msoffice file
> ---------------------------------------------------------------
>
>                 Key: TIKA-1901
>                 URL: https://issues.apache.org/jira/browse/TIKA-1901
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.12
>            Reporter: Gerard van der Hoorn
>         Attachments: TikaStreamConsumingIssue.java, test.doc, test.pdf, test.xls
>
>
> When tika.detect is used on an MS Office file (Word or Excel 2003), the stream is consumed, which is not the expected behaviour. According to the documentation, when the stream supports marking, it is reset to its original position after detection.
> A test case is attached.
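
The expectation in the report can be sketched without Tika on the classpath. The following is a minimal illustration only: it is not the attached TikaStreamConsumingIssue.java and not Tika's implementation. `peekingDetect`, `preservesPosition`, and the class name are hypothetical, and the returned type labels are chosen for illustration. The sketch shows the contract the reporter expects: a detector may read header bytes from a mark-supporting stream, but must reset it before returning.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class MarkResetSketch {

    // OLE2 compound document signature, used by Word and Excel 97-2003 files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Hypothetical stand-in for a detector (an assumption, not Tika code):
    // peeks at the header, then restores the stream position.
    static String peekingDetect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(OLE2_MAGIC.length);
        try {
            byte[] header = new byte[OLE2_MAGIC.length];
            int total = 0;
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // stream is shorter than the signature
                }
                total += n;
            }
            boolean ole2 = total == header.length && Arrays.equals(header, OLE2_MAGIC);
            // Labels are illustrative only.
            return ole2 ? "application/x-tika-msoffice" : "application/octet-stream";
        } finally {
            in.reset(); // the step the report says is effectively missing
        }
    }

    // Returns true when the number of unread bytes is unchanged by detection.
    static boolean preservesPosition(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int before = in.available();
        try {
            peekingDetect(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in.available() == before;
    }

    public static void main(String[] args) {
        byte[] fakeDoc = Arrays.copyOf(OLE2_MAGIC, 64); // signature plus zero padding
        System.out.println("position preserved: " + preservesPosition(fakeDoc));
    }
}
```

To check the reported behaviour against Tika itself, the same before/after comparison could presumably be made around a call to `new Tika().detect(in)` on a mark-supporting stream opened over test.doc or test.xls; per the report, the comparison would fail on 1.12 for those files.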



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)