Posted to dev@tika.apache.org by "Enrico Donelli (JIRA)" <ji...@apache.org> on 2011/04/20 09:45:05 UTC

[jira] [Created] (TIKA-643) tika hangs parsing doc file (attached)

tika hangs parsing doc file (attached)
--------------------------------------

                 Key: TIKA-643
                 URL: https://issues.apache.org/jira/browse/TIKA-643
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.0
         Environment: mac osx and linux debian
            Reporter: Enrico Donelli


Tika hangs while parsing this Word file:

http://dl.dropbox.com/u/2371175/testfile002.doc

The current release of Tika (0.9) parses the same file without problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (TIKA-643) tika hangs parsing doc file (attached)

Posted by "Nick Burch (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-643.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 1.0

I believe this was fixed in Tika 1.0 by a POI upgrade.

> tika hangs parsing doc file (attached)
> --------------------------------------
>
>                 Key: TIKA-643
>                 URL: https://issues.apache.org/jira/browse/TIKA-643
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10
>         Environment: mac osx and linux debian
>            Reporter: Enrico Donelli
>            Assignee: Nick Burch
>             Fix For: 1.0
>
>
> Tika hangs parsing the word file:
> http://dl.dropbox.com/u/2371175/testfile002.doc
> The current version of tika (0.9) works fine with the same file


[jira] [Commented] (TIKA-643) tika hangs parsing doc file (attached)

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022230#comment-13022230 ] 

Nick Burch commented on TIKA-643:
---------------------------------

This looks to be a bug in the new lower-memory NPOIFS that was recently turned on for Tika. Can you please open a new bug in the POI Bugzilla and attach the file there?


[jira] [Commented] (TIKA-643) tika hangs parsing doc file (attached)

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022806#comment-13022806 ] 

Nick Burch commented on TIKA-643:
---------------------------------

For now, you'll need to use a nightly snapshot build of POI. Once POI 3.8 beta 3 is released or TIKA-645 is fixed (whichever comes first), your file ought to work fine.
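
As a stopgap while waiting for the fixed POI, a caller can at least keep a hang like this from blocking an application by running the parse on a worker thread with a timeout. The sketch below is illustrative only and uses just the JDK's java.util.concurrent; the class name ParseWithTimeout, the runWithTimeout helper and the stand-in tasks are hypothetical and are not part of Tika or POI. In real use the Callable would wrap the actual Tika parse call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParseWithTimeout {

    /**
     * Runs the task on a worker thread and returns its result,
     * or null if the task does not finish within timeoutMillis.
     */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; a busy loop that never checks
            // for interruption will ignore this and keep running.
            future.cancel(true);
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a parse call that returns promptly.
        String ok = runWithTimeout(() -> "parsed", 1000);

        // Hypothetical stand-in for a parse call that hangs.
        String hung = runWithTimeout(() -> {
            Thread.sleep(60_000);
            return "never";
        }, 200);

        System.out.println(ok + " / " + hung);
    }
}
```

Note that cancelling the Future only interrupts the worker. A parser stuck in a loop that never checks for interruption will keep running on its daemon thread, so running the parse in a separate process is the more robust option when that matters.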


[jira] [Assigned] (TIKA-643) tika hangs parsing doc file (attached)

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch reassigned TIKA-643:
-------------------------------

    Assignee: Nick Burch


[jira] [Commented] (TIKA-643) tika hangs parsing doc file (attached)

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022794#comment-13022794 ] 

Nick Burch commented on TIKA-643:
---------------------------------

The problem with NPOIFS reading from an InputStream that was exactly the maximum size has been fixed in POI r1095753. Tika will pick up that fix when the next POI beta release is out.

I'm currently looking at why we're passing the stream to NPOIFS rather than the file.
