Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2018/09/19 16:58:01 UTC

[jira] [Comment Edited] (TIKA-2730) parseToString fails for a simple mp3

    [ https://issues.apache.org/jira/browse/TIKA-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620843#comment-16620843 ] 

Tim Allison edited comment on TIKA-2730 at 9/19/18 4:57 PM:
------------------------------------------------------------

Thank you for raising this. It looks like it is perfectly fine for the final record in an mp3 to be truncated.

My mp3 player didn't object when I truncated one of our test files, and it played the attached demo.mp3 without complaint as well.

The comments in the code suggest this as well.

This may be serious enough to warrant a 1.19.1 in a few weeks.  WDYT?
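
To make the idea concrete, here is a rough sketch of what "tolerating a truncated final record" could look like. This is not the Tika code or the eventual patch; the class and method names (TolerantFrameSkip, skipUpTo, countCompleteFrames) are made up for illustration, and the fixed frame length is a simplifying assumption. A short skip at end of stream is treated as the end of the audio data rather than as an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not part of Tika.
public class TolerantFrameSkip {

    /**
     * Tries to skip up to length bytes and returns how many were actually
     * skipped. A result smaller than length means the stream ended early.
     */
    static long skipUpTo(InputStream in, long length) throws IOException {
        long skipped = 0;
        while (skipped < length) {
            long n = in.skip(length - skipped);
            if (n <= 0) {
                // skip() may return 0 without being at EOF, so probe with read().
                if (in.read() == -1) {
                    break;
                }
                n = 1;
            }
            skipped += n;
        }
        return skipped;
    }

    /**
     * Counts complete frames of a fixed size (an assumption made for brevity).
     * A truncated final frame ends the loop instead of raising an exception.
     */
    static int countCompleteFrames(InputStream in, int frameLength) throws IOException {
        int frames = 0;
        while (true) {
            long skipped = skipUpTo(in, frameLength);
            if (skipped < frameLength) {
                return frames;
            }
            frames++;
        }
    }

    public static void main(String[] args) throws IOException {
        // Two full 361-byte frames followed by a 247-byte truncated one,
        // mirroring the sizes in the reported exception.
        byte[] data = new byte[361 * 2 + 247];
        System.out.println(countCompleteFrames(new ByteArrayInputStream(data), 361));
    }
}
```

With the sizes from the reported stack trace (a 361-byte frame with only 247 bytes left), the loop simply stops after the last complete frame.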


was (Author: tallison@mitre.org):
Thank you for raising this.  This may be serious enough to warrant a 1.19.1 in a few weeks.  WDYT?

> parseToString fails for a simple mp3
> ------------------------------------
>
>                 Key: TIKA-2730
>                 URL: https://issues.apache.org/jira/browse/TIKA-2730
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.19
>            Reporter: Boris Petrov
>            Assignee: Tim Allison
>            Priority: Major
>             Fix For: 2.0.0, 1.20
>
>         Attachments: demo.mp3
>
>
> This is a regression from 1.18. I've attached the mp3 that fails. The exception I get is:
> {noformat}
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp3.Mp3Parser@cefe6c6
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.Tika.parseToString(Tika.java:527)
>     at com.company.TextExtractor.getText(TextExtractor.java:39)
>     Caused by:
>     java.io.EOFException: EOF: tried to skip 361 but could only skip 247
>         at org.apache.tika.parser.mp3.MpegStream.skipFrame(MpegStream.java:166)
>         at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:204)
>         at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:71)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         ... 5 more{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)