Posted to dev@tika.apache.org by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/03/01 16:58:00 UTC

[jira] [Commented] (TIKA-3255) Parsing MP3 file with record size > 100000 fails

    [ https://issues.apache.org/jira/browse/TIKA-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293024#comment-17293024 ] 

Peter Kronenberg commented on TIKA-3255:
----------------------------------------

Confirmed. Thank you.

> Parsing MP3 file with record size > 100000 fails
> ------------------------------------------------
>
>                 Key: TIKA-3255
>                 URL: https://issues.apache.org/jira/browse/TIKA-3255
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Peter Kronenberg
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: sample-a.mp3
>
>
> I got the following exception with the attached MP3 file:
>  
> {code:java}
> Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp3.Mp3Parser@152aa092
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159)
> 	at org.torchai.ReadFile.autoDetect(ReadFile.java:33)
> 	at org.torchai.ReadFile.main(ReadFile.java:40)
> Caused by: java.io.IOException: Record size (2790678 bytes) is larger than the allowed record size: 1000000
> 	at org.apache.tika.parser.mp3.ID3v2Frame.readFully(ID3v2Frame.java:186)
> 	at org.apache.tika.parser.mp3.ID3v2Frame.<init>(ID3v2Frame.java:138)
> 	at org.apache.tika.parser.mp3.ID3v2Frame.createFrameIfPresent(ID3v2Frame.java:91)
> 	at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:188)
> 	at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:70)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> 	... 5 more
> {code}
> This is a perfectly valid MP3 file.  It seems that the code has a hard-coded limit of 1000000 bytes on the record size.
>  
>  
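For readers following the trace, the kind of size check that produces the "Record size ... is larger than the allowed record size" message can be sketched roughly as below. This is only an illustration under assumptions: the class name `BoundedFrameReader`, the method signature, and the caller-supplied `maxSize` parameter are hypothetical and are not taken from Tika's `ID3v2Frame`; only the 1000000 default and the 2790678-byte frame size come from the exception message.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the actual Tika ID3v2Frame code.
final class BoundedFrameReader {
    // Default cap, taken from the value printed in the exception message.
    static final int MAX_RECORD_SIZE = 1_000_000;

    // Reads exactly declaredSize bytes, refusing sizes above maxSize
    // before the buffer is allocated.
    static byte[] readFully(InputStream in, int declaredSize, int maxSize) throws IOException {
        if (declaredSize < 0) {
            throw new IOException("Negative record size: " + declaredSize);
        }
        if (declaredSize > maxSize) {
            throw new IOException("Record size (" + declaredSize
                    + " bytes) is larger than the allowed record size: " + maxSize);
        }
        byte[] buf = new byte[declaredSize];
        int off = 0;
        while (off < declaredSize) {
            int n = in.read(buf, off, declaredSize - off);
            if (n < 0) {
                throw new EOFException("Stream ended after " + off + " of " + declaredSize + " bytes");
            }
            off += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        int frameSize = 2_790_678; // frame size reported in the stack trace
        try {
            readFully(new ByteArrayInputStream(new byte[frameSize]), frameSize, MAX_RECORD_SIZE);
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        // With a larger, caller-chosen cap the same frame is read in full.
        byte[] data = readFully(new ByteArrayInputStream(new byte[frameSize]), frameSize, 4_000_000);
        System.out.println("Read " + data.length + " bytes");
    }
}
```

A fixed cap like this protects against corrupt length fields that would otherwise trigger very large allocations, but it also rejects valid files whose tag frames (for example, embedded images) exceed the cap, which matches the behaviour reported here.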



--
This message was sent by Atlassian Jira
(v8.3.4#803005)