Posted to dev@tika.apache.org by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/03/01 16:58:00 UTC
[jira] [Commented] (TIKA-3255) Parsing MP3 file with record size > 100000 fails
[ https://issues.apache.org/jira/browse/TIKA-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293024#comment-17293024 ]
Peter Kronenberg commented on TIKA-3255:
----------------------------------------
Confirmed. Thank you.
> Parsing MP3 file with record size > 100000 fails
> ------------------------------------------------
>
> Key: TIKA-3255
> URL: https://issues.apache.org/jira/browse/TIKA-3255
> Project: Tika
> Issue Type: Bug
> Reporter: Peter Kronenberg
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: sample-a.mp3
>
>
> I got the following exception with the attached mp3 file
>
> {code:java}
> Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp3.Mp3Parser@152aa092
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159)
> at org.torchai.ReadFile.autoDetect(ReadFile.java:33)
> at org.torchai.ReadFile.main(ReadFile.java:40)
> Caused by: java.io.IOException: Record size (2790678 bytes) is larger than the allowed record size: 1000000
> at org.apache.tika.parser.mp3.ID3v2Frame.readFully(ID3v2Frame.java:186)
> at org.apache.tika.parser.mp3.ID3v2Frame.<init>(ID3v2Frame.java:138)
> at org.apache.tika.parser.mp3.ID3v2Frame.createFrameIfPresent(ID3v2Frame.java:91)
> at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:188)
> at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:70)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> ... 5 more
> {code}
> This is a perfectly valid MP3 file. Judging by the exception message, the code has a hard-coded record size limit of 1000000 bytes, which the 2790678-byte record in this file exceeds.
>
> Here is the code I'm running
> {code:java}
> {code}
>
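For readers following the trace, here is a minimal sketch of the kind of guard that would produce the "Record size ... is larger than the allowed record size" message. It is an illustration only, not the actual source of ID3v2Frame.readFully: the class name RecordSizeGuard, the method readRecord, and the maxRecordSize parameter are all hypothetical. A cap like this presumably exists so that a corrupt size field cannot force a huge allocation, which is why a fixed value can also reject a legitimate large frame.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of a record-size guard; NOT Tika's ID3v2Frame code.
final class RecordSizeGuard {

    // Reads exactly `length` bytes from `in`, refusing any record whose
    // declared size exceeds `maxRecordSize` (an assumed, caller-supplied cap).
    static byte[] readRecord(InputStream in, int length, int maxRecordSize)
            throws IOException {
        if (length < 0) {
            throw new IOException("Negative record size: " + length);
        }
        if (length > maxRecordSize) {
            // Fail before allocating, so a bogus size cannot exhaust memory.
            throw new IOException("Record size (" + length
                    + " bytes) is larger than the allowed record size: "
                    + maxRecordSize);
        }
        byte[] buffer = new byte[length];
        int offset = 0;
        // InputStream.read may return fewer bytes than requested, so loop.
        while (offset < length) {
            int read = in.read(buffer, offset, length - offset);
            if (read < 0) {
                throw new EOFException("Stream ended after " + offset
                        + " of " + length + " bytes");
            }
            offset += read;
        }
        return buffer;
    }
}
```

With a cap of 1000000, a call such as RecordSizeGuard.readRecord(stream, 2790678, 1000000) fails before any bytes are read, which matches the symptom reported here. Making the cap a parameter rather than a constant is one way a caller could raise it for files with large embedded frames.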
--
This message was sent by Atlassian Jira
(v8.3.4#803005)