Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/04/26 22:10:17 UTC
[jira] [Commented] (TIKA-1112) Parsing for OGV file with invalid checksum
[ https://issues.apache.org/jira/browse/TIKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643204#comment-13643204 ]
Nick Burch commented on TIKA-1112:
----------------------------------
Do you know where the problem files come from? And are you able to use any of the Ogg file-level tools to check whether the checksum is present and valid on the streams?
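For context on what such a check involves, here is a minimal sketch of verifying the checksum of a single Ogg page. It assumes the page layout and CRC parameters described in the Ogg framing specification (RFC 3533): a 4-byte little-endian checksum at byte offset 22, computed over the whole page with that field treated as zero, using generator polynomial 0x04C11DB7, initial value 0, and no final XOR. The class and method names are hypothetical and are not part of Tika or the org.gagravarr Ogg library.

```java
/** Hypothetical helper: checks the CRC of one complete Ogg page held in memory. */
class OggCrcSketch {
    // Lookup table for the MSB-first (non-reflected) CRC with polynomial 0x04C11DB7
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int r = i << 24;
            for (int bit = 0; bit < 8; bit++) {
                r = ((r & 0x80000000) != 0) ? (r << 1) ^ 0x04C11DB7 : (r << 1);
            }
            TABLE[i] = r;
        }
    }

    /** CRC over the whole page, treating bytes 22..25 (the checksum field) as zero. */
    static int pageCrc(byte[] page) {
        int crc = 0; // initial value 0, no final XOR
        for (int i = 0; i < page.length; i++) {
            int b = (i >= 22 && i < 26) ? 0 : (page[i] & 0xFF);
            crc = (crc << 8) ^ TABLE[((crc >>> 24) ^ b) & 0xFF];
        }
        return crc;
    }

    /** Reads the little-endian checksum stored in the page header. */
    static int storedCrc(byte[] page) {
        return (page[22] & 0xFF)
                | ((page[23] & 0xFF) << 8)
                | ((page[24] & 0xFF) << 16)
                | ((page[25] & 0xFF) << 24);
    }

    /** True if the stored checksum matches the one computed from the page bytes. */
    static boolean checksumValid(byte[] page) {
        return pageCrc(page) == storedCrc(page);
    }
}
```

If a check like this passes on every page of a file that still triggers the warnings, that would point towards the reader rather than the file; if it fails, the file itself is likely damaged.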
> Parsing for OGV file with invalid checksum
> ------------------------------------------
>
> Key: TIKA-1112
> URL: https://issues.apache.org/jira/browse/TIKA-1112
> Project: Tika
> Issue Type: Bug
> Components: metadata, parser
> Affects Versions: 1.3
> Environment: OS X 10.8.3
> JDK 1.6.0_45 64-bit
> Reporter: Alexander Chow
>
> When parsing any OGV file (e.g., [Typing_example.ogv|http://commons.wikimedia.org/wiki/File:Typing_example.ogv]), the log shows output like the following:
> {code}
> Warning - invalid checksum on page 2 of stream 155f (5471)
> Warning - invalid checksum on page 3 of stream 155f (5471)
> Warning - invalid checksum on page 4 of stream 155f (5471)
> Warning - invalid checksum on page 5 of stream 155f (5471)
> Warning - invalid checksum on page 6 of stream 155f (5471)
> Warning - invalid checksum on page 7 of stream 155f (5471)
> ...
> Warning - invalid checksum on page 3071 of stream 155f (5471)
> Warning - invalid checksum on page 3072 of stream 155f (5471)
> Warning - invalid checksum on page 3073 of stream 155f (5471)
> Warning - invalid checksum on page 3074 of stream 155f (5471)
> Exception in thread "main" java.io.IOException: Asked to read 4228 bytes from 0 but hit EoF at 2884
>     at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:39)
>     at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:31)
>     at org.gagravarr.ogg.OggPage.<init>(OggPage.java:82)
>     at org.gagravarr.ogg.OggPacketReader.getNextPacket(OggPacketReader.java:116)
>     at org.gagravarr.tika.OggDetector.detect(OggDetector.java:79)
>     at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
>     at com.test.OGVTest.main(OGVTest.java:31)
> {code}
> My test code was the following:
> {code:java}
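> import java.io.FileInputStream;
> import java.io.InputStream;
> import org.apache.tika.metadata.Metadata;
> import org.apache.tika.parser.AutoDetectParser;
> import org.apache.tika.parser.ParseContext;
> import org.apache.tika.parser.Parser;
> import org.apache.tika.sax.WriteOutContentHandler;
> import org.xml.sax.ContentHandler;
>
> void parse(String fileName) throws Exception {
>     InputStream inputStream = new FileInputStream(fileName);
>     try {
>         Metadata metadata = new Metadata();
>         Parser parser = new AutoDetectParser();
>
>         // Register the parser in the context so embedded documents use it too
>         ParseContext parserContext = new ParseContext();
>         parserContext.set(Parser.class, parser);
>
>         // DummyWriter is the reporter's own Writer class (not shown here)
>         ContentHandler contentHandler = new WriteOutContentHandler(
>                 new DummyWriter());
>         parser.parse(inputStream, contentHandler, metadata, parserContext);
>
>         System.out.println(metadata);
>     } finally {
>         // Close the stream even if detection or parsing throws
>         inputStream.close();
>     }
> }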
> {code}