Posted to dev@tika.apache.org by "Alexander Chow (JIRA)" <ji...@apache.org> on 2013/04/25 18:34:15 UTC

[jira] [Created] (TIKA-1112) Parsing for OGV file with invalid checksum

Alexander Chow created TIKA-1112:
------------------------------------

             Summary: Parsing for OGV file with invalid checksum
                 Key: TIKA-1112
                 URL: https://issues.apache.org/jira/browse/TIKA-1112
             Project: Tika
          Issue Type: Bug
          Components: metadata, parser
    Affects Versions: 1.3
            Reporter: Alexander Chow


When parsing any OGV file (e.g., [Typing_example.ogv|http://commons.wikimedia.org/wiki/File:Typing_example.ogv]), the log fills with invalid-checksum warnings and parsing then fails with an IOException, as in the following output:

{code}
Warning - invalid checksum on page 2 of stream 155f (5471)
Warning - invalid checksum on page 3 of stream 155f (5471)
Warning - invalid checksum on page 4 of stream 155f (5471)
Warning - invalid checksum on page 5 of stream 155f (5471)
Warning - invalid checksum on page 6 of stream 155f (5471)
Warning - invalid checksum on page 7 of stream 155f (5471)
...
Warning - invalid checksum on page 3071 of stream 155f (5471)
Warning - invalid checksum on page 3072 of stream 155f (5471)
Warning - invalid checksum on page 3073 of stream 155f (5471)
Warning - invalid checksum on page 3074 of stream 155f (5471)
Exception in thread "main" java.io.IOException: Asked to read 4228 bytes from 0 but hit EoF at 2884
	at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:39)
	at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:31)
	at org.gagravarr.ogg.OggPage.<init>(OggPage.java:82)
	at org.gagravarr.ogg.OggPacketReader.getNextPacket(OggPacketReader.java:116)
	at org.gagravarr.tika.OggDetector.detect(OggDetector.java:79)
	at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
	at com.test.OGVTest.main(OGVTest.java:31)
{code}

My test code was the following:

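{code:java}
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.WriteOutContentHandler;
import org.xml.sax.ContentHandler;

	// Parses the file with type auto-detection and prints the extracted metadata.
	void parse(String fileName) throws Exception {
		Metadata metadata = new Metadata();
		Parser parser = new AutoDetectParser();

		// Register the parser in the context so embedded documents are parsed too.
		ParseContext parserContext = new ParseContext();
		parserContext.set(Parser.class, parser);

		// DummyWriter is the reporter's own Writer implementation (not shown here).
		ContentHandler contentHandler = new WriteOutContentHandler(
			new DummyWriter());

		InputStream inputStream = new FileInputStream(fileName);
		try {
			parser.parse(inputStream, contentHandler, metadata, parserContext);
		} finally {
			// Close the stream even when parsing fails.
			inputStream.close();
		}

		System.out.println(metadata);
	}
{code}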
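
The DummyWriter class used in the test code is not included in this report. A minimal stand-in, assuming it simply discards everything written to it, might look like the following sketch. The class body and the discardedChars() counter are hypothetical and only there to make the test code compilable; the original implementation may differ.

```java
import java.io.Writer;

/**
 * Hypothetical stand-in for the DummyWriter referenced in the report.
 * Assumption: the original only served as a sink, so this version
 * drops all characters and merely counts them.
 */
class DummyWriter extends Writer {
    // Number of characters dropped so far (illustrative, not in the report).
    private long discarded = 0;

    @Override
    public void write(char[] cbuf, int off, int len) {
        // Drop the characters; only record how many were passed in.
        discarded += len;
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No underlying resource to release.
    }

    long discardedChars() {
        return discarded;
    }
}
```

With a sink like this, the text content handed to WriteOutContentHandler is thrown away, so the test exercises only detection, parsing, and metadata extraction.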
