Posted to java-user@lucene.apache.org by "Zhang, Lisheng" <Li...@BroadVision.com> on 2012/07/01 00:18:49 UTC

RE: Lucene indexed data corruption error

Thanks very much, this is very helpful!

-----Original Message-----
From: Uwe Schindler [mailto:uwe@thetaphi.de]
Sent: Saturday, June 30, 2012 2:48 PM
To: java-user@lucene.apache.org
Subject: RE: Lucene indexed data corruption error


See this issue: https://issues.apache.org/jira/browse/LUCENE-2975

It's not quite clear which versions of Java are affected by this. I can only say for sure that everything <= 1.6.0_18 is fine and that it is working again in 1.6.0_29 (official Oracle/Sun JDKs only). OpenJDK versions shipped with various Linux distributions may or may not have bugs like this, because their build numbers don't reflect Oracle's official version numbering and vendors add so-called "security patches" out of order, breaking other things. To fix the bug (your index might actually be OK!), install the latest official Oracle Java (1.6.0_33) and don't use the over-patched, unstable, very old OpenJDK builds from Linux vendors :).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
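
To see where a given JVM falls relative to the update numbers mentioned in the message above, a minimal Java sketch along these lines may help. It is only an illustration: the class and method names are made up, treating everything from _29 onward as "working again" is an assumed reading of the message, and the classification is only meaningful for official Oracle/Sun JDKs, since (as noted above) vendor OpenJDK build numbers are not comparable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or the JDK.
final class JvmUpdateCheck {
    // Matches "1.6.0", "1.6.0_18", "1.6.0_18-b07" and similar strings.
    private static final Pattern JAVA6 =
            Pattern.compile("^1\\.6\\.0(?:_(\\d+))?(?:-.*)?$");

    /** Returns the Java 6 update number, 0 for plain "1.6.0", or -1 if not Java 6. */
    static int java6Update(String version) {
        Matcher m = JAVA6.matcher(version);
        if (!m.matches()) {
            return -1;
        }
        return m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
    }

    /** Classifies a version string using only the ranges given in the thread. */
    static String classify(String version) {
        int update = java6Update(version);
        if (update < 0) {
            return "not a Java 6 version string; the thread gives no range for it";
        }
        if (update <= 18) {
            return "reported fine (<= 1.6.0_18)";
        }
        if (update >= 29) {
            // Assumption: "working again in _29" also covers later updates.
            return "reported working again (>= 1.6.0_29)";
        }
        return "between _18 and _29: possibly affected";
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        // Vendor and VM name show whether this is an Oracle/Sun or an OpenJDK build.
        System.out.println(System.getProperty("java.vendor") + " / "
                + System.getProperty("java.vm.name") + " / " + version);
        System.out.println(classify(version));
    }
}
```

Running it on the indexing machine prints the vendor, VM name, and version string, followed by the rough classification.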

> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Saturday, June 30, 2012 10:52 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene indexed data corruption error
> 
> What JVM are you using? This looks like one of the VInt bugs we found in recent
> Oracle Java versions, for which we have had workarounds since Lucene 3.1. See my
> blog post about the Java 7 bugs, too; they are closely related: blog.thetaphi.de
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de
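
For readers unfamiliar with the term, "VInt" presumably refers to Lucene's variable-length integer encoding, in which each byte carries seven payload bits and the high bit signals that more bytes follow. The sketch below is a standalone illustration of that encoding using hypothetical names; it is not Lucene's code and does not reproduce the JVM bug itself.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical standalone sketch of a VInt-style encoding.
final class VIntSketch {
    /** Encodes a non-negative int, low-order seven bits first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            // More than seven bits remain: emit seven bits and set the continuation bit.
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value); // final byte, continuation bit clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(), starting at index 0. */
    static int decode(byte[] bytes) {
        int pos = 0;
        byte b = bytes[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(61262); // a document count taken from the merge log
        System.out.println(encoded.length + " bytes, decodes to " + decode(encoded));
    }
}
```

Because every value read from the index passes through a loop like `decode`, a JVM that miscompiles such a loop could return wrong lengths or flags even when the bytes on disk are intact, which would fit the remark that the index itself might be OK.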
> 
> 
> 
> "Zhang, Lisheng" <Li...@BroadVision.com> schrieb:
> 
> Hi,
> 
> We have been using Lucene 2.3.2 successfully for years (yes, we should upgrade).
> 
> Recently we encountered a data corruption error when committing the IndexWriter:
> 
> ///
> background merge hit exception: _14b:c61262 _1ag:c11225 _1gb:c9411 _1gv:c905 _1gw:c50 _1gx:c50 _1gy:c50 _1gz:c50 _1h0:c31 into _1h1 [optimize]
> java.io.IOException: background merge hit exception: _14b:c61262 _1ag:c11225 _1gb:c9411 _1gv:c905 _1gw:c50 _1gx:c50 _1gy:c50 _1gz:c50 _1h0:c31 into _1h1 [optimize]
>     at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1787)
>     at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1727)
>     at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1707)
> ///
> 
> We then used the CheckIndex tool to analyze the index and found that one segment
> (out of 13) has a problem:
> 
> ///
> test: stored fields.......ERROR [field data are in wrong format: java.util.zip.DataFormatException: unknown compression method]
> org.apache.lucene.index.CorruptIndexException: field data are in wrong format: java.util.zip.DataFormatException: unknown compression method
>     at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:605)
>     at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:392)
>     at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:259)
>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:934)
>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:844)
>     at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:702)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:517)
>     at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:898)
> ///
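
The DataFormatException in the trace above is raised by java.util.zip, which FieldsReader.uncompress calls according to the stack frames. As a rough illustration only (hypothetical names, not Lucene's code), the sketch below shows that the JDK's Inflater throws this exception type when it is handed bytes that are not a valid zlib stream. That would be consistent with a reader treating plain field bytes as compressed data, though the trace alone does not prove that this is what happened here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical standalone sketch using only java.util.zip.
final class InflateSketch {
    /** Compresses bytes into a zlib stream. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Decompresses a zlib stream; throws DataFormatException on invalid input. */
    static byte[] inflate(byte[] data) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Avoid looping forever on truncated input.
                    throw new DataFormatException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] title = "short title".getBytes(StandardCharsets.UTF_8);
        // A real zlib stream round-trips cleanly.
        System.out.println(new String(inflate(deflate(title)), StandardCharsets.UTF_8));
        try {
            inflate(title); // plain bytes, not a zlib stream
        } catch (DataFormatException e) {
            System.out.println("DataFormatException: " + e.getMessage());
        }
    }
}
```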
> 
> Our stored fields are very simple (just an id and a short title; the other fields
> are only for search).
> 
> Our data size is about 400MB across 83K documents. We started indexing from an
> empty folder. We have been using Lucene 2.3.2 for years, and this is the first
> time we have encountered this issue.
> 
> The indexer is running on a Linux box; "uname -a" returns:
> Linux <our box name> 2.6.32-342-ec2 #43-Ubuntu SMP Wed Jan 4 18:22:42
> UTC 2012 x86_64 GNU/Linux
> 
> We really appreciate any guidance,
> 
> Lisheng


