Posted to dev@lucene.apache.org by "Jeremy Calvert (JIRA)" <ji...@apache.org> on 2005/12/08 17:46:08 UTC

[jira] Created: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

NullPointerException during IndexWriter.mergeSegments
-----------------------------------------------------

         Key: LUCENE-480
         URL: http://issues.apache.org/jira/browse/LUCENE-480
     Project: Lucene - Java
        Type: Bug
  Components: Index  
    Versions: CVS Nightly - Specify date in submission, 1.9    
 Environment: 64bit, ubuntu, Java 5 SE
    Reporter: Jeremy Calvert


Last commit on culprit org.apache.lucene.index.FieldsReader: Sun Oct 30 05:38:46 2005.

---------------------------------------------------------
Offending code in FieldsReader.java:

...
  final Document doc(int n) throws IOException {
    indexStream.seek(n * 8L);
    long position = indexStream.readLong();
    fieldsStream.seek(position);

    Document doc = new Document();
    int numFields = fieldsStream.readVInt();
    for (int i = 0; i < numFields; i++) {
      int fieldNumber = fieldsStream.readVInt();
      FieldInfo fi = fieldInfos.fieldInfo(fieldNumber); 
//
// This apparently returns null, presumably either as a result of:
//   catch (IndexOutOfBoundsException ioobe) {
//      return null;
//    }
// in fieldInfos.fieldInfo(int fieldNumber)
//  - or -
// because the byNumber ArrayList in FieldInfos contains a null entry

      byte bits = fieldsStream.readByte();
      
      boolean compressed = (bits & FieldsWriter.FIELD_IS_COMPRESSED) != 0;

....

        Field.Store store = Field.Store.YES;
//
// Here --v is where the NPE is thrown.        
        if (fi.isIndexed && tokenize)
          index = Field.Index.TOKENIZED;
...
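
For illustration, the failure mode described in the comments above can be reproduced outside Lucene. The following is a minimal sketch, not the Lucene source: FieldLookupSketch, Info and add are hypothetical stand-ins, and only the catch-and-return-null pattern is taken from the report. It shows a lookup that quietly returns null for an out-of-range number, so the failure only surfaces later when the result is dereferenced.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a by-number field lookup; names here are hypothetical.
class FieldLookupSketch {
  static final class Info {
    final String name;
    final boolean isIndexed;

    Info(String name, boolean isIndexed) {
      this.name = name;
      this.isIndexed = isIndexed;
    }
  }

  private final List<Info> byNumber = new ArrayList<>();

  void add(String name, boolean isIndexed) {
    byNumber.add(new Info(name, isIndexed));
  }

  // Returns null for an out-of-range number, mirroring the catch block quoted above.
  Info fieldInfo(int fieldNumber) {
    try {
      return byNumber.get(fieldNumber);
    } catch (IndexOutOfBoundsException ioobe) {
      return null;
    }
  }

  public static void main(String[] args) {
    FieldLookupSketch infos = new FieldLookupSketch();
    infos.add("title", true);
    Info fi = infos.fieldInfo(5); // only field number 0 exists
    System.out.println(fi == null ? "lookup returned null" : fi.name);
    // Reading fi.isIndexed at this point would throw a NullPointerException.
  }
}
```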

---------------------------------------------------------

Proposed Patch:
I'm not sure what the correct behavior is in this case. If a null FieldInfo for a given field number is harmless and that field can simply be skipped, an obvious patch would be:

In FieldsReader.java:

...
    for (int i = 0; i < numFields; i++) {
      int fieldNumber = fieldsStream.readVInt();
      FieldInfo fi = fieldInfos.fieldInfo(fieldNumber); 
//    vvvPatchvvv
      if(fi == null) {continue;}

      byte bits = fieldsStream.readByte();
...
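
One caveat with a bare continue is that the rest of the skipped field's bytes stay unread, so the next iteration starts in the middle of a record. The sketch below illustrates this with a toy layout that is only an assumption (int field number, one flag byte, then a UTF string); it is not the actual stored-fields format, and SkipDesyncSketch, writeDoc and readDoc are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SkipDesyncSketch {
  static final int KNOWN_FIELDS = 1; // only field number 0 is "known" in this toy

  // Toy record layout (an assumption): int fieldNumber, byte bits, UTF value.
  static byte[] writeDoc(int[] numbers, String[] values) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(numbers.length);
      for (int i = 0; i < numbers.length; i++) {
        out.writeInt(numbers[i]);
        out.writeByte(0);
        out.writeUTF(values[i]);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // consumeUnknown=false mimics a bare "continue"; true reads past the field first.
  static List<String> readDoc(byte[] data, boolean consumeUnknown) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      List<String> values = new ArrayList<>();
      int numFields = in.readInt();
      for (int i = 0; i < numFields; i++) {
        int fieldNumber = in.readInt();
        boolean known = fieldNumber >= 0 && fieldNumber < KNOWN_FIELDS;
        if (!known) {
          if (consumeUnknown) {
            in.readByte(); // flag byte of the skipped field
            in.readUTF();  // value of the skipped field
          }
          continue;
        }
        in.readByte();
        values.add(in.readUTF());
      }
      return values;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Field 7 is unknown to the reader, field 0 is known.
    byte[] doc = writeDoc(new int[] {7, 0}, new String[] {"a", "b"});
    System.out.println(readDoc(doc, false)); // bare continue: the valid field is lost
    System.out.println(readDoc(doc, true));  // skipped field consumed: "b" is recovered
  }
}
```

In this toy, the bare continue makes the second iteration read the flag byte and part of the string as a field number, so the valid field that follows is never decoded.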

---------------------------------------------------------

Other observations:
In my search prior to submitting this issue, I found LUCENE-168, which looks similar, and is perhaps related, but if so, I'm not sure exactly how.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Jeremy Calvert (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12359752 ] 

Jeremy Calvert commented on LUCENE-480:
---------------------------------------

Sure, let me try and put that together.


[jira] Closed: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/LUCENE-480?page=all ]
     
Yonik Seeley closed LUCENE-480:
-------------------------------

    Resolution: Invalid
     Assign To: Yonik Seeley

No problem... glad to hear it will be an easy fix :-)


[jira] Commented: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Michael W. Nassif (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12418081 ] 

Michael W. Nassif commented on LUCENE-480:
------------------------------------------

I had a similar problem, and the cause wasn't obvious at first. It turned out to be a faulty RAM chip. The memory worked fine most of the time; the error only showed up during large indexing runs, presumably because a few bits were being written incorrectly. You can narrow down the problem by downloading the Ultimate Boot CD (http://www.ultimatebootcd.com) and running the memory diagnostic.

Good luck.


[jira] Commented: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12359750 ] 

Yonik Seeley commented on LUCENE-480:
-------------------------------------

Can you reproduce this in a test case that you can attach here?
FieldInfo should never be null AFAIK, so I'd rather get to the root cause of the problem than cover it up.


[jira] Commented: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Jeremy Calvert (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12359755 ] 

Jeremy Calvert commented on LUCENE-480:
---------------------------------------

A little more data: 

      int fieldNumber = fieldsStream.readVInt();

on line 68 of FieldsReader.java results in fieldNumber = 221997 for my particular fieldsStream, so it would seem that my proposed patch would indeed just gloss over a larger problem wherein the fieldsStream is getting corrupted.

On the other hand, having this cause an NPE seems less than ideal.  Is there some way to throw an exception that's more indicative of the stream corruption?
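
One possible shape for such a check, as a rough sketch only: validate the field number right after reading it and fail with a message that names the suspicious value. CorruptFieldCheckSketch, CorruptStreamException and checkFieldNumber are hypothetical names, not existing Lucene API, and the field count of 12 in the example is arbitrary.

```java
import java.io.IOException;

class CorruptFieldCheckSketch {
  // Hypothetical exception type; a plain IOException would work as well.
  static final class CorruptStreamException extends IOException {
    CorruptStreamException(String message) {
      super(message);
    }
  }

  // Hypothetical helper: rejects field numbers outside [0, fieldCount).
  static int checkFieldNumber(int fieldNumber, int fieldCount, int docId)
      throws CorruptStreamException {
    if (fieldNumber < 0 || fieldNumber >= fieldCount) {
      throw new CorruptStreamException(
          "stored fields look corrupt: document " + docId
              + " references field number " + fieldNumber
              + " but only " + fieldCount + " fields are defined");
    }
    return fieldNumber;
  }

  public static void main(String[] args) {
    try {
      // 221997 is the value observed above; 12 is an arbitrary field count.
      checkFieldNumber(221997, 12, 0);
    } catch (CorruptStreamException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Failing at the point where the bad value is read keeps the error close to its cause, rather than letting a null propagate to an unrelated line.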

In any case, I'm tracing back how this happened in the first place. I would simply give you the code and data to reproduce it, but the data is ~500M worth.

Stay tuned!


[jira] Commented: (LUCENE-480) NullPointerException during IndexWriter.mergeSegments

Posted by "Jeremy Calvert (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12359810 ] 

Jeremy Calvert commented on LUCENE-480:
---------------------------------------

Apparently my hardware or filesystem is having some difficulties, which could be the reason the fieldsStream is corrupt. I apologize for the false alarm and sincerely appreciate the quick feedback.

# dmesg
...
PCI-DMA: Out of IOMMU space for 180224 bytes at device 0000:00:07.0
end_request: I/O error, dev sda, sector 52463038
printk: 1014 messages suppressed.
Buffer I/O error on device md0, logical block 21106784 
