Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2012/06/12 13:30:43 UTC

[jira] [Created] (LUCENE-4139) mixing up indexOptions in same IW session makes corrumpt index

Robert Muir created LUCENE-4139:
-----------------------------------

             Summary: mixing up indexOptions in same IW session makes corrumpt index 
                 Key: LUCENE-4139
                 URL: https://issues.apache.org/jira/browse/LUCENE-4139
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Robert Muir


I was trying to beef up TestBackwardsCompatibility (LUCENE-4085), but I accidentally made a corrupt index due to a typo:
{code}
// a field with both offsets and term vectors for a cross-check
FieldType customType3 = new FieldType(TextField.TYPE_STORED);
customType3.setStoreTermVectors(true);
customType3.setStoreTermVectorPositions(true);
customType3.setStoreTermVectorOffsets(true);
customType3.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
doc.add(new Field("content3", "here is more content with aaa aaa aaa", customType3));
// a field that omits only positions
FieldType customType4 = new FieldType(TextField.TYPE_STORED);
customType4.setStoreTermVectors(true);
customType4.setStoreTermVectorPositions(false);
customType4.setStoreTermVectorOffsets(true);
customType4.setIndexOptions(IndexOptions.DOCS_AND_FREQS);
// check out the copy-paste typo here! i forgot to change this to content4
doc.add(new Field("content3", "here is more content with aaa aaa aaa", customType3));
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293575#comment-13293575 ] 

Robert Muir commented on LUCENE-4139:
-------------------------------------

{quote}
The problem is more complicated:
How would you sum up offsets for multivalued fields? How to correctly do this? If you just sum up the offsets, they don't help you anymore with highlighting (if you get multiple stored fields), although I have no idea how this should work at all (highlighting MV fields)...
{quote}

Not really: TermVectorsConsumer does this fine and has for many Lucene releases. The problem is that FreqProxTermsWriter does it wrong. See the patch.
                


[jira] [Updated] (LUCENE-4139) mixing up indexOptions in same IW session makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Attachment: LUCENE-4139_test.patch

Updated test: the bug actually has nothing to do with mixing up field types, as I forgot to use the new FieldType too.

It happens when you have a multivalued field.
                


[jira] [Commented] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293760#comment-13293760 ] 

Michael McCandless commented on LUCENE-4139:
--------------------------------------------

Patch looks good!  Nice find.  +1
                


[jira] [Updated] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Attachment: LUCENE-4139.patch

Stupid IDE: I forgot to press save. This one actually has the 'prevOffset -> offsetAccum' rename.
                


[jira] [Commented] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293561#comment-13293561 ] 

Uwe Schindler commented on LUCENE-4139:
---------------------------------------

The problem is more complicated:
How would you sum up offsets for multivalued fields? How to correctly do this? If you just sum up the offsets, they don't help you anymore with highlighting (if you get multiple stored fields), although I have no idea how this should work at all (highlighting MV fields)...
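
One way to picture the highlighting question: if offsets are accumulated across the values of a field, an accumulated offset can in principle be mapped back to a (value index, local offset) pair by replaying the same accumulation over the stored values. The following is a minimal sketch of that idea, not Lucene code and not taken from the patch. The class `OffsetToValueSketch`, the method `locate`, and the `offsetGap` parameter are hypothetical names, and the sketch assumes each value contributes exactly its string length plus a fixed gap, which may not hold for every analyzer.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Lucene APIs. */
final class OffsetToValueSketch {

    /**
     * Maps an accumulated offset back to {valueIndex, localOffset}.
     * Returns null if the offset falls into a gap between values or is out of range.
     * Assumption: value i starts at the sum of (length + offsetGap) of all earlier values.
     */
    static int[] locate(List<String> values, int offsetGap, int absoluteOffset) {
        int base = 0; // accumulated start of the current value
        for (int i = 0; i < values.size(); i++) {
            int length = values.get(i).length();
            if (absoluteOffset >= base && absoluteOffset < base + length) {
                return new int[] {i, absoluteOffset - base};
            }
            base += length + offsetGap;
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "here is more content with aaa aaa aaa";
        List<String> values = List.of(text, text);
        // With a gap of 1, the second value starts at 37 + 1 = 38.
        System.out.println(Arrays.toString(locate(values, 1, 64))); // prints [1, 26]
    }
}
```

Under these assumptions a highlighter would still need the stored values in their original order to replay the accumulation, so this does not settle how highlighting of multivalued fields should work.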
                


[jira] [Updated] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Attachment: LUCENE-4139.patch

Updated patch: I renamed prevOffset in writeOffset to offsetAccum (I think this is less misleading). Also added a random test.
                


[jira] [Resolved] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-4139.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 5.0
                   4.0
    


[jira] [Commented] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293554#comment-13293554 ] 

Robert Muir commented on LUCENE-4139:
-------------------------------------

Looks like we aren't summing up offsets correctly for multivalued fields, so they go backwards.
I added this assert to the postings writer:
      assert offsetDelta >= 0 && offsetLength >= 0 : "startOffset=" + startOffset + ",lastOffset=" + lastOffset + ",endOffset=" + endOffset;

   [junit4]    > Throwable #1: java.lang.AssertionError: startOffset=26,lastOffset=34,endOffset=29
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([76B886A04FD18EEC:D9439B78AFF692]:0)
   [junit4]    > 	at org.apache.lucene.codecs.lucene40.Lucene40PostingsWriter.addPosition(Lucene40PostingsWriter.java:255)
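
The numbers in that assertion can be reproduced with a small standalone sketch. This is not the patch and not Lucene code: the class `MultiValuedOffsetSketch`, its methods, and the `offsetGap` parameter are hypothetical, and whitespace tokenization is assumed. It delta-encodes the start offsets of one term across all values of a field, once with per-value offsets used as-is (the failure mode) and once with a running base added for each earlier value. A gap of 1 between values is an assumption; it is consistent with the term vector startOffset=64 reported elsewhere in this thread (37 + 1 + 26).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Lucene APIs. */
final class MultiValuedOffsetSketch {

    /** Start offsets of whitespace-separated tokens equal to term, relative to the start of value. */
    static List<Integer> termStarts(String value, String term) {
        List<Integer> starts = new ArrayList<>();
        int i = 0;
        while (i < value.length()) {
            while (i < value.length() && Character.isWhitespace(value.charAt(i))) i++;
            int start = i;
            while (i < value.length() && !Character.isWhitespace(value.charAt(i))) i++;
            if (i > start && value.substring(start, i).equals(term)) starts.add(start);
        }
        return starts;
    }

    /**
     * Delta-encodes the start offsets of term over all values of one field.
     * accumulate=false uses each value's local offsets directly, so the delta
     * goes negative at the first occurrence in the second value.
     * accumulate=true adds the accumulated length of earlier values plus offsetGap.
     */
    static List<Integer> startOffsetDeltas(List<String> values, String term,
                                           int offsetGap, boolean accumulate) {
        List<Integer> deltas = new ArrayList<>();
        int base = 0; // accumulated offset of the current value
        int last = 0; // previous start offset written for this term
        for (String value : values) {
            for (int start : termStarts(value, term)) {
                int absolute = accumulate ? base + start : start;
                deltas.add(absolute - last);
                last = absolute;
            }
            base += value.length() + offsetGap;
        }
        return deltas;
    }

    public static void main(String[] args) {
        String text = "here is more content with aaa aaa aaa";
        List<String> values = List.of(text, text); // same field added twice
        System.out.println(startOffsetDeltas(values, "aaa", 1, false)); // [26, 4, 4, -8, 4, 4]
        System.out.println(startOffsetDeltas(values, "aaa", 1, true));  // [26, 4, 4, 30, 4, 4]
    }
}
```

The -8 in the first line corresponds to startOffset=26 following lastOffset=34 in the assertion message above; with the running base the deltas stay non-negative.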
                


[jira] [Commented] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293553#comment-13293553 ] 

Robert Muir commented on LUCENE-4139:
-------------------------------------

I don't know what's going on with offsets for multivalued fields; will try to dig:
{noformat}
java.lang.RuntimeException: vector term=[61 61 61] field=content3 doc=0: startOffset=64 differs from postings startOffset=-2147483622
{noformat}
                


[jira] [Updated] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Summary: multivalued field with offsets makes corrumpt index   (was: mixing up indexOptions in same IW session makes corrumpt index )
    


[jira] [Updated] (LUCENE-4139) mixing up indexOptions in same IW session makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Attachment: LUCENE-4139_test.patch

simple test.
                


[jira] [Updated] (LUCENE-4139) multivalued field with offsets makes corrumpt index

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4139:
--------------------------------

    Attachment: LUCENE-4139.patch

Patch. It needs review, and maybe suggestions on how to make it more intuitive, but it fixes the bug.
                