You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@lucene.apache.org by alessandrobenedetti <gi...@git.apache.org> on 2018/06/08 15:02:08 UTC

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

GitHub user alessandrobenedetti opened a pull request:

    https://github.com/apache/lucene-solr/pull/398

    Lucene 8343 data type migration

    Different approach, data type migration to fix the bugs  : 
    
    
    1) Weight for the Document dictionary moved to Long from long
    2) Suggestion score moved to double from long

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/SeaseLtd/lucene-solr LUCENE-8343-dataTypeMigration

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/398.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #398
    
----
commit e83e8ee1a42388606fffd10330ed1aeec9518098
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-01T11:52:41Z

    [LUCENE-8343] introduced weight 0 check and positional coefficient scaling + tests

commit 17cfa634798f96539c2535dca2e9a8f2cc0bff45
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-06T18:42:08Z

    [LUCENE-8343] documentation fix

commit cef9a2283e30a297b3add8e73ee6dba9492dcc57
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-07T15:50:58Z

    Merge remote-tracking branch 'upstream/master' into LUCENE-8343

commit 2b636e8c3adb879f0cd2cff45824e226d747b5f0
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-07T15:51:38Z

    [LUCENE-8343] minor documentation fixes

commit e0232f104509f28126d9ce060663f87508366338
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-07T17:57:30Z

    [LUCENE-8343] weight long overflow fix + test

commit cd4ad3b3be64edaf554cb3795a3a21a2da93de6f
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-08T13:59:39Z

    Merge remote-tracking branch 'upstream/master' into LUCENE-8343-dataTypeMigration

commit 484a85df9b707e0a82723650f1f60531e3cc39bb
Author: Alessandro Benedetti <a....@...>
Date:   2018-06-08T14:37:54Z

    [LUCENE-8343] data type migration approach for weight not defined - weight too small Blended Infix Suggestion Score calculus bug

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by mikemccand <gi...@git.apache.org>.

Github user mikemccand commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r195387148
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/Lookup.java ---
    @@ -53,7 +53,7 @@
         public final Object highlightKey;
     
         /** the key's weight */
    -    public final long value;
    +    public final double value;
    --- End diff --
    
    Maybe improve javadocs here, explaining that this is not just the weight originally supplied during indexing, for some suggesters?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by alessandrobenedetti <gi...@git.apache.org>.

Github user alessandrobenedetti commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r194254789
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
    @@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
           textDV.advance(fd.doc);
     
           final String text = textDV.binaryValue().utf8ToString();
    -      long weight = (Long) fd.fields[0];
    +
    +      NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
    --- End diff --
    
    Thank you for your note, It was not the main scope of the bug fix, but It's a good recommendation so I just contributed the little refactor.
    Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by nvnmandadhi <gi...@git.apache.org>.

Github user nvnmandadhi commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r194259193
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
    @@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
           textDV.advance(fd.doc);
     
           final String text = textDV.binaryValue().utf8ToString();
    -      long weight = (Long) fd.fields[0];
    +
    +      NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
    --- End diff --
    
    Thanks for incorporating the changes!!!!!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by alessandrobenedetti <gi...@git.apache.org>.

Github user alessandrobenedetti commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r195689810
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/InputIterator.java ---
    @@ -34,7 +34,7 @@
     public interface InputIterator extends BytesRefIterator {
     
       /** A term's weight, higher numbers mean better suggestions. */
    --- End diff --
    
    Hi Michael,
    The reason to allow for null at the InputIterator level is to distinguish it from an explicit 0 weight.
    In the DocumentDictionary this translates in differentiating when  the weight field was missing for the original document ( NULL ) in opposition to when the weight field was present and with 0 value.
    At this level we just want to ensure that the same behavior is maintained when we build the auxiliary index : 
    i.e. if the weight field was missing for the original document, I want it to be null for the auxiliary index as well.
    How the different suggesters implementation will use this to return a suggestion score, I think will depend on a case by case scenario.
    Did I misunderstand anything here ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by alessandrobenedetti <gi...@git.apache.org>.

Github user alessandrobenedetti commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r195698769
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/Lookup.java ---
    @@ -53,7 +53,7 @@
         public final Object highlightKey;
     
         /** the key's weight */
    -    public final long value;
    +    public final double value;
    --- End diff --
    
    I agree, just adde!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by mikemccand <gi...@git.apache.org>.

Github user mikemccand commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r195386599
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/InputIterator.java ---
    @@ -34,7 +34,7 @@
     public interface InputIterator extends BytesRefIterator {
     
       /** A term's weight, higher numbers mean better suggestions. */
    --- End diff --
    
    Maybe add javadocs explaining what `null` means?  Though, why do we need to allow for `null`?  Shouldn't the iterator not return a suggestion that has no weight?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration

Posted by nvnmandadhi <gi...@git.apache.org>.

Github user nvnmandadhi commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/398#discussion_r194210584
  
    --- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
    @@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
           textDV.advance(fd.doc);
     
           final String text = textDV.binaryValue().utf8ToString();
    -      long weight = (Long) fd.fields[0];
    +
    +      NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
    --- End diff --
    
    Could you please make local variables final to prevent reassignment. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org