Posted to dev@lucene.apache.org by alessandrobenedetti <gi...@git.apache.org> on 2018/06/08 15:02:08 UTC
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
GitHub user alessandrobenedetti opened a pull request:
https://github.com/apache/lucene-solr/pull/398
Lucene 8343 data type migration
A different approach: a data type migration to fix the bugs:
1) Weight for the DocumentDictionary moved from long to Long
2) Suggestion score moved from long to double
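A minimal sketch of why a long score can lose information, assuming the score is a weight scaled by a positional coefficient between 0 and 1. This is not the code from the pull request; the class and method names are hypothetical.

```java
public class ScoreTypeSketch {
    // Hypothetical scoring: weight scaled by a positional coefficient in (0, 1].
    // Casting to long truncates the fractional part, so small weights collapse to 0.
    static long scoreAsLong(long weight, double coefficient) {
        return (long) (weight * coefficient);
    }

    // Keeping the score as a double preserves the fractional part.
    static double scoreAsDouble(long weight, double coefficient) {
        return weight * coefficient;
    }

    public static void main(String[] args) {
        System.out.println(scoreAsLong(1L, 0.5));   // prints 0
        System.out.println(scoreAsDouble(1L, 0.5)); // prints 0.5
    }
}
```

With a long score, two suggestions with weight 1 and different positions both score 0 and cannot be ranked against each other; with a double score they remain distinguishable.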
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/SeaseLtd/lucene-solr LUCENE-8343-dataTypeMigration
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/398.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #398
----
commit e83e8ee1a42388606fffd10330ed1aeec9518098
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-01T11:52:41Z
[LUCENE-8343] introduced weight 0 check and positional coefficient scaling + tests
commit 17cfa634798f96539c2535dca2e9a8f2cc0bff45
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-06T18:42:08Z
[LUCENE-8343] documentation fix
commit cef9a2283e30a297b3add8e73ee6dba9492dcc57
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-07T15:50:58Z
Merge remote-tracking branch 'upstream/master' into LUCENE-8343
commit 2b636e8c3adb879f0cd2cff45824e226d747b5f0
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-07T15:51:38Z
[LUCENE-8343] minor documentation fixes
commit e0232f104509f28126d9ce060663f87508366338
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-07T17:57:30Z
[LUCENE-8343] weight long overflow fix + test
commit cd4ad3b3be64edaf554cb3795a3a21a2da93de6f
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-08T13:59:39Z
Merge remote-tracking branch 'upstream/master' into LUCENE-8343-dataTypeMigration
commit 484a85df9b707e0a82723650f1f60531e3cc39bb
Author: Alessandro Benedetti <a....@...>
Date: 2018-06-08T14:37:54Z
[LUCENE-8343] data type migration approach for weight not defined - weight too small Blended Infix Suggestion Score calculus bug
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by mikemccand <gi...@git.apache.org>.
Github user mikemccand commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r195387148
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/Lookup.java ---
@@ -53,7 +53,7 @@
public final Object highlightKey;
/** the key's weight */
- public final long value;
+ public final double value;
--- End diff --
Maybe improve javadocs here, explaining that this is not just the weight originally supplied during indexing, for some suggesters?
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by alessandrobenedetti <gi...@git.apache.org>.
Github user alessandrobenedetti commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r194254789
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
@@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
textDV.advance(fd.doc);
final String text = textDV.binaryValue().utf8ToString();
- long weight = (Long) fd.fields[0];
+
+ NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
--- End diff --
Thank you for your note. It was not the main scope of the bug fix, but it's a good recommendation, so I just contributed the small refactoring.
Thanks!
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by nvnmandadhi <gi...@git.apache.org>.
Github user nvnmandadhi commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r194259193
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
@@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
textDV.advance(fd.doc);
final String text = textDV.binaryValue().utf8ToString();
- long weight = (Long) fd.fields[0];
+
+ NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
--- End diff --
Thanks for incorporating the changes!!!!!
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by alessandrobenedetti <gi...@git.apache.org>.
Github user alessandrobenedetti commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r195689810
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/InputIterator.java ---
@@ -34,7 +34,7 @@
public interface InputIterator extends BytesRefIterator {
/** A term's weight, higher numbers mean better suggestions. */
--- End diff --
Hi Michael,
The reason to allow null at the InputIterator level is to distinguish a missing weight from an explicit weight of 0.
In the DocumentDictionary, this means distinguishing a document whose weight field was missing (null) from one whose weight field was present with a value of 0.
At this level we just want to ensure that the same behavior is maintained when we build the auxiliary index:
i.e., if the weight field was missing in the original document, I want it to be null in the auxiliary index as well.
How each suggester implementation uses this to return a suggestion score will, I think, depend on the case.
Did I misunderstand anything here?
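A minimal sketch of the null-versus-0 distinction, using a plain Map as a stand-in for a document. The class, method, and field names are hypothetical and are not part of the Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

public class NullableWeightSketch {
    // Returns the stored weight, or null when the document has no weight field.
    // A primitive long could not represent "missing" without a sentinel value.
    static Long readWeight(Map<String, Long> doc) {
        return doc.get("weight"); // Map.get returns null for an absent key
    }

    // Shows that a consumer can treat the two cases differently.
    static String describe(Long weight) {
        if (weight == null) {
            return "missing";
        }
        return weight == 0L ? "explicit zero" : "weight " + weight;
    }

    public static void main(String[] args) {
        Map<String, Long> withoutWeight = new HashMap<>();
        Map<String, Long> zeroWeight = new HashMap<>();
        zeroWeight.put("weight", 0L);

        System.out.println(describe(readWeight(withoutWeight))); // prints missing
        System.out.println(describe(readWeight(zeroWeight)));    // prints explicit zero
    }
}
```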
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by alessandrobenedetti <gi...@git.apache.org>.
Github user alessandrobenedetti commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r195698769
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/Lookup.java ---
@@ -53,7 +53,7 @@
public final Object highlightKey;
/** the key's weight */
- public final long value;
+ public final double value;
--- End diff --
I agree, just added!
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by mikemccand <gi...@git.apache.org>.
Github user mikemccand commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r195386599
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/InputIterator.java ---
@@ -34,7 +34,7 @@
public interface InputIterator extends BytesRefIterator {
/** A term's weight, higher numbers mean better suggestions. */
--- End diff --
Maybe add javadocs explaining what `null` means? Though, why do we need to allow for `null`? Shouldn't the iterator not return a suggestion that has no weight?
---
[GitHub] lucene-solr pull request #398: Lucene 8343 data type migration
Posted by nvnmandadhi <gi...@git.apache.org>.
Github user nvnmandadhi commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/398#discussion_r194210584
--- Diff: lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/BlendedInfixSuggester.java ---
@@ -200,7 +201,13 @@ protected FieldType getTextFieldType() {
textDV.advance(fd.doc);
final String text = textDV.binaryValue().utf8ToString();
- long weight = (Long) fd.fields[0];
+
+ NumericDocValues weightDV = MultiDocValues.getNumericValues(searcher.getIndexReader(), WEIGHT_FIELD_NAME);
--- End diff --
Could you please make the local variables final to prevent reassignment?
---