You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/08/11 19:22:48 UTC

[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

mikemccand commented on a change in pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r468808904



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##########
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
       }
     }
     
-    Document doc = indexReader.document(ordinal);
-    FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+    boolean found = MultiDocValues.getBinaryValues(indexReader, Consts.FULL).advanceExact(catIDInteger);

Review comment:
       OK, I see one issue -- you are pulling a new `BinaryDocValues`, calling `.advanceExact` on it (good), but then pulling a new `BinaryDocValues` below and not calling `.advanceExact` on it.
   
   I think you must add a new local variable, e.g. `BinaryDocValues values`.  Pull it once (using the `MultiDocValues.getBinaryValues` sugar API).  Then call `.advanceExact` on that and assert it succeeded. Finally, use that same `values` instance (now that it has advanced to the right `docId`) to call `.binaryValue().utf8ToString()`.
   
   I think that should fix the `NPE`?
   
   This is misuse of the API for the default Lucene Codec for `BinaryDocValues`, since you were calling `.binaryValue()` before `.advanceExact()`.  It is somewhat disappointing that the codec threw a confusing `NPE` and not a clearer (best effort) exception stating that you must first call `.advanceExact`.  Maybe we could improve the default Codec?  (Though, not if that would hurt performance of correct usage).  OK I see: the `NPE` is because of `MultiDocValues.currentValues` is `null` since `.advanceExact` was not yet called.  Maybe we could add an `assert` there, confirming `.advanceExact` was indeed called and had returned `true`?  It would have made debugging this easier, and should not hurt performance when assertions are disabled ...

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##########
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
       }
     }
     
-    Document doc = indexReader.document(ordinal);
-    FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+    boolean found = MultiDocValues.getBinaryValues(indexReader, Consts.FULL).advanceExact(catIDInteger);

Review comment:
       I think instead of the boxed `Integer catIDInteger` we should pass the `int ordinal` to `.advanceExact(...)`?  Not the cause of the `NPE`, just cleaner.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org