You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/09/19 01:31:25 UTC

[GitHub] [lucene-solr] goankur opened a new pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

goankur opened a new pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893


   …et field. This is useful if a facet field is requested in the list of return fields for each hit.
   
   <!--
   _(If you are a project committer then you may remove some/all of the following template.)_
   
   Before creating a pull request, please file an issue in the ASF Jira system for Lucene or Solr:
   
   * https://issues.apache.org/jira/projects/LUCENE
   * https://issues.apache.org/jira/projects/SOLR
   
   You will need to create an account in Jira in order to create an issue.
   
   The title of the PR should reference the Jira issue number in the form:
   
   * LUCENE-####: <short description of problem or changes>
   * SOLR-####: <short description of problem or changes>
   
   LUCENE and SOLR must be fully capitalized. A short description helps people scanning pull requests for items they can work on.
   
   Properly referencing the issue in the title ensures that Jira is correctly updated with code review comments and commits. -->
   
   
   # Description
   
   Please provide a short description of the changes you're making with this pull request.
   
   # Solution
   
   Please provide a short description of the approach taken to implement your solution.
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request title.
   - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494398560



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java
##########
@@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con
     return facets;
   }
 
+  public List<List<FacetLabel>> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException {

Review comment:
       Thank you for adding this utility method so tests can easily use the new utility class!
   
   Can we rename this to `getAllTaxonomyFacetLabels`, and add javadoc explaining that the outer list is one entry per matched hit, and the inner list is one entry per `FacetLabel` belonging to that hit?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {
+      if (o1 == null) {

Review comment:
       Hmm why are these `null` checks necessary?  Are we really seeing `null` in the argument?  Oh, I guess this legitimately happens when the hit had no facets?  Maybe add a comment?  Hmm, actually, looking at how actual and expected are populated, neither of them seems to add `null`?  One of them filters out empty list but the other does not?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Hmm I think `expectedLabels` filters out empty `List<FacetLabel>` but `actualLabels` does not, so this might false trip?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {

Review comment:
       I'm confused why we are sorting the top list?  Isn't the top list in order of the hits?  And we want to confirm, for a given `docId` hit, that expected and actual labels match?
   
   OK, I think I understand: this test does not index anything allowing you to track which original doc mapped to which `FacetLabel`, so then you cannot know, per segment, which docs ended up where :)
   
   Given that, I think it's OK to do the top-level sort of all `List<FacetLabel>` across all hits.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493117522



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;

Review comment:
       done in next revision.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r495342228



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Done. I made changes to keep empty lists in both `expectedLabels` and `actualLabels` and also updated javadoc for `FacetTestCase.getAllTaxonomyFacetLabels()` to reflect this for `actualLabels`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493041886



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;

Review comment:
       Hmm let's upgrade this to a real `if` not an `assert` (that requires that assertions are enabled), and change to throw `IllegalArgumentException`?  In general if it is a problem that a user could legitimately trip up on, it should be a real `if`, as long as performance impact of such micro-policing is acceptable (which should be the case here).

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;

Review comment:
       This one is a good usage of `assert` because this should only be violated if there is a bug in this code or our default `Codec`, etc.  No accidental abuse by users could cause this.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order

Review comment:
       Can we add a **Note** here, that the returned `FacetLabel` are not necessarily in the same order in which they were indexed?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;

Review comment:
       Can we upgrade this to a real `if`?  Caller might legitimately pass in a non-existing `facetDimension`?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Ahh, be careful here!  Lucene's `test-framework` sometimes randomly runs without assertions enabled, to help us catch bugs where we accidentally create code relying on assertions (it has happened, do not ask who).
   
   So when tests run with assertions disabled then this test would false-fail?  But, if we upgrade to real `if` that would also fix this.

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {

Review comment:
       Could we make a small change to an existing randomized test, maybe `TestTaxonomyFacetCounts.testRandom`, to use this API to randomly retrieve N facet labels?  (So we exercise the class beyond the simplish `testBasic` ... maybe we uncover a rare bug, somewhere).

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>

Review comment:
       Add same **Note** here as above comment?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       Hmm when we switch to `BinaryDocValues` for this lookup, we could fix this class to persistently hold a single `BinaryDocValues`, and improve performance for caller that need to look up multiple ordinals, which is the common case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493114139



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order

Review comment:
       done in next revision




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand merged pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand merged pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494665188



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {

Review comment:
       Yes, a document with `N`<sup>th</sup> position in the input sequence might end up with `K`<sup>th</sup> docId in a random segment making it harder to compare actual and expected labels.
   
   Thanks for confirming that the approach is acceptable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#issuecomment-699702003


   Thanks @goankur I think this is ready!  I'll confirm all tests pass and squash/merge soon.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494654559



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java
##########
@@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con
     return facets;
   }
 
+  public List<List<FacetLabel>> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException {

Review comment:
       done in the next revision.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493148293



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       By this class you mean `TaxonomyReader` because that is where `BinaryDocValues` will be cached as part of [LUCENE-9476](https://issues.apache.org/jira/browse/LUCENE-9476) improving performance for callers that need to lookup multiple ordinals ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493112985



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;

Review comment:
       Changed in the next revision.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494395610



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       Hmm, I was thinking this new per-segment class (`FacetLabelReader`) would hold onto a single `BinaryDocValues` instance, but you're right, that won't work.
   
   The facet ordinals are stored / compacted in sorted order in a single `BinaryDocValues` field which we decode here into `decodedOrds`.  Resolving those ordinals to `FacetLabel` is costly, currently retrieving one stored document per ord.  After LUCENE-9450 (switching from stored fields to BDV), we could pull a new BDV for each unique `docId` passed to `nextFacetLabel`, and then bulk resolve all the ordinals?
   
   Anyway, we can pursue all of this later -- this PR already looks awesome -- progress not perfection!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493113512



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;

Review comment:
       Got it thanks for explaining appropriate usage of assert and `IlegalArgumentException`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493137058



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Nice catch, I changed the implementation to throw `IllegalArgumentExceeption` after ugrade to `if` :-)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494637269



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Nice catch, thanks.  I fixed `actualLabels` generation in `FacetTestCase.getAllTaxonomyFacetLabels()` method to filter out empty `List<facetLabels>`.

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java
##########
@@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con
     return facets;
   }
 
+  public List<List<FacetLabel>> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException {

Review comment:
       done in the next revision.

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {
+      if (o1 == null) {

Review comment:
       Thanks for catching this @mikemccand. I fixed the `actualLabels` to exclude empty lists. The null checks were just me being extra cautious. I realized they were unnecessary and removed them :-)

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {

Review comment:
       Yes, a document with `N`<sup>th</sup> position in the input sequence might end up with `K`<sup>th</sup> docId in a random segment making it harder to compare actual and expected labels.
   
   Thanks for confirming that the approach is acceptable.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       The new per-segment class (`FacetLabelReader`) holds an instance of `DocValuesOrdinalsReader` which internally holds a reusable instance of `BinaryDocValues`.
   
   In the `nextFacetLabel()` method `ordinalsSegmentReader.get(docId, decodedOrds)` that does the actual decoding of the ordinals is only called **once** if the input docId is the same as the one supplied in prior invocation of `nextFacetLabel()`.  
   
   Subsequent invocations fetch the ordinals from `FacetLabelReader.decodedOrds` and use the ordinal to lookup FacetLabels using `taxoReader.getPath(ord)`. 
   
   Yes, after LUCENE-9450, we can bulk resolve all the ordinals using a different API, something like -> `List<FacetLabels> taxoReader.getPaths(IntRef decodedOrds)` and add a new API - `FacetLabelReader.allFacetLabels(int docId)`
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494395610



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       Hmm, I was thinking this new per-segment class (`FacetLabelReader`) would hold onto a single `BinaryDocValues` instance, but you're right, that won't work.
   
   The facet ordinals are stored / compacted in sorted order in a single `BinaryDocValues` field which we decode here into `decodedOrds`.  Resolving those ordinals to `FacetLabel` is costly, currently retrieving one stored document per ord.  After LUCENE-9450 (switching from stored fields to BDV), we could pull a new BDV for each unique `docId` passed to `nextFacetLabel`, and then bulk resolve all the ordinals?
   
   Anyway, we can pursue all of this later -- this PR already looks awesome -- progress not perfection!

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java
##########
@@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con
     return facets;
   }
 
+  public List<List<FacetLabel>> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException {

Review comment:
       Thank you for adding this utility method so tests can easily use the new utility class!
   
   Can we rename this to `getAllTaxonomyFacetLabels`, and add javadoc explaining that the outer list is one entry per matched hit, and the inner list is one entry per `FacetLabel` belonging to that hit?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {
+      if (o1 == null) {

Review comment:
       Hmm why are these `null` checks necessary?  Are we really seeing `null` in the argument?  Oh, I guess this legitimately happens when the hit had no facets?  Maybe add a comment?  Hmm, actually, looking at how actual and expected are populated, neither of them seems to add `null`?  One of them filters out empty list but the other does not?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Hmm I think `expectedLabels` filters out empty `List<FacetLabel>` but `actualLabels` does not, so this might false trip?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {

Review comment:
       I'm confused why we are sorting the top list?  Isn't the top list in order of the hits?  And we want to confirm, for a given `docId` hit, that expected and actual labels match?
   
   OK, I think I understand: this test does not index anything allowing you to track which original doc mapped to which `FacetLabel`, so then you cannot know, per segment, which docs ended up where :)
   
   Given that, I think it's OK to do the top-level sort of all `List<FacetLabel>` across all hits.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494675000



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       The new per-segment class (`FacetLabelReader`) holds an instance of `DocValuesOrdinalsReader` which internally holds a reusable instance of `BinaryDocValues`.
   
   In the `nextFacetLabel()` method `ordinalsSegmentReader.get(docId, decodedOrds)` that does the actual decoding of the ordinals is only called **once** if the input docId is the same as the one supplied in prior invocation of `nextFacetLabel()`.  
   
   Subsequent invocations fetch the ordinals from `FacetLabelReader.decodedOrds` and use the ordinal to lookup FacetLabels using `taxoReader.getPath(ord)`. 
   
   Yes, after LUCENE-9450, we can bulk resolve all the ordinals using a different API, something like -> `List<FacetLabels> taxoReader.getPaths(IntRef decodedOrds)` and add a new API - `FacetLabelReader.allFacetLabels(int docId)`
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493919329



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {

Review comment:
       Done in this revision




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494655549



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -726,6 +743,39 @@ public void testRandom() throws Exception {
     IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir);
   }
 
+  private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) {
+    for (List<FacetLabel> facetLabels : allfacetLabels) {
+      Collections.sort(facetLabels);
+    }
+
+    Collections.sort(allfacetLabels, (o1, o2) -> {
+      if (o1 == null) {

Review comment:
       Thanks for catching this @mikemccand. I fixed the `actualLabels` to exclude empty lists. The null checks were just me being extra cautious. I realized they were unnecessary and removed them :-)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493137058



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Nice catch, I changed the implementation of `nextFacetLabel(...)` methods to throw `IllegalArgumentException` should docIds be supplied in decreasing order and updated this test to expect an `IllegalArgumentException` when docIds are supplied in decreasing order.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r495240661



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Hmm I don't like filtering out "no facet fields" case from `FacetTestCase.getAllTaxonomyFacetLabels()` -- that means callers cannot safely map hits to the `List<FacetLabel>`?
   
   Could we instead keep the empty lists in both expected and actual?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       Yeah, or we could bulk map all ord -> FacetLabel as soon as caller goes to the next `docId`, and then iterate them one by one with this API.
   
   Anyway, we don't need to solve that here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493041886



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;

Review comment:
       Hmm let's upgrade this to a real `if` not an `assert` (that requires that assertions are enabled), and change to throw `IllegalArgumentException`?  In general if it is a problem that a user could legitimately trip up on, it should be a real `if`, as long as performance impact of such micro-policing is acceptable (which should be the case here).

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;

Review comment:
       This one is a good usage of `assert` because this should only be violated if there is a bug in this code or our default `Codec`, etc.  No accidental abuse by users could cause this.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order

Review comment:
       Can we add a **Note** here, that the returned `FacetLabel` are not necessarily in the same order in which they were indexed?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;

Review comment:
       Can we upgrade this to a real `if`?  Caller might legitimately pass in a non-existing `facetDimension`?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Ahh, be careful here!  Lucene's `test-framework` sometimes randomly runs without assertions enabled, to help us catch bugs where we accidentally create code relying on assertions (it has happened, do not ask who).
   
   So when tests run with assertions disabled then this test would false-fail?  But, if we upgrade to real `if` that would also fix this.

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {

Review comment:
       Could we make a small change to an existing randomized test, maybe `TestTaxonomyFacetCounts.testRandom`, to use this API to randomly retrieve N facet labels?  (So we exercise the class beyond the simplish `testBasic` ... maybe we uncover a rare bug, somewhere).

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>

Review comment:
       Add same **Note** here as above comment?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       Hmm when we switch to `BinaryDocValues` for this lookup, we could fix this class to persistently hold a single `BinaryDocValues`, and improve performance for caller that need to look up multiple ordinals, which is the common case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494637269



##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java
##########
@@ -711,6 +723,11 @@ public void testRandom() throws Exception {
         }
       }
 
+      // Test facet labels for each matching test doc
+      List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc);
+      assertEquals(expectedLabels.size(), actualLabels.size());

Review comment:
       Nice catch, thanks.  I fixed `actualLabels` generation in `FacetTestCase.getAllTaxonomyFacetLabels()` method to filter out empty `List<facetLabels>`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493114208



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>

Review comment:
       done in next revision




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

Posted by GitBox <gi...@apache.org>.
goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493112985



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;

Review comment:
       Changed in the next revision.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;

Review comment:
       Got it thanks for explaining appropriate usage of assert and `IlegalArgumentException`

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order

Review comment:
       done in next revision

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>

Review comment:
       done in next revision

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;

Review comment:
       done in next revision.

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Nice catch, I changed the implementation to throw `IllegalArgumentExceeption` after ugrade to `if` :-)

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+    Document doc = new Document();
+    doc.add(new FacetField("Author", "Bob"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Lisa"));
+    doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Tom"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Susan"));
+    doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+    docs.add(doc);
+
+    doc = new Document();
+    doc.add(new FacetField("Author", "Frank"));
+    doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+    docs.add(doc);
+
+    return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+    DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+    while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+      docIds.add(disi.docID());
+    }
+
+    if (decreasingDocIds == true) {
+      Collections.reverse(docIds);
+    }
+    return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs, String dimension,
+                                             boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+    for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+      FacetLabel facetLabel;
+      for (Integer docId : docIds) {
+        while (true) {
+          if (dimension != null) {
+            facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+          } else {
+            facetLabel = facetLabelReader.nextFacetLabel(docId);
+          }
+
+          if (facetLabel == null) {
+            break;
+          }
+          facetLabels.add(facetLabel);
+        }
+      }
+    }
+
+    return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {
+    Directory dir = newDirectory();
+    Directory taxoDir = newDirectory();
+
+    // Writes facet ords to a separate directory from the main index:
+    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
+    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
+
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("Publish Date", true);
+
+    for (Document doc : prepareDocuments()) {
+      writer.addDocument(config.build(taxoWriter, doc));
+    }
+
+    // NRT open
+    IndexSearcher searcher = newSearcher(writer.getReader());
+    // NRT open
+    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
+
+    FacetsCollector fc = new FacetsCollector();
+    searcher.search(new MatchAllDocsQuery(), fc);
+
+    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
+
+    // Check labels for all dimensions
+    List<FacetLabel> facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs());
+    assertTrue(facetLabels.size() == 10);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Author".equals(l.components[0]))
+        .map(l -> l.components[1]).collect(Collectors.toSet())
+        .equals(Set.of("Bob", "Lisa", "Susan", "Frank", "Tom")));
+
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    // Check labels for specific dimension
+    facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), "Publish Date");
+    assertTrue(facetLabels.size() == 5);
+    assertTrue(facetLabels.stream()
+        .filter(l -> "Publish Date".equals(l.components[0]))
+        .map(l -> String.join("/", l.components[1], l.components[2], l.components[3]))
+        .collect(Collectors.toSet())
+        .equals(Set.of("2010/10/15", "2010/10/20", "2012/1/1", "2012/1/7", "1999/5/5")));
+
+    try {
+      facetLabels = lookupFacetLabels(taxoLabels, fc.getMatchingDocs(), null, true);
+      fail("Assertion error was not thrown for using docIds supplied in decreasing order");
+    } catch (AssertionError ae) {

Review comment:
       Nice catch, I changed the implementation of `nextFacetLabel(...)` methods to throw `IllegalArgumentException` should docIds be supplied in decreasing order and updated this test to expect an `IllegalArgumentException` when docIds are supplied in decreasing order.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.util.IntsRef;
+
+import java.io.IOException;
+
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL;
+import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL;
+
+/**
+ * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values,
+ * reducing your index size.
+ *
+ * @lucene.experimental
+ **/
+public class TaxonomyFacetLabels {
+
+  /**
+   * Index field name provided to the constructor
+   */
+  private final String indexFieldName;
+
+  /**
+   * {@code TaxonomyReader} provided to the constructor
+   */
+  private final TaxonomyReader taxoReader;
+
+  /**
+   * {@code FacetsConfig} provided to the constructor
+   */
+  private final FacetsConfig config;
+
+  /**
+   * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field
+   */
+  private final OrdinalsReader ordsReader;
+
+  /**
+   * Sole constructor.  Do not close the provided {@link TaxonomyReader} while still using this instance!
+   */
+  public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException {
+    this.taxoReader = taxoReader;
+    this.config = config;
+    this.indexFieldName = indexFieldName;
+    this.ordsReader = new DocValuesOrdinalsReader(indexFieldName);
+  }
+
+  /**
+   * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for
+   * multiple documents and (optionally) for a specific dimension.  You must create this per-segment,
+   * and then step through all hits, in order, for that segment.
+   *
+   * <p><b>NOTE</b>: This class is not thread-safe, so you must use a new instance of this
+   * class for each thread.</p>
+   *
+   * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field
+   * @return an instance of {@link FacetLabelReader}
+   * @throws IOException when a low-level IO issue occurs
+   */
+  public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException {
+    return new FacetLabelReader(ordsReader, readerContext);
+  }
+
+  /**
+   * Utility class to retrieve facet labels for multiple documents.
+   *
+   * @lucene.experimental
+   */
+  public class FacetLabelReader {
+    private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader;
+    private final IntsRef decodedOrds = new IntsRef();
+    private int currentDocId = -1;
+    private int currentPos = -1;
+
+    // Lazily set when nextFacetLabel(int docId, String facetDimension) is first called
+    private int[] parents;
+
+    /**
+     * Sole constructor.
+     */
+    public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException {
+      ordinalsSegmentReader = ordsReader.getReader(readerContext);
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more.
+     * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the
+     * next {@link FacetLabel} for that document.  Otherwise, it advances to the new {@code docId} and provides the
+     * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets.  Each
+     * new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * @param docId input docId provided in monotonic (non-decreasing) order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException when a low-level IO issue occurs
+     */
+    public FacetLabel nextFacetLabel(int docId) throws IOException {
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentDocId = docId;
+        currentPos = decodedOrds.offset;
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      if (currentPos == endPos) {
+        // no more FacetLabels
+        return null;
+      }
+
+      int ord = decodedOrds.ints[currentPos++];
+      return taxoReader.getPath(ord);
+    }
+
+    private boolean isDescendant(int ord, int ancestorOrd) {
+      while (ord != INVALID_ORDINAL && ord != ROOT_ORDINAL) {
+        if (parents[ord] == ancestorOrd) {
+          return true;
+        }
+        ord = parents[ord];
+      }
+      return false;
+    }
+
+    /**
+     * Retrieves the next {@link FacetLabel} for the specified {@code docId} under the requested {@code facetDimension},
+     * or {@code null} if there are no more. This method has state: if the provided {@code docId} is the same as the
+     * previous invocation, it returns the next {@link FacetLabel} for that document.  Otherwise, it advances to
+     * the new {@code docId} and provides the first {@link FacetLabel} for that document, or {@code null} if that document
+     * has no indexed facets.  Each new {@code docId} must be in strictly monotonic (increasing) order.
+     *
+     * <p><b>NOTE</b>: this method loads the {@code int[] parents} array from the taxonomy index</p>
+     *
+     * @param docId input docId provided in non-decreasing order
+     * @return the first or next {@link FacetLabel}, or {@code null} if there are no more
+     * @throws IOException if {@link TaxonomyReader} has problems getting path for an ordinal
+     */
+    public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException {
+      final int parentOrd = taxoReader.getOrdinal(new FacetLabel(facetDimension));
+      assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
+
+      if (currentDocId != docId) {
+        assert docId > currentDocId: "docs out of order!  currentDocId=" + currentDocId + " docId=" + docId;
+        ordinalsSegmentReader.get(docId, decodedOrds);
+        currentPos = decodedOrds.offset;
+        currentDocId = docId;
+      }
+
+      if (parents == null) {
+        parents = taxoReader.getParallelTaxonomyArrays().parents();
+      }
+
+      int endPos = decodedOrds.offset + decodedOrds.length;
+      assert currentPos <= endPos;
+
+      for (; currentPos < endPos; ) {
+        int ord = decodedOrds.ints[currentPos++];
+        if (isDescendant(ord, parentOrd) == true) {
+          return taxoReader.getPath(ord);

Review comment:
       By this class you mean `TaxonomyReader` because that is where `BinaryDocValues` will be cached as part of [LUCENE-9476](https://issues.apache.org/jira/browse/LUCENE-9476) improving performance for callers that need to lookup multiple ordinals ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org