Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/19 13:36:04 UTC

[GitHub] [lucene] bruno-roustant commented on a change in pull request #443: LUCENE-10062: Switch to numeric doc values for encoding taxonomy ordinals

bruno-roustant commented on a change in pull request #443:
URL: https://github.com/apache/lucene/pull/443#discussion_r753194847



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -409,9 +410,26 @@ private void processFacetFields(
         indexDrillDownTerms(doc, indexFieldName, dimConfig, facetLabel);
       }
 
-      // Facet counts:
-      // DocValues are considered stored fields:
-      doc.add(new BinaryDocValuesField(indexFieldName, dedupAndEncode(ordinals.get())));
+      // Store the taxonomy ordinals associated with each doc. Prefer to use SortedNumericDocValues
+      // but "fall back" to a custom binary format to maintain backwards compatibility with Lucene 8
+      // indexes.
+      if (taxoWriter.useNumericDocValuesForOrdinals()) {
+        // Dedupe and encode the ordinals. It's not important that we sort here
+        // (SortedNumericDocValuesField will handle this internally), but we

Review comment:
       Java's dual-pivot quicksort is optimized to detect nearly sorted data, so it is very fast in this case. (I double-checked with our SorterBenchmark.)
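
       To illustrate the point, here is a minimal sketch of sorting and de-duplicating ordinals that are already nearly sorted. It is not the code from this pull request: the class `OrdinalDedupSketch` and the method `sortAndDedupe` are hypothetical names, and it assumes the ordinals arrive in a plain `int[]` with a valid `length`. It relies only on `java.util.Arrays.sort(int[])`, which the JDK documents as a dual-pivot quicksort.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class OrdinalDedupSketch {

  /**
   * Returns the distinct values of ordinals[0..length) in ascending order.
   * Assumes 0 <= length <= ordinals.length.
   */
  static int[] sortAndDedupe(int[] ordinals, int length) {
    if (length == 0) {
      return new int[0];
    }
    // Work on a copy so the caller's array is left untouched.
    int[] copy = Arrays.copyOf(ordinals, length);
    // Arrays.sort(int[]) is the JDK's dual-pivot quicksort.
    Arrays.sort(copy);
    // After sorting, duplicates are adjacent: compact them in place.
    int upto = 1;
    for (int i = 1; i < copy.length; i++) {
      if (copy[i] != copy[upto - 1]) {
        copy[upto++] = copy[i];
      }
    }
    return Arrays.copyOf(copy, upto);
  }

  public static void main(String[] args) {
    // Nearly sorted input with a few duplicates.
    int[] ordinals = {1, 2, 3, 5, 4, 5, 8, 8, 9};
    // prints [1, 2, 3, 4, 5, 8, 9]
    System.out.println(Arrays.toString(sortAndDedupe(ordinals, ordinals.length)));
  }
}
```

       Sorting first keeps the de-duplication a single linear pass, so the total cost is dominated by the sort, which is cheap when the input is already close to sorted.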




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org