You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/19 05:20:20 UTC

[GitHub] [lucene] zacharymorn opened a new pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

zacharymorn opened a new pull request #25:
URL: https://github.com/apache/lucene/pull/25


   # Description / solution
   
   Add option to skip indexing facet drill down terms
   
   # Tests
   * gradlew precommit is failing due to nocommit comment
   * added new passing tests
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598402280



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +478,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;

Review comment:
       Sure I've created a jira issue to track this https://issues.apache.org/jira/browse/LUCENE-9856. I see from https://issues.apache.org/jira/browse/LUCENE-9497 `error prone` was already integrated to be used as static analysis tool during compilation, so I guess it may just require some additional configuration. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600516372



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(

Review comment:
       Refactoring this functionality into its own method makes it so much easier to read. Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r603483275



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       Aha, looks like you already opened the backport PR!  https://github.com/apache/lucene-solr/pull/2471 Thanks @zacharymorn!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602651563



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       Thanks Robert! I've opened a issue available here https://issues.apache.org/jira/browse/LUCENE-9883




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598035067



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +478,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;

Review comment:
       Yeah I was surprised it didn't get caught by static analysis as well (my IDE did though).  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598398199



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".

Review comment:
       Updated. I also noticed I had put in some incorrect examples, so I fixed those as well.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,
+
+    /**
+     * Index dimension, sub-path and full-path drill down terms. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would index all values "a", "a foo", "a foo bar", and "a foo baz". This is
+     * the default / existing behavior.
+     */
+    ALL

Review comment:
       Added the `NONE` option. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r601951524



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       OK I looked, I think it would not be difficult. To improve this, we just have to make decisions about how we want switch-on-enums to work in our codebase.
   
   For example, we could enable `missingEnumCaseDespiteDefault` which would mean that every enum value must be explicitly listed in every switch, even if a `default` case exists. As most enums are small, it would not be that annoying and would ensure compile-time safety if new enum values were added later.
   
   This change would only impact switch statements on enums, so it might be worth looking into. We could still, separately, choose to enforce other things about switch statements in general such as requiring `default` case if we choose to do that, and it wouldn't break this check: we could just make it the pattern to throw an AssertionError from there for enums: all the values are known at compile-time, so we enforce no surprises. 
   And requiring `default` is good practice for switches on every other data type because all values are not known.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r597966307



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -372,7 +403,10 @@ private void processFacetFields(
 
         // Drill down:

Review comment:
       Hmm, maybe this is just a personal preference, but I'd find it more readable to do:
   
   ```suggestion
           // Drill down:
           if (disableDrillDown == false) {
             int start;
             if (ft.requireDimensionDrillDown) {
               start = 1;
             } else {
               start = 2;
             }
             for (int i = start; i <= cp.length; i++) {
               doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
             }
           }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r597596432



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,8 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  private boolean disableDrillDown = false;

Review comment:
       Can we rework this to be `enableDrillDown = true`? 
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600111781



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -408,7 +475,7 @@ private void processSSDVFacetFields(
         doc.add(new StringField(indexFieldName, fullPath, Field.Store.NO));

Review comment:
       Sounds good. I've added support for these options for SSDV as well in the latest commits. Actually during implementation I noticed there are test assertions for drill down for SSDV already, so I guess there's already overlap between the two (do we want to combine the overlap though?).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600992558



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {

Review comment:
       Good catch! For some reasons I've been looking at them the whole time without realizing I should change them. I've updated them in the latest commit and also remove some commented out `System.out.println()`. For the original naming though I'm guess `ft` stands for `fieldType`, but not sure what `cp` stands for.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599171328



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +511,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
         for (int i = 1; i <= label.length; i++) {
           doc.add(

Review comment:
       I originally thought it doesn't need to be updated as association facet field doesn't work with hierarchy, and the `int i = 1` in the for loop seems to be done on purpose:
   
   https://github.com/apache/lucene/blob/2678d68be8e1eab75706c3aedf27ca4c77182a22/lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java#L263
   
   But on a second thought / look I think it should use the same configuration now, particularly after `NONE` option is added. I've updated it accordingly. Good catch!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand merged pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand merged pull request #25:
URL: https://github.com/apache/lucene/pull/25


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600112770



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       I think we do. The implementation change in `setRequireDimensionDrillDown` should behave like the existing one (keeping or skipping indexing for dim only), existing tests passed (in particular `testRequireDimensionDrillDown`) and there are new tests added to verify the default scenarios (e.g. `testDrillDownTermsDefaultWithHierarchicalSetting`).  Is back-porting something only Lucene committers can do, or I can help out as well?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599170681



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -371,14 +426,26 @@ private void processFacetFields(
         }
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
-        for (int i = start; i <= cp.length; i++) {
-          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+          StringField fullPathField =

Review comment:
       I introduced `fullPathField` here as I thought it could use some explanation to add clarity, but I'm fine either way so I have inlined it here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598319815



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +478,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;

Review comment:
       Maybe open a follow-on issue?  Surely `javac` has some command line flags these days to warn about dead code?  If we turned it on would we be horrified by how much dead we have lying about?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598055282



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -87,6 +89,8 @@
      * True if drilling down by a whole dimension, to match all documents that had any value for
      * this dimension, is necessary (default is true)
      */
+    // nocommit should this (control for only label / dim name indexing) be deprecated to favor the

Review comment:
       I've pushed a new commit using enum with above definition for now, and moving the flag from `FacetsConfig` to `DimConfig` so that it's still configurable at the field level. Please let me know if any changes need to be made there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598320268



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".

Review comment:
       Instead of `"a foo bar baz"` could we consistently use `/`, i.e. change it to `a/foo/bar/baz`?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".

Review comment:
       And put `/` in the examples too, e.g. `a/foo`, `a/foo/bar`, ...

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -372,7 +403,10 @@ private void processFacetFields(
 
         // Drill down:

Review comment:
       I think this approach is good, i.e. "disable drill down" would also mean "disable the full path drill down", i.e. it would mean this `dim` adds zero tokens to the inverted index.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",

Review comment:
       Insert `drill` before `down terms`?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,
+
+    /**
+     * Index dimension, sub-path and full-path drill down terms. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would index all values "a", "a foo", "a foo bar", and "a foo baz". This is
+     * the default / existing behavior.
+     */
+    ALL

Review comment:
       Hmm don't we also need a `NONE`?  I.e. disable all drill down indexing entirely (the original goal of the issue I think)?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598035085



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -87,6 +89,8 @@
      * True if drilling down by a whole dimension, to match all documents that had any value for
      * this dimension, is necessary (default is true)
      */
+    // nocommit should this (control for only label / dim name indexing) be deprecated to favor the

Review comment:
       Using enum to replace the 2 booleans is a great idea! I'll change it in a bit. 
   
   However, I have been thinking about one question during when I worked on this yesterday, and think I should ask after seeing your comment above: does full path term count as "drill down term" ?  If it does count, and full path term should be skipped for indexing as well when drill down term indexing is disabled, then wouldn't the full path information provided via `FacetField(dimension, path1, path2, path3...)` is lost in the index? If it does not count, and we should always index full path term, then wouldn't `Index no drill down terms` == `Index only full path drill down`?
   
   For clarity, I've listed the enum options with different drill down terms examples below that I have in mind. Please let me know if this is the correct understanding.  
   
   ```
     /**
      * Drill down terms indexing option to control whether dimension and sub-path terms should be
      * indexed.
      */
     public enum DrillDownTermsIndexing {
       /**
        * Index only full-path drill down terms. No dimension nor sub-path indexing. 
        * e.g. for FaceField("a", "foo/bar/baz"), we would only index value "a foo bar baz"
        */
       FULL_PATH_ONLY,
   
       /**
        * Index sub-path and full-path down terms. No dimension indexing. 
        * e.g. for FaceField("a", "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz"
        */
       SUB_PATH_AND_FULL_PATH,
   
       /**
        * Index dimension and full-path drill down terms. No sub-path indexing. 
        * e.g. for FaceField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz" 
        * This is weird usage and currently not available.
        */
       // DIMENSION_AND_FULL_PATH,
   
       /**
        * Index dimension, sub-path and full-path drill down terms. 
        * e.g. for FaceField("a", "foo/bar/baz"), we would index all values "a", "a foo", "a foo bar", and "a foo baz" 
        * This is the default / existing behavior
        */
       ALL
     }
   ```

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,8 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  private boolean disableDrillDown = false;

Review comment:
       Thanks for the suggestion Robert! I had previously thought about it (prefer positive statement over negative one), but ended up choosing `disableDrillDown` variable / API naming as drill down terms indexing is an existing behavior, so I'm not sure if adding `enableDrillDown` variable / API may add confusion for existing Lucene clients that this indexing needs to be explicitly enabled.  
   
   However, I'm planning to change the API from taking a boolean to taking an enum instead as Mike suggested (`setDrillDownTermsIndexingOption`) , so I guess we should be good there?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600514785



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       minor: Should we consider not checking the NONE case here since it's already checked by the outer conditional? I'm surprised the IDE doesn't bark about this actually.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599195582



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -408,7 +475,7 @@ private void processSSDVFacetFields(
         doc.add(new StringField(indexFieldName, fullPath, Field.Store.NO));

Review comment:
       Yeah, these are interesting because they can't really honor all the options you've introduced in the new enum right? The enum is pretty centered around the taxonomy use-case, but then these other two sneak in.
   
   These other two (SSDV and AssociationFacet) can honor NONE, FULL_PATH_ONLY and DIMENSION_AND_FULL_PATH, but the other two options are weird because there aren't cases (that I'm aware of anyway) for indexing partial paths. Hmm. Well, at a minimum, I think they should honor those first three options. The current behavior is either DIMENSION_AND_FULL_PATH or FULL_PATH_ONLY, and NONE would be trivial to add. Need a little more thought on the other two. Maybe Mike has some additional ideas :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r597964197



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -178,6 +190,25 @@ public synchronized void setRequireDimensionDrillDown(String dimName, boolean v)
     ft.requireDimensionDrillDown = v;
   }
 
+  /**
+   * Specify whether drill down functionality as a whole should be disabled. Default to false for
+   * backward compatibility.

Review comment:
       I wonder if calling out "backward compatibility" here might create confusion. It reads like the "new best thing" is to disable drilling down, which really isn't the case. Seems like just defaulting to enabling drill down is a simple enough explanation (hey, that's what most users probably expect right?).
   
   Also, to echo Robert's comment above, I suggest enableDrillDown instead of focusing on "disabling"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598757471



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -371,14 +426,26 @@ private void processFacetFields(
         }
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
-        for (int i = start; i <= cp.length; i++) {
-          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+          StringField fullPathField =
+              new StringField(
+                  indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO);
+          doc.add(fullPathField);
+
+          if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH) {
+            doc.add(
+                new StringField(indexFieldName, pathToString(cp.components, 1), Field.Store.NO));
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM) {
+            for (int i = 2; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
+            for (int i = 1; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          }

Review comment:
       minor: you might consider the following, which might make it more clear to future readers that the only difference in these two options is the start length:
   
   ```suggestion
   } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM || ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
               int start;
               if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
                 start = 1;
               } else {
                 start = 2;
               }
               for (int i = start; i < cp.length; i++) {
                 doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
               }
             }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r601940550



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       @gsmiller @zacharymorn we may also choose to enable some ECJ checks for switch statements. I will do a quick test of this to see how difficult it would be to turn some of these on.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599171007



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -371,14 +426,26 @@ private void processFacetFields(
         }
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
-        for (int i = start; i <= cp.length; i++) {
-          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+          StringField fullPathField =
+              new StringField(
+                  indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO);
+          doc.add(fullPathField);
+
+          if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH) {
+            doc.add(
+                new StringField(indexFieldName, pathToString(cp.components, 1), Field.Store.NO));
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM) {
+            for (int i = 2; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
+            for (int i = 1; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          }

Review comment:
       hmmm tbh I actually preferred the existing one for readability as each option is checked once, whereas the suggested one has additional branching to check for one option twice. I see both styles having their pros & cons... 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602012262



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       > OK I looked, I think it would not be difficult. To improve this, we just have to make decisions about how we want switch-on-enums to work in our codebase.
   > 
   > For example, we could enable `missingEnumCaseDespiteDefault` which would mean that every enum value must be explicitly listed in every switch, even if a `default` case exists. As most enums are small, it would not be that annoying and would ensure compile-time safety if new enum values were added later.
   > 
   > This change would only impact switch statements on enums, so it might be worth looking into. We could still, separately, choose to enforce other things about switch statements in general such as requiring `default` case if we choose to do that, and it wouldn't break this check: we could just make it the pattern to throw an AssertionError from there for enums: all the values are known at compile-time, so we enforce no surprises.
   > And requiring `default` is good practice for switches on every other data type because all values are not known.
   
   Thanks Robert for the suggestion on this! I like both enforcement ideas. I gave `missingEnumCaseDespiteDefault` a try and can see a few need to be added already. Do you mind I create a spin-off issue for that and work on it separately? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602728130



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       Awesome, thanks @zacharymorn and @rmuir!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602160575



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       please, of course it is best to do any change like this with a separate issue/PR!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598398449



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,

Review comment:
       Updated. Yeah I wasn't sure before if full-path is considered as drill down terms as well, but the new `NONE` option makes it clear now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600513254



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {

Review comment:
       Does anyone know what "ft" and "cp" actually stand for? Is this a case where the class names changed at some point but the variable names stayed the same? I dunno. Either way, is it time to maybe name these something more descriptive in this method?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600111254



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH) {
+        doc.add(new StringField(indexFieldName, pathToString(cp.components, 1), Field.Store.NO));
+      } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM) {
+        for (int i = 2; i < cp.length; i++) {
+          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        }
+      } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
+        for (int i = 1; i < cp.length; i++) {
+          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        }
+      }

Review comment:
       Good call for the assertion error for the potentially new and unhandled enum! I've updated it to use `switch` statement.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598748091



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,
+
+    /**
+     * Index dimension, sub-path and full-path drill down terms. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would index all values "a", "a foo", "a foo bar", and "a foo baz". This is
+     * the default / existing behavior.
+     */
+    ALL

Review comment:
       Love the examples in the doc. Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599106387



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +511,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
         for (int i = 1; i <= label.length; i++) {
           doc.add(

Review comment:
       Does this also need to be modified to honor the config?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r597413482



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +478,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;

Review comment:
       This `start` variable is not used in code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599105765



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -408,7 +475,7 @@ private void processSSDVFacetFields(
         doc.add(new StringField(indexFieldName, fullPath, Field.Store.NO));

Review comment:
       Does this also need to get modified?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599171189



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -408,7 +475,7 @@ private void processSSDVFacetFields(
         doc.add(new StringField(indexFieldName, fullPath, Field.Store.NO));

Review comment:
       I think this would function like the existing behavior since I made the corresponding changes for setting `requireDimensionDrillDown`:
   
   ```
     /**
      * Specify whether drill down on the dimension is necessary.
      *
      * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
      *     instead
      */
     @Deprecated
     public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {
       DimConfig ft = fieldTypes.get(dimName);
       if (ft == null) {
         ft = new DimConfig();
         fieldTypes.put(dimName, ft);
       }
   
       ft.drillDownTermsIndexing =
           v ? DrillDownTermsIndexing.ALL : DrillDownTermsIndexing.ALL_PATHS_NO_DIM; // drillDownTermsIndexing = true is now drillDownTermsIndexing = DrillDownTermsIndexing.ALL
     }
   ```
   
   but I guess a relevant question would be, do we want to support different indexing options here as well like in `processFacetFields` ? My understanding is this method is not used for taxonomy faceting, so we may not need to add indexing for drill down terms, but I may be wrong. What do you think @mikemccand ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598035273



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -178,6 +190,25 @@ public synchronized void setRequireDimensionDrillDown(String dimName, boolean v)
     ft.requireDimensionDrillDown = v;
   }
 
+  /**
+   * Specify whether drill down functionality as a whole should be disabled. Default to false for
+   * backward compatibility.

Review comment:
       Thank Greg for the comment! I was coding at night after some beer so might have used the words a bit casually :D , but yes my original meaning there was that the default setting should not break existing Lucene clients upgrading from older version. I'll re-word this java doc entirely as I'll change the API to taking enum instead of boolean. 
   
   For `enableDrillDown` vs. `disableDrillDown`, please see my reply to Robert's comment above.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -372,7 +403,10 @@ private void processFacetFields(
 
         // Drill down:

Review comment:
       I think this logic may or may not be correct actually, as it depends on whether full path terms is considered as "drill down terms" (see my question to Mike above).
   
   Also, the index `i` in `pathToString(cp.components, i)` above is used as "length". So with the original implementation when `start = cp.length`,  `pathToString(cp.components, start)`  will still be executed and give the full path including every element inside `cp.components`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on pull request #25:
URL: https://github.com/apache/lucene/pull/25#issuecomment-805432620


   > This is looking close; thanks @zacharymorn!
   > 
   > And thank you for opening the spinoff issue https://issues.apache.org/jira/browse/LUCENE-9856 -- it now has a life all its own and a massive PR fixing many "I see dead code!" problems: [#34 (review)](https://github.com/apache/lucene/pull/34#pullrequestreview-618569001)
   
   No problem and it's my pleasure. Thank you for suggesting opening that issue as well!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598397987



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -372,7 +403,10 @@ private void processFacetFields(
 
         // Drill down:

Review comment:
       Updated to use the new `DrillDownTermsIndexing.NONE` option to denote the all drill down disabled condition. Also refactored a bit more there to support the `DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH` option that was previously commented out.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".

Review comment:
       Updated. I used `a foo bar baz` before as that's the format computed by `pathToString(cp.components, i)` and fed into the value of `StringField`, but yeah given that is actually internal drill down term indexing format, we should probably not expose that in java doc and favor `a/foo/bar/baz` instead.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",

Review comment:
       Ops. Updated. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r597761931



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -444,12 +478,6 @@ private void processAssocFacetFields(
         FacetsConfig.DimConfig ft = getDimConfig(field.dim);
 
         // Drill down:
-        int start;

Review comment:
       Good catch!  It's crazy our static tooling can't ... help us remove such code :)

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -87,6 +89,8 @@
      * True if drilling down by a whole dimension, to match all documents that had any value for
      * this dimension, is necessary (default is true)
      */
+    // nocommit should this (control for only label / dim name indexing) be deprecated to favor the

Review comment:
       I think there three use cases we want to enable, which we are now doing with two booleans:
   
     * Index no drill down terms
     * Index dimension and full path drill down
     * Index only full path drill down
   
   I think the 4th logical option (Index only dimension) is very weird and we don't need to make that easy?
   
   So maybe we could make an `enum` for these three, instead of two booleans?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598753012



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -371,14 +426,26 @@ private void processFacetFields(
         }
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
-        for (int i = start; i <= cp.length; i++) {
-          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+          StringField fullPathField =

Review comment:
       minor: You could collapse these two lines and skip declaring the local variable fullPathField to be consistent with your other code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on pull request #25:
URL: https://github.com/apache/lucene/pull/25#issuecomment-808759691


   > This looks awesome! Thank you for fixing all the silly `ft` and `cp` names too!
   
   Thanks for the review and approval!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602739378



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       > Is back-porting something only Lucene committers can do, or I can help out as well?
   
   Oh feel free to help!  That would be very welcome :)  It's a bit tricky right now since Lucene 9.x dev is happening in different repository than 8.x.  Not sure if the normal cherry-pick approach "just works"?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598320523



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,
+
+    /**
+     * Index dimension, sub-path and full-path drill down terms. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would index all values "a", "a foo", "a foo bar", and "a foo baz". This is
+     * the default / existing behavior.
+     */
+    ALL

Review comment:
       Hmm I think we should also add `NONE`, i.e. don't even index the full-path drill down token?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602745425



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       Sounds good! I assume for 8.x in different repo you are referring to this branch in the old repo right https://github.com/apache/lucene-solr/tree/branch_8x ? And I guess back-porting just means I need to initiate a separate PR with this change, but against that branch / repo, and the PR will go through the same review and approval process to be merged right ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598398375



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,

Review comment:
       Renamed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r598321546



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,
+
+    /**
+     * Index dimension and full-path drill down terms. No sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index values "a" and "a foo baz". This is weird
+     * usage and currently not available.
+     */
+    // DIMENSION_AND_FULL_PATH,

Review comment:
       Actually, I am now thinking that this case might be helpful?  (It was the "only dim and NOT full path" case that I thought seems pointless) ... this case could e.g. be used to find all docs that have at least one facet label on the dimension.  So, maybe uncomment this and fixup the doc, e.g. add `/` in, remove the weirdness statement?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -65,6 +65,38 @@
   // int/float/bytes in a single indexed field:
   private final Map<String, String> assocDimTypes = new ConcurrentHashMap<>();
 
+  /**
+   * Drill down terms indexing option to control whether dimension and sub-path terms should be
+   * indexed.
+   */
+  public enum DrillDownTermsIndexing {
+    /**
+     * Index only full-path drill down terms. No dimension nor sub-path indexing. e.g. for
+     * FacetField("a", "foo/bar/baz"), we would only index value "a foo bar baz".
+     */
+    FULL_PATH_ONLY,
+
+    /**
+     * Index sub-path and full-path down terms. No dimension indexing. e.g. for FacetField("a",
+     * "foo/bar/baz"), we would only index values "a foo", "a foo bar", and "a foo baz".
+     */
+    SUB_PATH_AND_FULL_PATH,

Review comment:
       Maybe rename to `ALL_PATHS_NO_DIM`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r602004669



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {

Review comment:
       I'm 99% sure that's a correct guess.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600993992



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       So not checking `NONE` in this switch statement will actually cause this case (and the `FULL_PATH_ONLY` case) to fall through and reach the default branch, where an assertion error would be raised. I've considered not using `switch` due to these additional checks, but overall I still find them to be more readable than the original if-else-if-else branches.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r601500635



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {

Review comment:
       My money's on "cp == category path" :)

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,42 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      // index full-path drill down term
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      switch (ft.drillDownTermsIndexing) {
+        case NONE:

Review comment:
       Got it, thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r603456522



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       Yes, that is exactly right @zacharymorn!  So the PR would appear in the old (`lucene-solr`) github repository, not here.  Thank you!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
gsmiller commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599193102



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -371,14 +426,26 @@ private void processFacetFields(
         }
 
         // Drill down:
-        int start;
-        if (ft.requireDimensionDrillDown) {
-          start = 1;
-        } else {
-          start = 2;
-        }
-        for (int i = start; i <= cp.length; i++) {
-          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+          StringField fullPathField =
+              new StringField(
+                  indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO);
+          doc.add(fullPathField);
+
+          if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH) {
+            doc.add(
+                new StringField(indexFieldName, pathToString(cp.components, 1), Field.Store.NO));
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM) {
+            for (int i = 2; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
+            for (int i = 1; i < cp.length; i++) {
+              doc.add(
+                  new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+            }
+          }

Review comment:
       Yeah, that's fair. My thinking was that the loop start being the only difference could be overlooked in a future change, but I don't really have strong opinions here. Thanks for considering a different approach :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r599557787



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      doc.add(
+          new StringField(indexFieldName, pathToString(cp.components, cp.length), Field.Store.NO));
+
+      if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH) {
+        doc.add(new StringField(indexFieldName, pathToString(cp.components, 1), Field.Store.NO));
+      } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL_PATHS_NO_DIM) {
+        for (int i = 2; i < cp.length; i++) {
+          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        }
+      } else if (ft.drillDownTermsIndexing == DrillDownTermsIndexing.ALL) {
+        for (int i = 1; i < cp.length; i++) {
+          doc.add(new StringField(indexFieldName, pathToString(cp.components, i), Field.Store.NO));
+        }
+      }

Review comment:
       Maybe insert `} else { throw AssertionError(...)` to catch future accident of adding a new `enum` value and failing to fix this tricky code?
   
   If we switch to `switch` you could add a `default` case and `throw` there.

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -408,7 +475,7 @@ private void processSSDVFacetFields(
         doc.add(new StringField(indexFieldName, fullPath, Field.Store.NO));

Review comment:
       Indeed, this is a weird forced-fit: SSDV facets are not hierarchical, or rather, they have only `dim` and `label`.
   
   But I think all of the new `enum` options can be implemented correctly here, as if this were a taxonomy facets field of depth two (`a/b`)?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      doc.add(

Review comment:
       Could you add a comment above this e.g. `// index full-path token` or so?

##########
File path: lucene/facet/src/test/org/apache/lucene/facet/TestDrillDownQuery.java
##########
@@ -299,4 +300,233 @@ public void testRequireDimensionDrillDown() throws Exception {
     assertEquals(0, searcher.count(q));
     IOUtils.close(taxoReader, reader, writer, dir, taxoDir);
   }
+
+  public void testSkipDrillDownTermsIndexing() throws Exception {
+    Directory dir = newDirectory();
+    RandomIndexWriter writer =
+        new RandomIndexWriter(
+            random(),
+            dir,
+            newIndexWriterConfig(new MockAnalyzer(random(), MockTokenizer.KEYWORD, false)));
+    Directory taxoDir = newDirectory();
+    TaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir);
+    FacetsConfig config = new FacetsConfig();
+    config.setDrillDownTermsIndexing("a", FacetsConfig.DrillDownTermsIndexing.FULL_PATH_ONLY);
+    config.setDrillDownTermsIndexing("b", FacetsConfig.DrillDownTermsIndexing.FULL_PATH_ONLY);
+
+    Document doc = new Document();
+    doc.add(new FacetField("a", "1"));
+    doc.add(new FacetField("b", "2"));
+    writer.addDocument(config.build(taxoWriter, doc));
+    taxoWriter.close();
+
+    IndexReader reader = writer.getReader();
+    DirectoryTaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDir);
+    IndexSearcher searcher = newSearcher(reader);
+
+    DrillDownQuery q = new DrillDownQuery(config);
+    q.add("a");
+    // no hits because we disabled dimension drill down completely
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("a", "1");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("b");
+    // no hits because we disabled dimension drill down completely
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("b", "2");
+    assertEquals(1, searcher.count(q));
+
+    IOUtils.close(taxoReader, reader, writer, dir, taxoDir);
+  }
+
+  public void testDrillDownTermsDifferentOptions() throws Exception {
+    Directory dir = newDirectory();
+    RandomIndexWriter writer =
+        new RandomIndexWriter(
+            random(),
+            dir,
+            newIndexWriterConfig(new MockAnalyzer(random(), MockTokenizer.KEYWORD, false)));
+    Directory taxoDir = newDirectory();
+    TaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir);
+    FacetsConfig config = new FacetsConfig();
+    config.setHierarchical("a", true);
+    config.setHierarchical("b", true);
+    config.setHierarchical("c", true);
+    config.setHierarchical("d", true);
+    config.setHierarchical("e", true);
+    config.setDrillDownTermsIndexing("a", FacetsConfig.DrillDownTermsIndexing.NONE);
+    config.setDrillDownTermsIndexing("b", FacetsConfig.DrillDownTermsIndexing.FULL_PATH_ONLY);
+    config.setDrillDownTermsIndexing("c", FacetsConfig.DrillDownTermsIndexing.ALL_PATHS_NO_DIM);
+    config.setDrillDownTermsIndexing(
+        "d", FacetsConfig.DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH);
+    config.setDrillDownTermsIndexing("e", FacetsConfig.DrillDownTermsIndexing.ALL);
+    config.setDrillDownTermsIndexing("f", FacetsConfig.DrillDownTermsIndexing.NONE);
+    config.setIndexFieldName("f", "facet-for-f");
+
+    Document doc = new Document();
+    doc.add(new FacetField("a", "a1", "a2", "a3"));
+    doc.add(new FacetField("b", "b1", "b2", "b3"));
+    doc.add(new FacetField("c", "c1", "c2", "c3"));
+    doc.add(new FacetField("d", "d1", "d2", "d3"));
+    doc.add(new FacetField("e", "e1", "e2", "e3"));
+    doc.add(new IntAssociationFacetField(5, "f", "f1"));
+    writer.addDocument(config.build(taxoWriter, doc));
+    taxoWriter.close();
+
+    IndexReader reader = writer.getReader();
+    DirectoryTaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDir);
+    IndexSearcher searcher = newSearcher(reader);
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.NONE option
+    DrillDownQuery q = new DrillDownQuery(config);
+    q.add("a");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("a", "a1");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("a", "a1", "a2");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("a", "a1", "a2", "a3");
+    assertEquals(0, searcher.count(q));
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.FULL_PATH_ONLY option
+    q = new DrillDownQuery(config);
+    q.add("b");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("b", "b1");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("b", "b1", "b2");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("b", "b1", "b2", "b3");
+    assertEquals(1, searcher.count(q));
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.ALL_PATHS_NO_DIM option
+    q = new DrillDownQuery(config);
+    q.add("c");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("c", "c1");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("c", "c1", "c2");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("c", "c1", "c2", "c3");
+    assertEquals(1, searcher.count(q));
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH option
+    q = new DrillDownQuery(config);
+    q.add("d");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("d", "d1");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("d", "d1", "d2");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("d", "d1", "d2", "d3");
+    assertEquals(1, searcher.count(q));
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.ALL option
+    q = new DrillDownQuery(config);
+    q.add("e");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("e", "e1");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("e", "e1", "e2");
+    assertEquals(1, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("e", "e1", "e2", "e3");
+    assertEquals(1, searcher.count(q));
+
+    // Verifies for FacetsConfig.DrillDownTermsIndexing.DIMENSION_AND_FULL_PATH option with
+    // IntAssociationFacetField
+    q = new DrillDownQuery(config);
+    q.add("f");
+    assertEquals(0, searcher.count(q));
+
+    q = new DrillDownQuery(config);
+    q.add("f", "f1");
+    assertEquals(0, searcher.count(q));
+
+    IOUtils.close(taxoReader, reader, writer, dir, taxoDir);
+  }
+
+  public void testDrillDownTermsDefaultWithHierarchicalSetting() throws Exception {

Review comment:
       Thank you @zacharymorn for the added test cases!

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -168,14 +203,34 @@ public synchronized void setIndexFieldName(String dimName, String indexFieldName
     ft.indexFieldName = indexFieldName;
   }
 
-  /** Specify whether drill down on just the dimension is necessary. */
+  /**
+   * Specify whether drill down on the dimension is necessary.
+   *
+   * @deprecated Use {@link FacetsConfig#setDrillDownTermsIndexing(String, DrillDownTermsIndexing)}
+   *     instead
+   */
+  @Deprecated
   public synchronized void setRequireDimensionDrillDown(String dimName, boolean v) {

Review comment:
       We should have full backwards compatibility with this, right?  In which case we could backport this change to 8.x too!

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {

Review comment:
       Maybe, keep the outer `if` checking for `not NONE`, but and then add the full-path term here, but then switch to `switch` for the inner three cases?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] zacharymorn commented on a change in pull request #25: LUCENE-9385: Add option to skip indexing facet drill down terms

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #25:
URL: https://github.com/apache/lucene/pull/25#discussion_r600112047



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -388,6 +434,26 @@ private void processFacetFields(
     }
   }
 
+  private void indexDrillDownTerms(
+      Document doc, String indexFieldName, DimConfig ft, FacetLabel cp) {
+    if (ft.drillDownTermsIndexing != DrillDownTermsIndexing.NONE) {
+      doc.add(

Review comment:
       Added.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org