You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/11/03 22:26:16 UTC

[GitHub] [solr] dsmiley commented on a diff in pull request #1154: SOLR-16496: provide option for Query Elevation Component to bypass filters

dsmiley commented on code in PR #1154:
URL: https://github.com/apache/solr/pull/1154#discussion_r1013431841


##########
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java:
##########
@@ -553,6 +560,82 @@ private void setQuery(ResponseBuilder rb, Elevation elevation) {
     }
   }
 
+  /**
+   * Updates any filters that have been tagged for exclusion. Filters can be tagged for exclusion
+   * via the syntax fq={!tag=t1}field1:value1&elevate.excludeTags=t1 This method modifies each
+   * "excluded" filter so that it becomes a boolean OR of the original filter with an "include
+   * query" that matches the elevated documents by their IDs. To be clear, the "excluded" filters
+   * are not ignored entirely, but rather broadened so that the elevated documents are allowed
+   * through.
+   */
+  private void setFilters(ResponseBuilder rb, Elevation elevation) {
+    SolrParams params = rb.req.getParams();
+
+    String tagStr = params.get(QueryElevationParams.ELEVATE_EXCLUDE_TAGS);
+    if (StringUtils.isEmpty(tagStr)) {
+      // the parameter that specifies tags for exclusion was not provided or had no value
+      return;
+    }
+
+    List<String> excludeTags = StrUtils.splitSmart(tagStr, ',');
+    excludeTags.removeIf(s -> StringUtils.isBlank(s));
+    if (excludeTags.isEmpty()) {
+      // the parameter that specifies tags for exclusion was provided but the tag names were blank
+      return;
+    }
+
+    List<Query> filters = rb.getFilters();
+    if (filters == null || filters.isEmpty()) {
+      // no filters were provided
+      return;
+    }
+
+    Map<?, ?> tagMap = (Map<?, ?>) rb.req.getContext().get("tags");
+    if (tagMap == null) {
+      // no filters were tagged
+      return;
+    }
+
+    // TODO: this code is copied from FacetProcessor#handleFilterExclusions()
+    // duplication could be avoided by placing this code in a common utility method, perhaps in
+    // QueryUtils
+    IdentityHashMap<Query, Boolean> excludeSet = new IdentityHashMap<>();

Review Comment:
   The Boolean wasn't obvious to me but now I get it.  We can use com.carrotsearch.hppc.ObjectIdentityHashSet instead of IdentityHashMap since we want a set.



##########
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java:
##########
@@ -553,6 +560,82 @@ private void setQuery(ResponseBuilder rb, Elevation elevation) {
     }
   }
 
+  /**
+   * Updates any filters that have been tagged for exclusion. Filters can be tagged for exclusion
+   * via the syntax fq={!tag=t1}field1:value1&elevate.excludeTags=t1 This method modifies each
+   * "excluded" filter so that it becomes a boolean OR of the original filter with an "include
+   * query" that matches the elevated documents by their IDs. To be clear, the "excluded" filters
+   * are not ignored entirely, but rather broadened so that the elevated documents are allowed
+   * through.
+   */
+  private void setFilters(ResponseBuilder rb, Elevation elevation) {
+    SolrParams params = rb.req.getParams();
+
+    String tagStr = params.get(QueryElevationParams.ELEVATE_EXCLUDE_TAGS);
+    if (StringUtils.isEmpty(tagStr)) {
+      // the parameter that specifies tags for exclusion was not provided or had no value
+      return;
+    }
+
+    List<String> excludeTags = StrUtils.splitSmart(tagStr, ',');
+    excludeTags.removeIf(s -> StringUtils.isBlank(s));
+    if (excludeTags.isEmpty()) {
+      // the parameter that specifies tags for exclusion was provided but the tag names were blank
+      return;
+    }
+
+    List<Query> filters = rb.getFilters();
+    if (filters == null || filters.isEmpty()) {
+      // no filters were provided
+      return;
+    }
+
+    Map<?, ?> tagMap = (Map<?, ?>) rb.req.getContext().get("tags");
+    if (tagMap == null) {
+      // no filters were tagged
+      return;
+    }
+
+    // TODO: this code is copied from FacetProcessor#handleFilterExclusions()
+    // duplication could be avoided by placing this code in a common utility method, perhaps in
+    // QueryUtils
+    IdentityHashMap<Query, Boolean> excludeSet = new IdentityHashMap<>();
+    for (String excludeTag : excludeTags) {
+      Object olst = tagMap.get(excludeTag);
+      // tagMap has entries of List<String,List<QParser>>, but subject to change in the future
+      if (!(olst instanceof Collection)) continue;
+      for (Object o : (Collection<?>) olst) {
+        if (!(o instanceof QParser)) continue;
+        QParser qp = (QParser) o;
+        try {
+          excludeSet.put(qp.getQuery(), Boolean.TRUE);
+        } catch (SyntaxError syntaxError) {
+          // This should not happen since we should only be retrieving a previously parsed query
+          throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, syntaxError);
+        }
+      }
+    }
+
+    List<Query> updatedFilters = new ArrayList<Query>();
+
+    for (Query q : filters) {

Review Comment:
   please rename `q` to `filter` or `filterQuery` :-)



##########
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java:
##########
@@ -553,6 +560,82 @@ private void setQuery(ResponseBuilder rb, Elevation elevation) {
     }
   }
 
+  /**
+   * Updates any filters that have been tagged for exclusion. Filters can be tagged for exclusion
+   * via the syntax fq={!tag=t1}field1:value1&elevate.excludeTags=t1 This method modifies each
+   * "excluded" filter so that it becomes a boolean OR of the original filter with an "include
+   * query" that matches the elevated documents by their IDs. To be clear, the "excluded" filters
+   * are not ignored entirely, but rather broadened so that the elevated documents are allowed
+   * through.
+   */
+  private void setFilters(ResponseBuilder rb, Elevation elevation) {
+    SolrParams params = rb.req.getParams();
+
+    String tagStr = params.get(QueryElevationParams.ELEVATE_EXCLUDE_TAGS);
+    if (StringUtils.isEmpty(tagStr)) {
+      // the parameter that specifies tags for exclusion was not provided or had no value
+      return;
+    }
+
+    List<String> excludeTags = StrUtils.splitSmart(tagStr, ',');
+    excludeTags.removeIf(s -> StringUtils.isBlank(s));
+    if (excludeTags.isEmpty()) {
+      // the parameter that specifies tags for exclusion was provided but the tag names were blank
+      return;
+    }
+
+    List<Query> filters = rb.getFilters();
+    if (filters == null || filters.isEmpty()) {
+      // no filters were provided
+      return;
+    }
+
+    Map<?, ?> tagMap = (Map<?, ?>) rb.req.getContext().get("tags");
+    if (tagMap == null) {
+      // no filters were tagged
+      return;
+    }
+
+    // TODO: this code is copied from FacetProcessor#handleFilterExclusions()
+    // duplication could be avoided by placing this code in a common utility method, perhaps in
+    // QueryUtils
+    IdentityHashMap<Query, Boolean> excludeSet = new IdentityHashMap<>();
+    for (String excludeTag : excludeTags) {
+      Object olst = tagMap.get(excludeTag);
+      // tagMap has entries of List<String,List<QParser>>, but subject to change in the future
+      if (!(olst instanceof Collection)) continue;
+      for (Object o : (Collection<?>) olst) {
+        if (!(o instanceof QParser)) continue;
+        QParser qp = (QParser) o;
+        try {
+          excludeSet.put(qp.getQuery(), Boolean.TRUE);
+        } catch (SyntaxError syntaxError) {
+          // This should not happen since we should only be retrieving a previously parsed query
+          throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, syntaxError);
+        }
+      }
+    }
+
+    List<Query> updatedFilters = new ArrayList<Query>();
+
+    for (Query q : filters) {
+      if (!excludeSet.containsKey(q) || q instanceof CollapsingQParserPlugin.CollapsingPostFilter) {

Review Comment:
   hard-coding a comparison to a specific Query type is troubling.  Maybe generically the pattern should be a PostFilter?



##########
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java:
##########
@@ -553,6 +560,82 @@ private void setQuery(ResponseBuilder rb, Elevation elevation) {
     }
   }
 
+  /**
+   * Updates any filters that have been tagged for exclusion. Filters can be tagged for exclusion
+   * via the syntax fq={!tag=t1}field1:value1&elevate.excludeTags=t1 This method modifies each
+   * "excluded" filter so that it becomes a boolean OR of the original filter with an "include
+   * query" that matches the elevated documents by their IDs. To be clear, the "excluded" filters
+   * are not ignored entirely, but rather broadened so that the elevated documents are allowed
+   * through.
+   */
+  private void setFilters(ResponseBuilder rb, Elevation elevation) {
+    SolrParams params = rb.req.getParams();
+
+    String tagStr = params.get(QueryElevationParams.ELEVATE_EXCLUDE_TAGS);
+    if (StringUtils.isEmpty(tagStr)) {
+      // the parameter that specifies tags for exclusion was not provided or had no value
+      return;
+    }
+
+    List<String> excludeTags = StrUtils.splitSmart(tagStr, ',');
+    excludeTags.removeIf(s -> StringUtils.isBlank(s));
+    if (excludeTags.isEmpty()) {
+      // the parameter that specifies tags for exclusion was provided but the tag names were blank
+      return;
+    }
+
+    List<Query> filters = rb.getFilters();
+    if (filters == null || filters.isEmpty()) {
+      // no filters were provided
+      return;
+    }
+
+    Map<?, ?> tagMap = (Map<?, ?>) rb.req.getContext().get("tags");
+    if (tagMap == null) {
+      // no filters were tagged
+      return;
+    }
+
+    // TODO: this code is copied from FacetProcessor#handleFilterExclusions()
+    // duplication could be avoided by placing this code in a common utility method, perhaps in
+    // QueryUtils

Review Comment:
   Definitely should be factored out.  Hard to say where; I don't like QueryUtils as the destination.  How about this -- imagine a TaggedQueryContext class that encompasses most logic associated with populating "tags" and parsing it out?  Some of QParser's logic would go there as well, which is where it's populated today.  The proposed class wouldn't be instantiated; it'd just be utility methods pertaining to this.  One could argue it'd be nice if this class was also the holder of the value in the request context instead of the List<String,List<QParser>> directly.  Maybe?  WDYT?



##########
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java:
##########
@@ -553,6 +560,82 @@ private void setQuery(ResponseBuilder rb, Elevation elevation) {
     }
   }
 
+  /**
+   * Updates any filters that have been tagged for exclusion. Filters can be tagged for exclusion
+   * via the syntax fq={!tag=t1}field1:value1&elevate.excludeTags=t1 This method modifies each
+   * "excluded" filter so that it becomes a boolean OR of the original filter with an "include
+   * query" that matches the elevated documents by their IDs. To be clear, the "excluded" filters
+   * are not ignored entirely, but rather broadened so that the elevated documents are allowed
+   * through.
+   */
+  private void setFilters(ResponseBuilder rb, Elevation elevation) {
+    SolrParams params = rb.req.getParams();
+
+    String tagStr = params.get(QueryElevationParams.ELEVATE_EXCLUDE_TAGS);
+    if (StringUtils.isEmpty(tagStr)) {
+      // the parameter that specifies tags for exclusion was not provided or had no value
+      return;
+    }
+
+    List<String> excludeTags = StrUtils.splitSmart(tagStr, ',');
+    excludeTags.removeIf(s -> StringUtils.isBlank(s));
+    if (excludeTags.isEmpty()) {
+      // the parameter that specifies tags for exclusion was provided but the tag names were blank
+      return;
+    }
+
+    List<Query> filters = rb.getFilters();
+    if (filters == null || filters.isEmpty()) {
+      // no filters were provided
+      return;
+    }
+
+    Map<?, ?> tagMap = (Map<?, ?>) rb.req.getContext().get("tags");
+    if (tagMap == null) {
+      // no filters were tagged
+      return;
+    }
+
+    // TODO: this code is copied from FacetProcessor#handleFilterExclusions()
+    // duplication could be avoided by placing this code in a common utility method, perhaps in
+    // QueryUtils
+    IdentityHashMap<Query, Boolean> excludeSet = new IdentityHashMap<>();
+    for (String excludeTag : excludeTags) {
+      Object olst = tagMap.get(excludeTag);
+      // tagMap has entries of List<String,List<QParser>>, but subject to change in the future
+      if (!(olst instanceof Collection)) continue;
+      for (Object o : (Collection<?>) olst) {
+        if (!(o instanceof QParser)) continue;
+        QParser qp = (QParser) o;
+        try {
+          excludeSet.put(qp.getQuery(), Boolean.TRUE);
+        } catch (SyntaxError syntaxError) {
+          // This should not happen since we should only be retrieving a previously parsed query
+          throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, syntaxError);
+        }
+      }
+    }
+
+    List<Query> updatedFilters = new ArrayList<Query>();
+
+    for (Query q : filters) {
+      if (!excludeSet.containsKey(q) || q instanceof CollapsingQParserPlugin.CollapsingPostFilter) {
+        updatedFilters.add(q);
+        continue;
+      }
+
+      // we're looking at a filter that has been tagged for exclusion
+      // transform it into a boolean OR of the original filter with the "include query" matching the
+      // elevated docs
+      BooleanQuery.Builder queryBuilder = new BooleanQuery.Builder();
+      queryBuilder.add(q, BooleanClause.Occur.SHOULD);
+      queryBuilder.add(elevation.includeQuery, BooleanClause.Occur.SHOULD);
+      updatedFilters.add(queryBuilder.build());

Review Comment:
   We should probably check to see if the filter query in question also had cache=false local-param (via ExtendedQuery check).  If so, we need to ensure this BooleanQuery is wrapped so that it doesn't cache either.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org