Posted to dev@lucene.apache.org by tirthmehta1994 <gi...@git.apache.org> on 2018/10/18 00:54:33 UTC
[GitHub] lucene-solr pull request #477: Block Expensive Queries custom component
GitHub user tirthmehta1994 opened a pull request:
https://github.com/apache/lucene-solr/pull/477
Block Expensive Queries custom component
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/walmartlabs/lucene-solr BlockExpensiveQueries
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/477.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #477
----
commit 31887130d577d581a75cbf0630c172defb2119f9
Author: tirthmehta1994 <ti...@...>
Date: 2018-10-18T00:42:50Z
Block Expensive Queries custom component
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:
https://github.com/apache/lucene-solr/pull/477
Sure @vthacker:
https://issues.apache.org/jira/browse/SOLR-12902
---
[GitHub] lucene-solr pull request #477: Block Expensive Queries custom component
Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/477#discussion_r228296662
--- Diff: solr/core/src/java/org/apache/solr/search/BlockExpensiveQueries.java ---
@@ -0,0 +1,99 @@
+package org.apache.solr.search;
+
+import java.io.IOException;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.util.TokenFilterFactory;
+import org.apache.solr.analysis.ReversedWildcardFilterFactory;
+import org.apache.solr.analysis.TokenizerChain;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.handler.component.ResponseBuilder;
+import org.apache.solr.handler.component.SearchComponent;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SortSpec;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This search component can be plugged into your SearchHandler if you would like to block some well known expensive queries.
+ * The queries that are blocked and failed by component currently are deep pagination queries as they are known to consume lot of memory and CPU
+ * <ul>
+ * <li> queries with a start offset which is greater than the configured maxStartOffset config parameter value
+ * <li> queries with a row param value which is greater than the configured maxRowsFetch config parameter value
+ * </ul>
+ *
+ * In future we would also like to extend this component to prevent
+ * <ul>
+ * <li> facet pivot queries, controlled by a config param
+ * <li> regular facet queries, controlled by a config param
+ * <li> query with wildcard in the prefix if the field does not have ReversedWildCartPattern configured
+ * </ul>
+ *
+ *
+ */
+
+public class BlockExpensiveQueries extends SearchComponent {
+
+ private static final Logger LOG = LoggerFactory.getLogger(BlockExpensiveQueries.class);
+
+ private int maxStartOffset = 10000;
+ private int maxRowsFetch = 1000;
+ private NamedList<?> initParams;
+
+ @Override
+ @SuppressWarnings("unchecked")
+ public void init(NamedList args) {
+ LOG.info("Loading the BlockExpensiveQueries component");
+ super.init(args);
+ this.initParams = args;
+
+ if (args != null) {
+ Object o = args.get("defaults");
+ if (o != null && o instanceof NamedList) {
+ maxStartOffset = (Integer)((NamedList)o).get("maxStartOffset");
+ maxRowsFetch = (Integer)((NamedList)o).get("maxRowsFetch");
+ LOG.info("Using maxStartOffset={}. maxRowsFetch={}", maxStartOffset, maxRowsFetch);
+ }
+ } else {
+ LOG.info("Using default values, maxStartOffset={}. maxRowsFetch={}", maxStartOffset, maxRowsFetch);
+ }
+ }
+
+ @Override
+ public void prepare(ResponseBuilder rb) throws IOException {
+ SolrQueryRequest req = rb.req;
+ SolrQueryResponse rsp = rb.rsp;
+ SortSpec sortSpec = rb.getSortSpec();
+ int offset = sortSpec.getOffset();
+ int count = sortSpec.getCount();
+ LOG.info("Query offset={}, rows={}", offset, count);
+
+ //check if cursorMark is used if we would like to allow deep pagination with cursor mark queries
+ boolean isDistributed = req.getParams().getBool("distrib", true);
+ if (isDistributed) {
+ String cursorMarkMsg = "Queries with high \"start\" or high \"rows\" parameters are a performance problem in Solr. " +
+ "If you really have a use-case for such queries, consider using \"cursors\" for pagination of results. " +
+ "Refer: https://lucene.apache.org/solr/guide/pagination-of-results.html.";
+ if (offset > maxStartOffset) {
+ throw new IOException(String.format("The start=%s value exceeded the max offset allowed value of %s. %s",
--- End diff --
Maybe this should be a SolrException with BAD_REQUEST as the error code?
So something like ...
`throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "error message");`
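To make the suggestion concrete, here is a minimal, self-contained sketch of the limit check that fails with a client-error code instead of an IOException. It is only an illustration under stated assumptions: `QueryLimitSketch`, `BadRequestException`, and `check` are hypothetical names that do not appear in the patch, and `BadRequestException` is a stand-in so the example runs without Solr on the classpath. In the actual component the throw would use `SolrException` with `SolrException.ErrorCode.BAD_REQUEST`, as in the line above.

```java
// Sketch only: the class and method names here are hypothetical, not from the PR.
final class QueryLimitSketch {

    // Hypothetical stand-in for SolrException with ErrorCode.BAD_REQUEST (HTTP 400).
    static final class BadRequestException extends RuntimeException {
        final int code = 400;

        BadRequestException(String message) {
            super(message);
        }
    }

    // Rejects requests whose start or rows value exceeds the configured limits.
    static void check(int start, int rows, int maxStartOffset, int maxRowsFetch) {
        if (start > maxStartOffset) {
            throw new BadRequestException(String.format(
                "The start=%d value exceeds the maximum allowed offset of %d.",
                start, maxStartOffset));
        }
        if (rows > maxRowsFetch) {
            throw new BadRequestException(String.format(
                "The rows=%d value exceeds the maximum allowed rows of %d.",
                rows, maxRowsFetch));
        }
    }

    public static void main(String[] args) {
        // Limits match the defaults in the patch: maxStartOffset=10000, maxRowsFetch=1000.
        check(0, 10, 10000, 1000); // within limits, no exception
        try {
            check(20000, 10, 10000, 1000); // start is above maxStartOffset
        } catch (BadRequestException e) {
            System.out.println(e.code + ": " + e.getMessage());
        }
    }
}
```

Reporting the failure as a 400 rather than an IOException tells the client that the request itself was invalid, rather than suggesting a server-side I/O failure.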
---
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:
https://github.com/apache/lucene-solr/pull/477
@vthacker, it would be great if you could share any updates here. Thanks.
---
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on the issue:
https://github.com/apache/lucene-solr/pull/477
Hi Tirth,
It would be great if we could have a test case for this.
---
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:
https://github.com/apache/lucene-solr/pull/477
Hi @vthacker, I have added a test case covering the scenario in which the custom component is useful. @anshumg, the test case has been added to the code. Please have a look.
---
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:
https://github.com/apache/lucene-solr/pull/477
Hi @vthacker, I have made some changes. Please have a look.
---
[GitHub] lucene-solr issue #477: Block Expensive Queries custom component
Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on the issue:
https://github.com/apache/lucene-solr/pull/477
Hi @tirthmehta1994, I see you've created 3+ PRs with patches on issues.
Would you mind creating a Solr Jira issue for each of them ( http://issues.apache.org/jira/browse/SOLR ), adding a short description, and posting the PR link?
We'd be happy to review the patches!
---