Posted to dev@lucene.apache.org by tirthmehta1994 <gi...@git.apache.org> on 2018/10/18 00:54:33 UTC

[GitHub] lucene-solr pull request #477: Block Expensive Queries custom component

GitHub user tirthmehta1994 opened a pull request:

    https://github.com/apache/lucene-solr/pull/477

    Block Expensive Queries custom component

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/walmartlabs/lucene-solr BlockExpensiveQueries

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/477.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #477
    
----
commit 31887130d577d581a75cbf0630c172defb2119f9
Author: tirthmehta1994 <ti...@...>
Date:   2018-10-18T00:42:50Z

    Block Expensive Queries custom component

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    Sure @vthacker:
    https://issues.apache.org/jira/browse/SOLR-12902


---



[GitHub] lucene-solr pull request #477: Block Expensive Queries custom component

Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/477#discussion_r228296662
  
    --- Diff: solr/core/src/java/org/apache/solr/search/BlockExpensiveQueries.java ---
    @@ -0,0 +1,99 @@
    +package org.apache.solr.search;
    +
    +import java.io.IOException;
    +
    +import org.apache.lucene.analysis.Analyzer;
    +import org.apache.lucene.analysis.util.TokenFilterFactory;
    +import org.apache.solr.analysis.ReversedWildcardFilterFactory;
    +import org.apache.solr.analysis.TokenizerChain;
    +import org.apache.solr.common.util.NamedList;
    +import org.apache.solr.handler.component.ResponseBuilder;
    +import org.apache.solr.handler.component.SearchComponent;
    +import org.apache.solr.request.SolrQueryRequest;
    +import org.apache.solr.response.SolrQueryResponse;
    +import org.apache.solr.search.SortSpec;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +/**
    + * This search component can be plugged into your SearchHandler if you would like to block some well known expensive queries.
    + * The queries that are blocked and failed by component currently are deep pagination queries as they are known to consume lot of memory and CPU
    + * <ul>
    + *  <li> queries with a start offset which is greater than the configured maxStartOffset config parameter value
    + *  <li> queries with a row param value which is greater than the configured maxRowsFetch config parameter value
    + * </ul>
    + *
    + * In future we would also like to extend this component to prevent
    + * <ul>
    + *  <li> facet pivot queries, controlled by a config param
    + *  <li> regular facet queries, controlled by a config param
    + *  <li> query with wildcard in the prefix if the field does not have ReversedWildCartPattern configured
    + * </ul>
    + *
    + *
    + */
    +
    +public class BlockExpensiveQueries extends SearchComponent {
    +
    +    private static final Logger LOG = LoggerFactory.getLogger(BlockExpensiveQueries.class);
    +
    +    private int maxStartOffset = 10000;
    +    private int maxRowsFetch = 1000;
    +    private NamedList<?> initParams;
    +
    +    @Override
    +    @SuppressWarnings("unchecked")
    +    public void init(NamedList args) {
    +        LOG.info("Loading the BlockExpensiveQueries component");
    +        super.init(args);
    +        this.initParams = args;
    +
    +        if (args != null) {
    +            Object o = args.get("defaults");
    +            if (o != null && o instanceof NamedList) {
    +                maxStartOffset = (Integer)((NamedList)o).get("maxStartOffset");
    +                maxRowsFetch = (Integer)((NamedList)o).get("maxRowsFetch");
    +                LOG.info("Using maxStartOffset={}. maxRowsFetch={}", maxStartOffset, maxRowsFetch);
    +            }
    +        } else {
    +            LOG.info("Using default values, maxStartOffset={}. maxRowsFetch={}", maxStartOffset, maxRowsFetch);
    +        }
    +    }
    +
    +    @Override
    +    public void prepare(ResponseBuilder rb) throws IOException {
    +        SolrQueryRequest req = rb.req;
    +        SolrQueryResponse rsp = rb.rsp;
    +        SortSpec sortSpec = rb.getSortSpec();
    +        int offset = sortSpec.getOffset();
    +        int count = sortSpec.getCount();
    +        LOG.info("Query offset={}, rows={}", offset, count);
    +
    +        //check if cursorMark is used if we would like to allow deep pagination with cursor mark queries
    +        boolean isDistributed = req.getParams().getBool("distrib", true);
    +        if (isDistributed) {
    +            String cursorMarkMsg = "Queries with high \"start\" or high \"rows\" parameters are a performance problem in Solr. " +
    +                                   "If you really have a use-case for such queries, consider using \"cursors\" for pagination of results. " +
    +                                   "Refer: https://lucene.apache.org/solr/guide/pagination-of-results.html.";
    +            if (offset > maxStartOffset) {
    +                throw new IOException(String.format("The start=%s value exceeded the max offset allowed value of %s. %s",
    --- End diff --
    
    Maybe this should be a SolrException with BAD_REQUEST as the error code?
    So something like ...
    
    `throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "error message");`
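
A minimal, self-contained sketch of the limit check with the suggested error type might look like the following. It is not the code from the pull request: `PaginationLimitSketch`, `checkLimits`, and `BadRequestException` are hypothetical names, and `BadRequestException` is only a stand-in for `SolrException` with `ErrorCode.BAD_REQUEST` so that the example compiles without Solr on the classpath. The limits mirror the defaults in the diff, and the `rows` check is assumed from the Javadoc description.

```java
// Illustrative sketch only; names below are assumptions, not the PR's code.
class PaginationLimitSketch {

    // Hypothetical stand-in for org.apache.solr.common.SolrException so the
    // sketch compiles without Solr on the classpath.
    static class BadRequestException extends RuntimeException {
        final int code = 400; // HTTP status that BAD_REQUEST corresponds to

        BadRequestException(String message) {
            super(message);
        }
    }

    // Default limits taken from the diff (maxStartOffset, maxRowsFetch).
    static final int MAX_START_OFFSET = 10000;
    static final int MAX_ROWS_FETCH = 1000;

    // Rejects a request whose start or rows value exceeds its limit.
    static void checkLimits(int offset, int rows) {
        if (offset > MAX_START_OFFSET) {
            throw new BadRequestException(String.format(
                "The start=%d value exceeded the max allowed offset of %d.",
                offset, MAX_START_OFFSET));
        }
        if (rows > MAX_ROWS_FETCH) {
            throw new BadRequestException(String.format(
                "The rows=%d value exceeded the max allowed rows of %d.",
                rows, MAX_ROWS_FETCH));
        }
    }

    public static void main(String[] args) {
        checkLimits(0, 10); // within both limits, returns normally
        try {
            checkLimits(20000, 10); // start is above the limit
        } catch (BadRequestException e) {
            System.out.println(e.code + ": " + e.getMessage());
        }
    }
}
```

Using an unchecked exception that carries a 400 status, rather than `IOException`, lets the caller see the failure as a client error instead of a server-side I/O problem.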


---



[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    @vthacker, it would be great if you could share any updates here. Thanks.


---



[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    Hi Tirth,
    
    It would be great if we could have a test case for this.


---



[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    Hi @vthacker, I have added a test case showing the scenario in which the custom component is useful. @anshumg, the test case has been added to the code. Please have a look.


---



[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by tirthmehta1994 <gi...@git.apache.org>.
Github user tirthmehta1994 commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    Hi @vthacker, I have made some changes. Please have a look.


---



[GitHub] lucene-solr issue #477: Block Expensive Queries custom component

Posted by vthacker <gi...@git.apache.org>.
Github user vthacker commented on the issue:

    https://github.com/apache/lucene-solr/pull/477
  
    Hi @tirthmehta1994, I see you've created 3+ PRs with patches.
    
    Would you mind creating a Solr Jira issue for each of them ( http://issues.apache.org/jira/browse/SOLR ), adding a short description, and posting the PR link?
    
    We'd be happy to review the patches!


---
