Posted to commits@cassandra.apache.org by "Andres de la Peña (Jira)" <ji...@apache.org> on 2022/02/11 11:27:00 UTC

[jira] [Comment Edited] (CASSANDRA-17187) Guardrail for SELECT IN terms and their cartesian product

    [ https://issues.apache.org/jira/browse/CASSANDRA-17187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490859#comment-17490859 ] 

Andres de la Peña edited comment on CASSANDRA-17187 at 2/11/22, 11:26 AM:
--------------------------------------------------------------------------

Here is the patch:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/1444]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1296/workflows/f532c245-0374-43ab-b991-c58fab0c135d] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1296/workflows/57c48e3e-5fc0-4df1-82ae-c50592b79427]|

The guardrail is relatively simple but the patch is a bit noisy because we need to pass the {{ClientState}} to {{PartitionKeySingleRestrictionSet}} and {{ClusteringColumnRestrictions}}, which affects a number of classes along the way.

There is also a minor refactor of the signatures of the utility methods {{GuardrailTester#assertWarns}} and {{GuardrailTester#assertFails}} that (trivially) touches a number of tests. The reason for this refactor is that a single query can trigger multiple guardrails and thus produce multiple warnings.
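
To illustrate the reasoning behind that refactor, here is a minimal sketch of an assertion helper that accepts several expected warnings at once. The class and method names ({{MultiWarningAssertSketch}}, {{assertWarnings}}) and the warning strings are hypothetical placeholders, not the actual {{GuardrailTester}} API or real guardrail messages.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not the real GuardrailTester: checks a list of warnings in one call. */
public class MultiWarningAssertSketch
{
    /** Fails unless the actual warnings match the expected ones, in the same order. */
    static void assertWarnings(List<String> actual, String... expected)
    {
        List<String> expectedList = Arrays.asList(expected);
        if (!actual.equals(expectedList))
            throw new AssertionError("Expected warnings " + expectedList + " but got " + actual);
    }

    public static void main(String[] args)
    {
        // Placeholder messages: one query that tripped two different guardrails.
        List<String> warnings = List.of("first guardrail warning", "second guardrail warning");
        assertWarnings(warnings, "first guardrail warning", "second guardrail warning");
        System.out.println("all expected warnings found");
    }
}
```

With a single-message signature, the second warning in a case like this could not be asserted in the same call.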


was (Author: adelapena):
Here is the patch:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/1444]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1296/workflows/f532c245-0374-43ab-b991-c58fab0c135d] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1296/workflows/57c48e3e-5fc0-4df1-82ae-c50592b79427]|

The patch is relatively simple but the patch is a bit noisy because we need to pass the {{ClientState}} to {{PartitionKeySingleRestrictionSet}} and {{ClusteringColumnRestrictions}}, and this affects a bunch of classes in the way.

There is also a minor refactor of the signatures of the utility methods {{GuardrailTester#assertWarns}} and {{GuardrailTester#assertFails}} that (trivially) touches a bunch of tests. The reason for this refactor is that we need to consider that a query can trigger multiple guardrails and thus produce multiple warnings.

> Guardrail for SELECT IN terms and their cartesian product
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-17187
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17187
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Feature/Guardrails
>            Reporter: Andres de la Peña
>            Assignee: Andres de la Peña
>            Priority: Normal
>              Labels: lhf
>             Fix For: 4.x
>
>
> Add a guardrail to limit the number of restrictions generated by the cartesian product of the {{IN}} restrictions of a {{SELECT}} query, for example:
> {code}
> # Guardrail to warn or abort when an IN query creates a cartesian product with a
> # size exceeding the threshold, e.g. "a in (1,2,...10) and b in (1,2,...10)" results in
> # a cartesian product of 100.
> # The two thresholds default to -1 to disable.
> in_select_cartesian_product:
>     warn_threshold: -1
>     abort_threshold: -1
> {code}
> As an example of why this guardrail is proposed, these queries bring a C* instance to its knees even before the query starts executing:
> {code}
> @Test
> public void testPartitionKeyTerms() throws Throwable
> {
>     createTable("CREATE TABLE %s (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, pk7 int, pk8 int, pk9 int, " +
>                "PRIMARY KEY((pk1, pk2, pk3, pk4, pk5, pk6, pk7, pk8, pk9)))");
>     execute("SELECT * FROM %s WHERE pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk2 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk5 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk7 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND pk9 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);");
> }
> @Test
> public void testClusteringKeyTerms() throws Throwable
> {
>     createTable("CREATE TABLE %s (pk int, ck1 int, ck2 int, ck3 int, ck4 int, ck5 int, ck6 int, ck7 int, ck8 int, ck9 int, " +
>             "PRIMARY KEY(pk, ck1, ck2, ck3, ck4, ck5, ck6, ck7, ck8, ck9))");
>     execute("SELECT * FROM %s WHERE pk = 1 " +
>             "AND ck1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck2 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck5 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck7 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) " +
>             "AND ck9 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);");
> }
> {code}
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on cartesian product in the guardrails section of cassandra.yaml.
> # Add a {{getInCartesianProduct}} method in {{GuardrailsConfig}} returning a {{Threshold.Config}} object
> # Implement that method in {{GuardrailsOptions}}, which is the default yaml-based implementation of {{GuardrailsConfig}}
> # Add a Threshold guardrail named {{inCartesianProduct}} in {{Guardrails}}, using the previously created config
> # Define JMX-friendly getters and setters for the previously created config in {{GuardrailsMBean}}
> # Implement the JMX-friendly getters and setters in {{Guardrails}}
> # Now that we have the guardrail ready, it’s time to use it. We should search for a place to invoke the {{Guardrails#inCartesianProduct}} guard method. {{MultiCBuilder}} looks like a good candidate for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail is a Threshold, our new test should probably extend {{ThresholdTester}}.
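
The arithmetic behind the proposed guardrail can be sketched as follows. This is a minimal, self-contained illustration and not Cassandra code: the class {{InCartesianProductSketch}}, its methods and the {{Outcome}} enum are hypothetical. It assumes that a threshold is exceeded only when the size is strictly greater than it, and that any negative threshold (such as the default -1) disables the check.

```java
import java.util.List;

/** Hypothetical sketch: sizes the cartesian product of IN terms and checks it against thresholds. */
public class InCartesianProductSketch
{
    enum Outcome { OK, WARN, ABORT }

    /**
     * Multiplies the number of terms in each IN restriction (assumed non-negative),
     * saturating at Long.MAX_VALUE instead of overflowing.
     */
    static long cartesianProductSize(List<Integer> termsPerRestriction)
    {
        long size = 1;
        for (int terms : termsPerRestriction)
        {
            if (terms == 0)
                return 0;
            if (size > Long.MAX_VALUE / terms)
                return Long.MAX_VALUE;
            size *= terms;
        }
        return size;
    }

    /** Assumption: negative thresholds are disabled; the abort check takes precedence over warn. */
    static Outcome check(long size, long warnThreshold, long abortThreshold)
    {
        if (abortThreshold >= 0 && size > abortThreshold)
            return Outcome.ABORT;
        if (warnThreshold >= 0 && size > warnThreshold)
            return Outcome.WARN;
        return Outcome.OK;
    }

    public static void main(String[] args)
    {
        // "a in (1,2,...10) and b in (1,2,...10)" yields 10 * 10 = 100 combinations.
        long size = cartesianProductSize(List.of(10, 10));
        System.out.println(size + " -> " + check(size, 50, 200)); // prints "100 -> WARN"
    }
}
```

Applied to the test queries in the description, nine {{IN}} restrictions of ten terms each give 10^9 combinations, which is why the size is worth checking before the restrictions are expanded.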



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org