Posted to issues@spark.apache.org by "dimtiris kanoute (Jira)" <ji...@apache.org> on 2022/02/24 17:38:00 UTC
[jira] [Created] (SPARK-38319) Implement Strict Mode to prevent QUERY the entire table
dimtiris kanoute created SPARK-38319:
----------------------------------------
Summary: Implement Strict Mode to prevent QUERY the entire table
Key: SPARK-38319
URL: https://issues.apache.org/jira/browse/SPARK-38319
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 3.2.1
Reporter: dimtiris kanoute
We are using Spark Thrift Server as a service to run Spark SQL queries, along with Hive metastore as the metadata service.
We would like to restrict users from querying an entire table: queries should be required to include a {{WHERE}} clause on a partition column (i.e. {{SELECT * FROM table WHERE partition_column=<column_value>}}) *and* a {{LIMIT}} clause whenever {{ORDER BY}} is used.
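To illustrate, assuming a hypothetical table {{events}} partitioned by {{event_date}}, the proposed strict mode would behave roughly like this:

```sql
-- Rejected: no filter on the partition column, so the whole table is scanned
SELECT * FROM events;

-- Rejected: ORDER BY without LIMIT forces a full sort of the result
SELECT * FROM events WHERE event_date = '2022-02-24' ORDER BY user_id;

-- Accepted: partition filter present and ORDER BY is bounded by LIMIT
SELECT * FROM events WHERE event_date = '2022-02-24' ORDER BY user_id LIMIT 100;
```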
This behaviour is similar to what Hive exposes through the configurations
{{hive.strict.checks.no.partition.filter}}
{{hive.strict.checks.orderby.no.limit}}
which are described here:
[https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L1812]
and
[https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L1816]
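For comparison, enabling the analogous checks in a Hive session looks like this:

```sql
-- Reject queries on partitioned tables that lack a partition filter
SET hive.strict.checks.no.partition.filter=true;

-- Reject queries that use ORDER BY without a LIMIT
SET hive.strict.checks.orderby.no.limit=true;
```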
This is a pretty common use case / feature that we see in other tools as well, for example in BigQuery: [https://cloud.google.com/bigquery/docs/querying-partitioned-tables#require_a_partition_filter_in_queries].
It would be nice to have this feature implemented in Spark when Hive support is enabled in a Spark session.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)