Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:18:16 UTC
[jira] [Resolved] (SPARK-20248) Spark SQL add limit parameter to enhance the reliability.
[ https://issues.apache.org/jira/browse/SPARK-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-20248.
----------------------------------
Resolution: Incomplete
> Spark SQL add limit parameter to enhance the reliability.
> ---------------------------------------------------------
>
> Key: SPARK-20248
> URL: https://issues.apache.org/jira/browse/SPARK-20248
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Environment: 2.1.0
> Reporter: shaolinliu
> Priority: Minor
> Labels: bulk-closed
>
> When using the Thrift server, it is difficult to constrain users' SQL statements.
> When a user queries a large table without a limit, the Thrift server process's memory usage grows, which can destabilize the service.
> In general, such a query is a misuse, because if you really need the whole table:
> 1. If you use the data for computation, you can complete the computation in the cluster and return only the result.
> 2. If you want to obtain the raw data, you can store it in HDFS.
> For the above scenario, it is recommended to add a "spark.sql.thriftserver.retainedResults" parameter:
> 1. When it is 0, we do not restrict the user's query.
> 2. When it is greater than 0: if the user's query specifies a LIMIT, we use the user's limit; otherwise, we cap the query's result at this value.
> The user's limit takes priority because a user who writes an explicit LIMIT generally understands the exact meaning of the query.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org