Posted to user@spark.apache.org by dubey_a <Ab...@xoriant.com> on 2015/03/02 11:01:32 UTC
Performance tuning in Spark SQL.
What are the ways to tune query performance in Spark SQL?
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Performance-tuning-in-Spark-SQL-tp21871.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: Performance tuning in Spark SQL.
Posted by prosp4300 <pr...@163.com>.
Please see the link below for the available options:
https://spark.apache.org/docs/1.3.1/sql-programming-guide.html#performance-tuning
For example, reducing spark.sql.shuffle.partitions from 200 to 10 could
improve performance significantly when the data set is small.
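As a rough sketch of how that setting could be applied: the helper names below are made up for illustration and only build strings, they never contact a cluster. The SET statement and the spark-submit --conf flag are the usual ways to pass a configuration value. The value 10 is simply the one suggested above; a suitable number depends on your data volume.

```python
# Hypothetical helpers: they only build strings and never contact a cluster.


def set_statement(key, value):
    """Render a Spark SQL SET statement for one configuration key."""
    return "SET {}={}".format(key, value)


def submit_flag(key, value):
    """Render the equivalent --conf flag for spark-submit."""
    return "--conf {}={}".format(key, value)


KEY = "spark.sql.shuffle.partitions"

# 10 is the value suggested in the message above; the default is 200.
print(set_statement(KEY, 10))
print(submit_flag(KEY, 10))
```

The first string can be run as a SQL statement in the session before the query; the second is appended to the spark-submit command line.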
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Performance-tuning-in-Spark-SQL-tp21871p23576.html
RE: Performance tuning in Spark SQL.
Posted by Abhishek Dubey <Ab...@Xoriant.Com>.
Hi,
Thank you for your reply. It is surely going to help.
Regards,
Abhishek Dubey
RE: Performance tuning in Spark SQL.
Posted by "Cheng, Hao" <ha...@intel.com>.
This is quite an open question. From my understanding, there are several ways to tune, such as:
* SQL configurations, for example:
    Configuration Key                       Default Value
    spark.sql.autoBroadcastJoinThreshold    10 * 1024 * 1024
    spark.sql.defaultSizeInBytes            10 * 1024 * 1024 + 1
    spark.sql.planner.externalSort          false
    spark.sql.shuffle.partitions            200
    spark.sql.codegen                       false
* Spark cluster / application configuration (memory, GC, number of cores, etc.)
* Try using cached tables or Parquet files as the storage.
* "EXPLAIN [EXTENDED] query" is your best friend for tuning the SQL itself.
* ...
And a real use-case scenario would probably be more helpful in answering your question.
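To make the list above a little more concrete, here is a small sketch that evaluates the listed defaults to plain numbers and builds the CACHE TABLE and EXPLAIN EXTENDED statements as strings. The table name "events", the sample query, and the helper names are assumptions for illustration only; nothing here connects to Spark.

```python
# Defaults exactly as listed in the message above (Spark 1.3-era values).
DEFAULTS = {
    "spark.sql.autoBroadcastJoinThreshold": 10 * 1024 * 1024,
    "spark.sql.defaultSizeInBytes": 10 * 1024 * 1024 + 1,
    "spark.sql.planner.externalSort": False,
    "spark.sql.shuffle.partitions": 200,
    "spark.sql.codegen": False,
}


def render(value):
    """Format a Python value the way a Spark setting is written."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def explain(query, extended=False):
    """Prefix a query with EXPLAIN or EXPLAIN EXTENDED."""
    prefix = "EXPLAIN EXTENDED " if extended else "EXPLAIN "
    return prefix + query


def cache_table(table):
    """Build a CACHE TABLE statement for the given table name."""
    return "CACHE TABLE " + table


# Print each default as key=value, with byte sizes evaluated.
for key, value in DEFAULTS.items():
    print("{}={}".format(key, render(value)))

# "events" is a hypothetical table used only for this example.
print(cache_table("events"))
print(explain("SELECT COUNT(*) FROM events", extended=True))
```

Evaluating the expressions shows that the broadcast threshold is 10485760 bytes (10 MB) and that spark.sql.defaultSizeInBytes sits one byte above it.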
Re: Performance tuning in Spark SQL.
Posted by Stephen Boesch <ja...@gmail.com>.
You have sent four questions that are very general in nature. They might be
better answered if you googled for those topics: there is a wealth of
materials available.