Posted to dev@spark.apache.org by masaki rikitoku <ri...@gmail.com> on 2015/02/26 10:31:44 UTC
number of partitions for hive schemaRDD
Hi all
Now I'm trying Spark SQL with HiveContext.
When I execute HQL like the following:
---
val ctx = new org.apache.spark.sql.hive.HiveContext(sc)
import ctx._
val queries = ctx.hql("select keyword from queries where dt = '2015-02-01' limit 10000000")
---
It seems that the number of partitions of queries is set to 1.
Is this expected behavior for SchemaRDD / Spark SQL / HiveContext?
Is there any way to set the number of partitions to an arbitrary value,
other than an explicit repartition?
Masaki Rikitoku
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org
Re: number of partitions for hive schemaRDD
Posted by Cheng Lian <li...@gmail.com>.
Hi Masaki,
I guess what you saw is the partition number of the last stage, which
must be 1 to perform the global phase of LIMIT. To tune partition number
of normal shuffles like joins, you may resort to
spark.sql.shuffle.partitions.
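The two-phase LIMIT Cheng describes can be sketched in plain Scala (no Spark required; the partition contents and the limit value below are made-up illustration data): each partition first applies the limit locally in parallel, and then a single task performs the global phase, which is why the last stage always has exactly one partition.

```scala
object LimitSketch {
  def main(args: Array[String]): Unit = {
    // Three hypothetical partitions of a larger query result.
    val partitions: Seq[Seq[Int]] = Seq(Seq(1, 2, 3, 4), Seq(5, 6), Seq(7, 8, 9))
    val n = 5 // the LIMIT value

    // Local phase: each partition independently keeps at most n rows.
    // In Spark this runs in parallel, one task per partition.
    val localLimited = partitions.map(_.take(n))

    // Global phase: a single task concatenates the partial results and
    // takes the final n rows -- hence the one-partition final stage.
    val globalLimited = localLimited.flatten.take(n)

    println(globalLimited.mkString(","))  // -> 1,2,3,4,5
  }
}
```

By contrast, spark.sql.shuffle.partitions only controls the partition count of shuffle stages such as joins and aggregations; it does not change the single-partition global phase of a LIMIT, so spreading a large LIMIT result back out still requires an explicit repartition.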
Cheng
On 2/26/15 5:31 PM, masaki rikitoku wrote: