Posted to user@spark.apache.org by Guillermo Ortiz <ko...@gmail.com> on 2016/02/25 11:42:25 UTC

Number partitions after a join

When you do a join in Spark, how many partitions does the result have? Is
there a default number if you don't specify the number of partitions?

Re: Number partitions after a join

Posted by Guillermo Ortiz <ko...@gmail.com>.
Good to know, thanks everybody!


RE: Number partitions after a join

Posted by JOAQUIN GUANTER GONZALBEZ <jo...@telefonica.com>.
Actually, that only applies to Spark SQL. I believe that with plain RDDs, the resulting join will have as many partitions as the input RDD with the most partitions.

Cheers,
Ximo
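
The rule described above can be sketched as a small pure-Python model. This is not Spark code: the function name and its parameters are hypothetical, and the precedence shown (an explicit partition count first, then `spark.default.parallelism` if it is set, then the largest input) is an assumption based on how the Scala RDD API's default partitioner is generally understood to behave. It ignores the case where an input RDD already carries a partitioner, and other language bindings or Spark versions may behave differently.

```python
from typing import Optional, Sequence


def expected_join_partitions(
    input_partitions: Sequence[int],
    default_parallelism: Optional[int] = None,
    explicit_partitions: Optional[int] = None,
) -> int:
    """Hypothetical helper: estimate the partition count of a plain RDD join."""
    if not input_partitions:
        raise ValueError("need at least one input RDD")
    # A partition count passed to the join call itself takes precedence.
    if explicit_partitions is not None:
        return explicit_partitions
    # Assumption: if spark.default.parallelism is set, it is used instead
    # of the sizes of the input RDDs.
    if default_parallelism is not None:
        return default_parallelism
    # Otherwise the result follows the input RDD with the most partitions.
    return max(input_partitions)


if __name__ == "__main__":
    print(expected_join_partitions([4, 10]))                          # 10
    print(expected_join_partitions([4, 10], default_parallelism=8))   # 8
    print(expected_join_partitions([4, 10], explicit_partitions=3))   # 3
```

In practice the safest check is to call `getNumPartitions()` on the joined RDD, since the actual value depends on the Spark version and configuration.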


Re: Number partitions after a join

Posted by Guillermo Ortiz <ko...@gmail.com>.
Thank you, I didn't see that option.


Re: Number partitions after a join

Posted by Takeshi Yamamuro <li...@gmail.com>.
Hi,

The number depends on `spark.sql.shuffle.partitions`.
See:
http://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options



-- 
---
Takeshi Yamamuro
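
As a rough illustration of the setting mentioned above, the fragment below shows one way it might be set in `spark-defaults.conf`. The value 64 is an arbitrary example, not a recommendation; if the property is left unset, Spark SQL uses its documented default of 200.

```
# spark-defaults.conf (illustrative value only)
spark.sql.shuffle.partitions   64
```

The same property can also be passed on the command line with `--conf spark.sql.shuffle.partitions=64`. Note that it only governs joins that actually shuffle: a broadcast join does not shuffle the larger side, and newer Spark versions with adaptive query execution enabled may coalesce shuffle partitions, so the observed count can differ from the configured value.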