Posted to user@spark.apache.org by Nilesh Chakraborty <ni...@nileshc.com> on 2016/06/03 16:09:42 UTC

Custom positioning/partitioning Dataframes

Hi,

I have a domain-specific schema (RDF data with vertical partitioning, i.e.
one table per property) and I want to instruct Spark SQL to keep semantically
related property tables close together, that is, to group DataFrames onto
the same nodes (or at least encourage this somehow) so that the tables most
frequently joined together are physically co-located.

Any thoughts on how I can do this with Spark? Any internal hack ideas are
welcome too. :)
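
For concreteness, the closest thing I know of with the plain DataFrame API
is something like the following (table names, paths, and column positions
below are made up). Co-partitioning both sides on the join key avoids a
shuffle at join time, but it does not let me pin related tables to
particular nodes, which is what I am really after:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.col

    val sc = new SparkContext(new SparkConf().setAppName("rdf-vp"))
    val sqlContext = new SQLContext(sc)

    // Hypothetical property tables, one per RDF predicate, keyed by subject.
    val foafKnows = sqlContext.read.parquet("/data/vp/foaf_knows")
    val foafName  = sqlContext.read.parquet("/data/vp/foaf_name")

    // Hash-partitioning both tables on the join key gives them identical
    // partitioning, so the subject-subject join below needs no extra shuffle.
    val knowsBySubject = foafKnows.repartition(200, col("subject"))
    val nameBySubject  = foafName.repartition(200, col("subject"))

    val joined = knowsBySubject.join(nameBySubject, "subject")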

Cheers,
Nilesh



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Custom-positioning-partitioning-Dataframes-tp27084.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Custom positioning/partitioning Dataframes

Posted by Takeshi Yamamuro <li...@gmail.com>.
Hi,

I'm afraid Spark has no explicit API to set a custom partitioner on a
DataFrame for now.
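
As a workaround you can drop down to the RDD API, where partitionBy()
accepts a custom Partitioner, and rebuild the DataFrame on top. A rough
sketch (DomainPartitioner is a made-up placeholder for your own grouping
rule, and the path and key column position are assumptions):

    import org.apache.spark.Partitioner

    // Made-up example: route subjects to partitions with your own rule so
    // that rows you expect to join end up in the same partition number.
    class DomainPartitioner(partitions: Int) extends Partitioner {
      def numPartitions: Int = partitions
      def getPartition(key: Any): Int = {
        val mod = key.hashCode % partitions
        if (mod < 0) mod + partitions else mod  // non-negative partition id
      }
    }

    val df = sqlContext.read.parquet("/data/vp/foaf_knows")

    // DataFrame -> pair RDD keyed by subject, repartitioned our way.
    val partitioned = df.rdd
      .map(r => (r.getString(0), r))
      .partitionBy(new DomainPartitioner(200))
      .values

    // Rebuild a DataFrame. The physical layout survives, but catalyst
    // doesn't know about it, so later joins may still reshuffle.
    val df2 = sqlContext.createDataFrame(partitioned, df.schema)

Note this only controls which partition a row lands in, not which node
hosts that partition; placement is still up to the scheduler.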

// maropu



-- 
---
Takeshi Yamamuro