Posted to user@spark.apache.org by Jianshi Huang <ji...@gmail.com> on 2015/03/27 07:22:54 UTC

Add partition support in saveAsParquet

Hi,

Does anyone else have a similar request?

https://issues.apache.org/jira/browse/SPARK-6561

When we save a DataFrame into Parquet files, we also want to have it
partitioned.

The proposed API looks like this:

def saveAsParquet(path: String, partitionColumns: Seq[String])
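
To make the intent concrete, here is a minimal sketch of the directory layout such a call might produce, assuming the Hive-style "column=value" convention that Spark's Parquet partition discovery already reads. The helper `partitionDir`, the object name, and the sample paths and column names are hypothetical and only for illustration; none of this is part of Spark or of the proposal in the JIRA ticket.

```scala
// Hypothetical helper, NOT part of Spark: computes the Hive-style
// directory that one row would land in under a partitioned output.
object PartitionPathSketch {
  def partitionDir(
      basePath: String,
      partitionColumns: Seq[String],
      row: Map[String, Any]): String = {
    // One "column=value" path segment per partition column, in the given order.
    val segments = partitionColumns.map { col =>
      val value = row.getOrElse(col, sys.error(s"missing partition column: ${col}"))
      s"${col}=${value}"
    }
    // Drop a trailing slash on the base path so segments join cleanly.
    (basePath.stripSuffix("/") +: segments).mkString("/")
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample row; "clicks" is not a partition column, so it stays in the file.
    val row = Map[String, Any]("year" -> 2015, "month" -> 3, "clicks" -> 42L)
    println(partitionDir("/data/events", Seq("year", "month"), row))
  }
}
```

With `partitionColumns = Seq("year", "month")`, rows sharing the same year and month would end up in the same directory, so a reader could skip whole directories when filtering on those columns.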



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Re: Add partition support in saveAsParquet

Posted by Michael Armbrust <mi...@databricks.com>.

This is something we are hoping to support in Spark 1.4.  We'll post more
information to JIRA when there is a design.
