Posted to user@spark.apache.org by Raffael Marty <ra...@pixlcloud.com> on 2014/07/07 07:00:06 UTC

SparkSQL - Partitioned Parquet

Does SparkSQL support partitioned parquet tables? How do I save to a partitioned parquet file from within Python?

 table.saveAsParquetFile("table.parquet")

This call doesn’t seem to support a partition argument. Or does my SchemaRDD have to be set up a specific way?
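
For reference, a minimal sketch of the Spark 1.0-era PySpark flow that call sits in (the sample data and app name are made up). saveAsParquetFile() takes only an output path, so there is indeed no partitioning parameter:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="parquet-save")
    sqlContext = SQLContext(sc)

    # Build a SchemaRDD from an RDD of dicts (1.0-era schema inference).
    rows = sc.parallelize([{"name": "alice", "year": 2014},
                           {"name": "bob",   "year": 2013}])
    table = sqlContext.inferSchema(rows)

    # Writes one unpartitioned Parquet output; the only argument is the path.
    table.saveAsParquetFile("table.parquet")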

Re: SparkSQL - Partitioned Parquet

Posted by Michael Armbrust <mi...@databricks.com>.
The only partitioning that is currently supported is through Hive
partitioned tables.  Supporting this for parquet as well is on our radar,
but probably won't happen for 1.1.
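
To make that route concrete, here is a minimal, untested sketch of going through a Hive partitioned table from Python; the table names ("events", "staging_events") are made up. Note that STORED AS PARQUET assumes a Hive version with native Parquet support (0.13+); with older Hive versions (e.g. the 0.12 that Spark builds against) you would have to spell out the Parquet SerDe and input/output formats instead:

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="hive-partitioned-parquet")
    hiveCtx = HiveContext(sc)

    # Expose a SchemaRDD to SQL under a made-up name.
    staging = hiveCtx.inferSchema(
        sc.parallelize([{"name": "alice"}, {"name": "bob"}]))
    staging.registerAsTable("staging_events")

    # A Hive table partitioned by year and backed by Parquet.
    # STORED AS PARQUET needs Hive 0.13+; older Hive versions need
    # the explicit Parquet SerDe instead.
    hiveCtx.hql("""
        CREATE TABLE IF NOT EXISTS events (name STRING)
        PARTITIONED BY (year INT)
        STORED AS PARQUET
    """)

    # Each partition written this way becomes its own directory on disk.
    hiveCtx.hql("""
        INSERT OVERWRITE TABLE events PARTITION (year = 2014)
        SELECT name FROM staging_events
    """)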


On Sun, Jul 6, 2014 at 10:00 PM, Raffael Marty <ra...@pixlcloud.com> wrote:

> Does SparkSQL support partitioned parquet tables? How do I save to a
> partitioned parquet file from within Python?
>
>  table.saveAsParquetFile("table.parquet")
>
> This call doesn’t seem to support a partition argument. Or does my
> SchemaRDD have to be set up a specific way?
>