Posted to user@spark.apache.org by ctang <ct...@gmail.com> on 2017/07/19 22:13:16 UTC

How to insert a dataframe as a static partition to a partitioned table

I wonder if there is an easy way (or an API) to insert a DataFrame (or
Dataset) that does not contain the partition column into a partitioned
table as a static partition. For example, a Dataset with columns (col1,
col2) would be inserted into a table (col1, col2) partitioned by column
partcol, as a static partition with partspec (partcol=1).

Thanks



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-insert-a-dataframe-as-a-static-partition-to-a-partitioned-table-tp28882.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: How to insert a dataframe as a static partition to a partitioned table

Posted by Chaoyu Tang <ct...@gmail.com>.
Thanks, Vadim. But I am looking for an API in Dataset, DataFrame,
DataFrameWriter, etc. The approach you suggested can be done with a query like
spark.sql(""" ALTER TABLE `table` ADD PARTITION (partcol=1) LOCATION
'/path/to/your/dataset' """), but it requires writing the dataset to the
specified location first.
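
For reference, the two-step approach described above (write the data to a location, then register that location as a partition) might look roughly like the sketch below. It is only an illustration under some assumptions: `spark` is an existing SparkSession, `df` is the DataFrame without the partition column, the table is Hive-compatible, and the output format (`parquet` here) matches the table's storage format. The helper names `add_partition_sql` and `write_and_add_partition` are made up for this example and are not part of any Spark API.

```python
def add_partition_sql(table, partition_spec, location):
    """Build the ALTER TABLE ... ADD PARTITION statement as a string."""
    # repr() renders 1 as 1 and 'us' as 'us'; adequate for ints and
    # simple strings only (no escaping is attempted).
    spec = ", ".join(f"{name}={value!r}" for name, value in partition_spec.items())
    return f"ALTER TABLE `{table}` ADD PARTITION ({spec}) LOCATION '{location}'"


def write_and_add_partition(spark, df, table, partition_spec, location, fmt="parquet"):
    """Write df to `location`, then attach that location as a static partition."""
    # Step 1: write the data files (assumes fmt matches the table's format).
    df.write.mode("overwrite").format(fmt).save(location)
    # Step 2: register the location as a partition of the table.
    spark.sql(add_partition_sql(table, partition_spec, location))
```

With the names from this thread, the call would be something like `write_and_add_partition(spark, df, "table", {"partcol": 1}, "/path/to/your/dataset")`.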



On Thu, Jul 20, 2017 at 10:56 AM, Vadim Semenov <vadim.semenov@datadoghq.com> wrote:

> This should work:
> ```
> ALTER TABLE `table` ADD PARTITION (partcol=1) LOCATION
> '/path/to/your/dataset'
> ```
>
> On Wed, Jul 19, 2017 at 6:13 PM, ctang <ct...@gmail.com> wrote:
>
>> I wonder if there are any easy ways (or APIs) to insert a dataframe (or
>> DataSet), which does not contain the partition columns, as a static
>> partition to the table. For example,
>> The DataSet with columns (col1, col2) will be inserted into a table (col1,
>> col2) partitioned by column partcol as a static partition with partspec
>> (partcol =1).
>>
>> Thanks

Re: How to insert a dataframe as a static partition to a partitioned table

Posted by Vadim Semenov <va...@datadoghq.com>.
This should work:
```
ALTER TABLE `table` ADD PARTITION (partcol=1) LOCATION
'/path/to/your/dataset'
```

On Wed, Jul 19, 2017 at 6:13 PM, ctang <ct...@gmail.com> wrote:

> I wonder if there are any easy ways (or APIs) to insert a dataframe (or
> DataSet), which does not contain the partition columns, as a static
> partition to the table. For example,
> The DataSet with columns (col1, col2) will be inserted into a table (col1,
> col2) partitioned by column partcol as a static partition with partspec
> (partcol =1).
>
> Thanks

Re: How to insert a dataframe as a static partition to a partitioned table

Posted by Ryan <ry...@gmail.com>.
Not sure about the writer API, but you could always register a temp table
for that DataFrame and execute an INSERT statement in HiveQL.
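
A rough sketch of that idea, in case it helps. It assumes `spark` is an existing SparkSession and `df` is the DataFrame with only the non-partition columns (col1, col2). The view name `staging_view` and the helper names are hypothetical, chosen just for this example.

```python
def static_insert_sql(table, view, partition_spec, columns):
    """Build an INSERT INTO ... PARTITION (...) SELECT statement as a string."""
    # repr() renders 1 as 1 and 'us' as 'us'; adequate for ints and
    # simple strings only (no escaping is attempted).
    spec = ", ".join(f"{name}={value!r}" for name, value in partition_spec.items())
    cols = ", ".join(columns)
    return f"INSERT INTO TABLE {table} PARTITION ({spec}) SELECT {cols} FROM {view}"


def insert_as_static_partition(spark, df, table, partition_spec, view="staging_view"):
    """Register df as a temp view and insert it into one static partition."""
    # The view only needs to live for the duration of the insert.
    df.createOrReplaceTempView(view)
    spark.sql(static_insert_sql(table, view, partition_spec, df.columns))
```

For the example in the question, `insert_as_static_partition(spark, df, "table", {"partcol": 1})` would issue `INSERT INTO TABLE table PARTITION (partcol=1) SELECT col1, col2 FROM staging_view`. The partition value lives in the PARTITION clause, so the DataFrame does not need a partcol column.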

On Thu, Jul 20, 2017 at 6:13 AM, ctang <ct...@gmail.com> wrote:

> I wonder if there are any easy ways (or APIs) to insert a dataframe (or
> DataSet), which does not contain the partition columns, as a static
> partition to the table. For example,
> The DataSet with columns (col1, col2) will be inserted into a table (col1,
> col2) partitioned by column partcol as a static partition with partspec
> (partcol =1).
>
> Thanks