Posted to dev@spark.apache.org by aakash aakash <em...@gmail.com> on 2019/12/17 18:42:15 UTC

how to get partition column info in Data Source V2 writer

Hi Spark dev folks,

First of all, kudos on the new Data Source V2: the API looks simple, and it
makes it easy to develop a new data source and use it.

In my current work, I am trying to implement a new Data Source V2 writer
with Spark 2.3, and I was wondering how to get the partition-by columns.
I see that they are passed to Data Source V1 from DataFrameWriter, but not
to V2.


Thanks,
Aakash

Re: how to get partition column info in Data Source V2 writer

Posted by aakash aakash <em...@gmail.com>.
Thanks Wenchen!

On Wed, Dec 18, 2019 at 7:25 PM Wenchen Fan <cl...@gmail.com> wrote:

> Hi Aakash,
>
> You can try the latest DS v2 with the 3.0 preview; the API is in fairly
> stable shape now. With the latest API, a Writer is created from a
> Table, and the Table has the partitioning information.
>
> Thanks,
> Wenchen

Re: how to get partition column info in Data Source V2 writer

Posted by Wenchen Fan <cl...@gmail.com>.
Hi Aakash,

You can try the latest DS v2 with the 3.0 preview; the API is in fairly
stable shape now. With the latest API, a Writer is created from a
Table, and the Table has the partitioning information.
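
To make that relationship concrete, here is a minimal sketch in plain Java.
It deliberately does not use the Spark classes: SketchTable, SketchWriter
and EventsTable are hypothetical stand-ins, and the column names are made
up. As far as I understand the 3.0 API, the real counterparts are
org.apache.spark.sql.connector.catalog.Table, whose partitioning() returns
Transform[], and SupportsWrite, whose newWriteBuilder(...) creates the
writer, so please check the 3.0 Javadoc before relying on the details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for the Spark 3.0 connector interfaces. These are NOT
// the real Spark classes, only a model of how the pieces relate.
public class PartitionedWriteSketch {

    /** Stand-in for a DS v2 Table: it knows its own partition columns. */
    interface SketchTable {
        String name();
        List<String> partitioning();
        SketchWriter newWriter();
    }

    /** Stand-in for the writer that is obtained from the table. */
    interface SketchWriter {
        String describe();
    }

    /** Hypothetical table that hands its partitioning to the writer it creates. */
    static final class EventsTable implements SketchTable {
        private final List<String> partitionColumns;

        EventsTable(List<String> partitionColumns) {
            // Defensive copy so the table's partitioning cannot change underneath the writer.
            this.partitionColumns =
                Collections.unmodifiableList(new ArrayList<>(partitionColumns));
        }

        @Override
        public String name() {
            return "events";
        }

        @Override
        public List<String> partitioning() {
            return partitionColumns;
        }

        @Override
        public SketchWriter newWriter() {
            // The writer is created by the table, so the table can pass along
            // whatever partitioning information the writer needs.
            final List<String> cols = partitionColumns;
            return () -> "writing " + name() + " partitioned by " + String.join(", ", cols);
        }
    }

    public static void main(String[] args) {
        SketchTable table = new EventsTable(Arrays.asList("year", "month"));
        System.out.println(table.newWriter().describe());
    }
}
```

The point of the sketch is only the direction of the dependency: the writer
no longer needs the partition columns pushed into it from DataFrameWriter,
because it is built by a table that already carries them.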

Thanks,
Wenchen

On Wed, Dec 18, 2019 at 3:22 AM aakash aakash <em...@gmail.com>
wrote:

> Thanks Andrew!
>
> It seems there is a drastic change in 3.0; I am going through it.
>
> -Aakash

Re: how to get partition column info in Data Source V2 writer

Posted by aakash aakash <em...@gmail.com>.
Thanks Andrew!

It seems there is a drastic change in 3.0; I am going through it.

-Aakash

On Tue, Dec 17, 2019 at 11:01 AM Andrew Melo <an...@gmail.com> wrote:

> Hi Aakash
>
> Not directly related to your Q, but just so you're aware, the DSv2 API
> evolved from 2.3->2.4 and then again for 2.4->3.0.
>
> Cheers
> Andrew

Re: how to get partition column info in Data Source V2 writer

Posted by Andrew Melo <an...@gmail.com>.
Hi Aakash

On Tue, Dec 17, 2019 at 12:42 PM aakash aakash <em...@gmail.com>
wrote:

> Hi Spark dev folks,
>
> First of all, kudos on the new Data Source V2: the API looks simple, and it
> makes it easy to develop a new data source and use it.
>
> In my current work, I am trying to implement a new Data Source V2 writer
> with Spark 2.3, and I was wondering how to get the partition-by columns.
> I see that they are passed to Data Source V1 from DataFrameWriter, but not
> to V2.
>

Not directly related to your Q, but just so you're aware, the DSv2 API
evolved from 2.3->2.4 and then again for 2.4->3.0.

Cheers
Andrew

