Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2022/05/17 03:26:42 UTC

Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Hi all,

How about introducing a "Pandas API on Spark" component in JIRA, and using
"PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places,
for example when we write: import pyspark.pandas as ps.
This is similar to the "Structured Streaming" component in JIRA and the "SS"
tag in PR titles.

I think this would make the changes easier to track. Currently it's
a bit difficult to distinguish them from pure PySpark changes.

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks Ruifeng.

I added the "Pandas API on Spark" component to JIRA (and archived the "jenkins"
component since we no longer have the legacy Jenkins).
Let me know if you guys have other opinions.

On Tue, 17 May 2022 at 12:59, Ruifeng Zheng <ru...@foxmail.com> wrote:

> +1, I think it is a good idea
>
>
> ------------------ Original Message ------------------
> *From:* "Hyukjin Kwon" <gu...@gmail.com>;
> *Sent:* Tuesday, May 17, 2022, 11:26 AM
> *To:* "dev"<de...@spark.apache.org>;
> *Cc:* "Yikun Jiang"<yi...@gmail.com>;"Xinrong Meng"<
> xinrong.meng@databricks.com>;"Xiao Li"<xi...@databricks.com>;"Takuya
> Ueshin"<ue...@databricks.com>;"Haejoon Lee"<ha...@databricks.com>;"Ruifeng
> Zheng"<ru...@foxmail.com>;
> *Subject:* Introducing "Pandas API on Spark" component in JIRA, and use "PS"
> PR title component
>
> Hi all,
>
> What about we introduce a component in JIRA "Pandas API on Spark", and use
> "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in many places
> when we: import pyspark.pandas as ps.
> This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>
> I think it'd be easier to track the changes here with that. Currently it's
> a bit difficult to identify it from pure PySpark changes.
>
>

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Ruifeng Zheng <ru...@foxmail.com>.
+1, I think it is a good idea




------------------ Original Message ------------------
From: "Hyukjin Kwon" <gurwls223@gmail.com>;
Sent: Tuesday, May 17, 2022, 11:26 AM
To: "dev"<dev@spark.apache.org>;
Cc: "Yikun Jiang"<yikunkero@gmail.com>;"Xinrong Meng"<xinrong.meng@databricks.com>;"Xiao Li"<xiao.li@databricks.com>;"Takuya Ueshin"<ueshin@databricks.com>;"Haejoon Lee"<haejoon.lee@databricks.com>;"Ruifeng Zheng"<ruifengz@foxmail.com>;
Subject: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component



Hi all,


What about we introduce a component in JIRA "Pandas API on Spark", and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places when we: import pyspark.pandas as ps.
This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.

I think it'd be easier to track the changes here with that. Currently it's a bit difficult to identify it from pure PySpark changes.

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by "L. C. Hsieh" <vi...@gmail.com>.
+1. Thanks Hyukjin.

On Thu, May 19, 2022 at 10:14 AM Bryan Cutler <cu...@gmail.com> wrote:
>
> +1, sounds good
>
> On Wed, May 18, 2022 at 9:16 PM Dongjoon Hyun <do...@gmail.com> wrote:
>>
>> +1
>>
>> Thank you for the suggestion, Hyukjin.
>>
>> Dongjoon.
>>
>> On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com> wrote:
>>>
>>> +1
>>> But can will have PR Title and PR label the same,  PS
>>>
>>> ons. 18. mai 2022 kl. 18:57 skrev Xinrong Meng <xi...@databricks.com.invalid>:
>>>>
>>>> Great!
>>>>
>>>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> Xinrong Meng
>>>>
>>>> Software Engineer
>>>>
>>>> Databricks
>>>>
>>>>
>>>>
>>>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>>>>
>>>>> Sounds good!
>>>>>
>>>>> +1
>>>>>
>>>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>>>> > It's a pretty good idea, +1.
>>>>> >
>>>>> > To be clear in Github:
>>>>> >
>>>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
>>>>> > (*still keep [PYTHON]* and [PS] new added)
>>>>> >
>>>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>>>>> > `CORE`
>>>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>>>> > https://github.com/apache/spark/pull/36574
>>>>> > <https://github.com/apache/spark/pull/36574>
>>>>> >
>>>>> > Right?
>>>>> >
>>>>> > Regards,
>>>>> > Yikun
>>>>> >
>>>>> >
>>>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
>>>>> > <ma...@gmail.com>> wrote:
>>>>> >
>>>>> >     Hi all,
>>>>> >
>>>>> >     What about we introduce a component in JIRA "Pandas API on Spark",
>>>>> >     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
>>>>> >     many places when we: import pyspark.pandas as ps.
>>>>> >     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>>>>> >
>>>>> >     I think it'd be easier to track the changes here with that.
>>>>> >     Currently it's a bit difficult to identify it from pure PySpark changes.
>>>>> >
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Maciej Szymkiewicz
>>>>>
>>>>> Web: https://zero323.net
>>>>> PGP: A30CEF0C31A501EC
>>>
>>>
>>>
>>> --
>>> Bjørn Jørgensen
>>> Vestre Aspehaug 4, 6010 Ålesund
>>> Norge
>>>
>>> +47 480 94 297



Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Bryan Cutler <cu...@gmail.com>.
+1, sounds good

On Wed, May 18, 2022 at 9:16 PM Dongjoon Hyun <do...@gmail.com>
wrote:

> +1
>
> Thank you for the suggestion, Hyukjin.
>
> Dongjoon.
>
> On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com>
> wrote:
>
>> +1
>> But can will have PR Title and PR label the same,  PS
>>
>> ons. 18. mai 2022 kl. 18:57 skrev Xinrong Meng
>> <xi...@databricks.com.invalid>:
>>
>>> Great!
>>>
>>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>>
>>> Thanks!
>>>
>>>
>>> Xinrong Meng
>>>
>>> Software Engineer
>>>
>>> Databricks
>>>
>>>
>>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>>
>>>> Sounds good!
>>>>
>>>> +1
>>>>
>>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>>> > It's a pretty good idea, +1.
>>>> >
>>>> > To be clear in Github:
>>>> >
>>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
>>>> > (*still keep [PYTHON]* and [PS] new added)
>>>> >
>>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>>>> > `CORE`
>>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>>> > https://github.com/apache/spark/pull/36574
>>>> > <https://github.com/apache/spark/pull/36574>
>>>> >
>>>> > Right?
>>>> >
>>>> > Regards,
>>>> > Yikun
>>>> >
>>>> >
>>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
>>>> > <ma...@gmail.com>> wrote:
>>>> >
>>>> >     Hi all,
>>>> >
>>>> >     What about we introduce a component in JIRA "Pandas API on Spark",
>>>> >     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
>>>> >     many places when we: import pyspark.pandas as ps.
>>>> >     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>>>> >
>>>> >     I think it'd be easier to track the changes here with that.
>>>> >     Currently it's a bit difficult to identify it from pure PySpark changes.
>>>> >
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Maciej Szymkiewicz
>>>>
>>>> Web: https://zero323.net
>>>> PGP: A30CEF0C31A501EC
>>>>
>>>
>>
>> --
>> Bjørn Jørgensen
>> Vestre Aspehaug 4, 6010 Ålesund
>> Norge
>>
>> +47 480 94 297
>>
>

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Dongjoon Hyun <do...@gmail.com>.
+1

Thank you for the suggestion, Hyukjin.

Dongjoon.

On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com>
wrote:

> +1
> But can will have PR Title and PR label the same,  PS
>
> ons. 18. mai 2022 kl. 18:57 skrev Xinrong Meng
> <xi...@databricks.com.invalid>:
>
>> Great!
>>
>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>
>> Thanks!
>>
>>
>> Xinrong Meng
>>
>> Software Engineer
>>
>> Databricks
>>
>>
>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>
>>> Sounds good!
>>>
>>> +1
>>>
>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>> > It's a pretty good idea, +1.
>>> >
>>> > To be clear in Github:
>>> >
>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
>>> > (*still keep [PYTHON]* and [PS] new added)
>>> >
>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>>> > `CORE`
>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>> > https://github.com/apache/spark/pull/36574
>>> > <https://github.com/apache/spark/pull/36574>
>>> >
>>> > Right?
>>> >
>>> > Regards,
>>> > Yikun
>>> >
>>> >
>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
>>> > <ma...@gmail.com>> wrote:
>>> >
>>> >     Hi all,
>>> >
>>> >     What about we introduce a component in JIRA "Pandas API on Spark",
>>> >     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
>>> >     many places when we: import pyspark.pandas as ps.
>>> >     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>>> >
>>> >     I think it'd be easier to track the changes here with that.
>>> >     Currently it's a bit difficult to identify it from pure PySpark changes.
>>> >
>>>
>>>
>>> --
>>> Best regards,
>>> Maciej Szymkiewicz
>>>
>>> Web: https://zero323.net
>>> PGP: A30CEF0C31A501EC
>>>
>>
>
> --
> Bjørn Jørgensen
> Vestre Aspehaug 4, 6010 Ålesund
> Norge
>
> +47 480 94 297
>

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Bjørn Jørgensen <bj...@gmail.com>.
+1
But can we have the PR title and the PR label be the same, PS?

ons. 18. mai 2022 kl. 18:57 skrev Xinrong Meng
<xi...@databricks.com.invalid>:

> Great!
>
> It saves us from always specifying "Pandas API on Spark" in PR titles.
>
> Thanks!
>
>
> Xinrong Meng
>
> Software Engineer
>
> Databricks
>
>
> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>
>> Sounds good!
>>
>> +1
>>
>> On 5/17/22 06:08, Yikun Jiang wrote:
>> > It's a pretty good idea, +1.
>> >
>> > To be clear in Github:
>> >
>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
>> > (*still keep [PYTHON]* and [PS] new added)
>> >
>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>> > `CORE`
>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>> > https://github.com/apache/spark/pull/36574
>> > <https://github.com/apache/spark/pull/36574>
>> >
>> > Right?
>> >
>> > Regards,
>> > Yikun
>> >
>> >
>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
>> > <ma...@gmail.com>> wrote:
>> >
>> >     Hi all,
>> >
>> >     What about we introduce a component in JIRA "Pandas API on Spark",
>> >     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
>> >     many places when we: import pyspark.pandas as ps.
>> >     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>> >
>> >     I think it'd be easier to track the changes here with that.
>> >     Currently it's a bit difficult to identify it from pure PySpark changes.
>> >
>>
>>
>> --
>> Best regards,
>> Maciej Szymkiewicz
>>
>> Web: https://zero323.net
>> PGP: A30CEF0C31A501EC
>>
>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Xinrong Meng <xi...@databricks.com.INVALID>.
Great!

It saves us from always specifying "Pandas API on Spark" in PR titles.

Thanks!


Xinrong Meng

Software Engineer

Databricks


On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:

> Sounds good!
>
> +1
>
> On 5/17/22 06:08, Yikun Jiang wrote:
> > It's a pretty good idea, +1.
> >
> > To be clear in Github:
> >
> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
> > (*still keep [PYTHON]* and [PS] new added)
> >
> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
> > `CORE`
> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
> > https://github.com/apache/spark/pull/36574
> > <https://github.com/apache/spark/pull/36574>
> >
> > Right?
> >
> > Regards,
> > Yikun
> >
> >
> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
> > <ma...@gmail.com>> wrote:
> >
> >     Hi all,
> >
> >     What about we introduce a component in JIRA "Pandas API on Spark",
> >     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
> >     many places when we: import pyspark.pandas as ps.
> >     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
> >
> >     I think it'd be easier to track the changes here with that.
> >     Currently it's a bit difficult to identify it from pure PySpark changes.
> >
>
>
> --
> Best regards,
> Maciej Szymkiewicz
>
> Web: https://zero323.net
> PGP: A30CEF0C31A501EC
>

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Maciej <ms...@gmail.com>.
Sounds good!

+1

On 5/17/22 06:08, Yikun Jiang wrote:
> It's a pretty good idea, +1.
> 
> To be clear in Github:
> 
> - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
> (*still keep [PYTHON]* and [PS] new added)
> 
> - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
> `CORE`
> (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
> https://github.com/apache/spark/pull/36574
> <https://github.com/apache/spark/pull/36574>
> 
> Right?
> 
> Regards,
> Yikun
> 
> 
> On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com
> <ma...@gmail.com>> wrote:
> 
>     Hi all,
> 
>     What about we introduce a component in JIRA "Pandas API on Spark",
>     and use "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in
>     many places when we: import pyspark.pandas as ps.
>     This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
> 
>     I think it'd be easier to track the changes here with that.
>     Currently it's a bit difficult to identify it from pure PySpark changes.
> 


-- 
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
PGP: A30CEF0C31A501EC

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

Posted by Yikun Jiang <yi...@gmail.com>.
It's a pretty good idea, +1.

To be clear, on GitHub:

- For each PR title: [SPARK-XXX][PYTHON][PS] The pandas-on-Spark PR title
(*still keep [PYTHON]*, with [PS] newly added)

- For PR labels: newly added: `PANDAS API ON SPARK`; still keep: `PYTHON`,
`CORE`
(*still keep `PYTHON`, `CORE`*, with `PANDAS API ON SPARK` newly added)
https://github.com/apache/spark/pull/36574

Right?

Regards,
Yikun


On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> What about we introduce a component in JIRA "Pandas API on Spark", and use
> "PS"  (pandas-on-Spark) in PR titles? We already use "ps" in many places
> when we: import pyspark.pandas as ps.
> This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>
> I think it'd be easier to track the changes here with that. Currently it's
> a bit difficult to identify it from pure PySpark changes.
>
>
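
For illustration only, the title convention discussed in this thread
([SPARK-XXX][PYTHON][PS] followed by a summary) could be checked with a small
script along these lines. This is a minimal sketch, not part of Spark's
tooling: the helper names parse_pr_title and is_pandas_on_spark, the regular
expression, and the sample JIRA number in the usage note are all assumptions
made up for this example.

```python
import re

# Hypothetical pattern (an assumption, not Spark's actual tooling):
# a leading [SPARK-<number>] tag, zero or more [COMPONENT] tags, then a summary.
TITLE_RE = re.compile(r"^\[(SPARK-\d+)\]((?:\[[A-Z0-9 -]+\])*)\s*(.*)$")


def parse_pr_title(title):
    """Split a PR title into JIRA id, component tags, and summary.

    Returns None when the title does not start with a [SPARK-<number>] tag.
    """
    match = TITLE_RE.match(title)
    if match is None:
        return None
    jira_id, tag_block, summary = match.groups()
    # Pull each component name out of its square brackets.
    components = re.findall(r"\[([^\]]+)\]", tag_block)
    return {"jira": jira_id, "components": components, "summary": summary}


def is_pandas_on_spark(title):
    """True when the title carries the [PS] component tag."""
    parsed = parse_pr_title(title)
    return parsed is not None and "PS" in parsed["components"]
```

With a made-up title such as "[SPARK-12345][PYTHON][PS] Add foo", the sketch
would report the components PYTHON and PS, so pandas-on-Spark changes can be
filtered out from pure PySpark ones by the tag alone.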