Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2022/05/17 03:26:42 UTC
Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Hi all,
How about we introduce a "Pandas API on Spark" component in JIRA, and use
"PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places,
as in: import pyspark.pandas as ps.
This is similar to "Structured Streaming" in JIRA and "SS" in PR titles.
I think that would make these changes easier to track. Currently it's
a bit difficult to tell them apart from pure PySpark changes.
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks Ruifeng.
I added the "Pandas API on Spark" component to JIRA (and archived the "jenkins"
component, since we no longer have the legacy Jenkins).
Let me know if you guys have other opinions.
On Tue, 17 May 2022 at 12:59, Ruifeng Zheng <ru...@foxmail.com> wrote:
> +1, I think it is a good idea
>
>
> ------------------ Original Message ------------------
> *From:* "Hyukjin Kwon" <gu...@gmail.com>;
> *Sent:* Tuesday, May 17, 2022, 11:26 AM
> *To:* "dev"<de...@spark.apache.org>;
> *Cc:* "Yikun Jiang"<yi...@gmail.com>;"Xinrong Meng"<
> xinrong.meng@databricks.com>;"Xiao Li"<xi...@databricks.com>;"Takuya
> Ueshin"<ue...@databricks.com>;"Haejoon Lee"<ha...@databricks.com>;"Ruifeng
> Zheng"<ru...@foxmail.com>;
> *Subject:* Introducing "Pandas API on Spark" component in JIRA, and use "PS"
> PR title component
>
> Hi all,
>
> What about we introduce a component in JIRA "Pandas API on Spark", and use
> "PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places
> when we: import pyspark.pandas as ps.
> This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>
> I think it'd be easier to track the changes here with that. Currently it's
> a bit difficult to identify it from pure PySpark changes.
>
>
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Ruifeng Zheng <ru...@foxmail.com>.
+1, I think it is a good idea
------------------ Original Message ------------------
From: "Hyukjin Kwon" <gurwls223@gmail.com>;
Sent: Tuesday, May 17, 2022, 11:26 AM
To: "dev"<dev@spark.apache.org>;
Cc: "Yikun Jiang"<yikunkero@gmail.com>;"Xinrong Meng"<xinrong.meng@databricks.com>;"Xiao Li"<xiao.li@databricks.com>;"Takuya Ueshin"<ueshin@databricks.com>;"Haejoon Lee"<haejoon.lee@databricks.com>;"Ruifeng Zheng"<ruifengz@foxmail.com>;
Subject: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Hi all,
What about we introduce a component in JIRA "Pandas API on Spark", and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places when we: import pyspark.pandas as ps.
This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
I think it'd be easier to track the changes here with that. Currently it's a bit difficult to identify it from pure PySpark changes.
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by "L. C. Hsieh" <vi...@gmail.com>.
+1. Thanks Hyukjin.
On Thu, May 19, 2022 at 10:14 AM Bryan Cutler <cu...@gmail.com> wrote:
>
> +1, sounds good
>
> On Wed, May 18, 2022 at 9:16 PM Dongjoon Hyun <do...@gmail.com> wrote:
>>
>> +1
>>
>> Thank you for the suggestion, Hyukjin.
>>
>> Dongjoon.
>>
>> On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com> wrote:
>>>
>>> +1
>>> But can we have the PR title and PR label be the same, PS?
>>>
>>> On Wed, 18 May 2022 at 18:57, Xinrong Meng <xi...@databricks.com.invalid> wrote:
>>>>
>>>> Great!
>>>>
>>>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> Xinrong Meng
>>>>
>>>> Software Engineer
>>>>
>>>> Databricks
>>>>
>>>>
>>>>
>>>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>>>>
>>>>> Sounds good!
>>>>>
>>>>> +1
>>>>>
>>>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>>>> > It's a pretty good idea, +1.
>>>>> >
>>>>> > To be clear in Github:
>>>>> >
>>>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
>>>>> > (*still keep [PYTHON]* and [PS] new added)
>>>>> >
>>>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>>>>> > `CORE`
>>>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>>>> > https://github.com/apache/spark/pull/36574
>>>>> >
>>>>> > Right?
>>>>> >
>>>>> > Regards,
>>>>> > Yikun
>>>>> >
>>>>> >
>>>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>>>> >
>>>>> > Hi all,
>>>>> >
>>>>> > What about we introduce a component in JIRA "Pandas API on Spark",
>>>>> > and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in
>>>>> > many places when we: import pyspark.pandas as ps.
>>>>> > This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>>>>> >
>>>>> > I think it'd be easier to track the changes here with that.
>>>>> > Currently it's a bit difficult to identify it from pure PySpark changes.
>>>>> >
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Maciej Szymkiewicz
>>>>>
>>>>> Web: https://zero323.net
>>>>> PGP: A30CEF0C31A501EC
>>>
>>>
>>>
>>> --
>>> Bjørn Jørgensen
>>> Vestre Aspehaug 4, 6010 Ålesund
>>> Norge
>>>
>>> +47 480 94 297
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Bryan Cutler <cu...@gmail.com>.
+1, sounds good
On Wed, May 18, 2022 at 9:16 PM Dongjoon Hyun <do...@gmail.com>
wrote:
> +1
>
> Thank you for the suggestion, Hyukjin.
>
> Dongjoon.
>
> On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com>
> wrote:
>
>> +1
>> But can we have the PR title and PR label be the same, PS?
>>
>> On Wed, 18 May 2022 at 18:57, Xinrong Meng
>> <xi...@databricks.com.invalid> wrote:
>>
>>> Great!
>>>
>>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>>
>>> Thanks!
>>>
>>>
>>> Xinrong Meng
>>>
>>> Software Engineer
>>>
>>> Databricks
>>>
>>>
>>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>>
>>>> Sounds good!
>>>>
>>>> +1
>>>>
>>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>>> > It's a pretty good idea, +1.
>>>> >
>>>> > To be clear in Github:
>>>> >
>>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr
>>>> title
>>>> > (*still keep [PYTHON]* and [PS] new added)
>>>> >
>>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep:
>>>> `PYTHON`,
>>>> > `CORE`
>>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>>> > https://github.com/apache/spark/pull/36574
>>>> >
>>>> > Right?
>>>> >
>>>> > Regards,
>>>> > Yikun
>>>> >
>>>> >
>>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>>> >
>>>> > Hi all,
>>>> >
>>>> > What about we introduce a component in JIRA "Pandas API on Spark",
>>>> > and use "PS" (pandas-on-Spark) in PR titles? We already use "ps"
>>>> in
>>>> > many places when we: import pyspark.pandas as ps.
>>>> > This is similar to "Structured Streaming" in JIRA, and "SS" in PR
>>>> title.
>>>> >
>>>> > I think it'd be easier to track the changes here with that.
>>>> > Currently it's a bit difficult to identify it from pure PySpark
>>>> changes.
>>>> >
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Maciej Szymkiewicz
>>>>
>>>> Web: https://zero323.net
>>>> PGP: A30CEF0C31A501EC
>>>>
>>>
>>
>> --
>> Bjørn Jørgensen
>> Vestre Aspehaug 4, 6010 Ålesund
>> Norge
>>
>> +47 480 94 297
>>
>
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Dongjoon Hyun <do...@gmail.com>.
+1
Thank you for the suggestion, Hyukjin.
Dongjoon.
On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen <bj...@gmail.com>
wrote:
> +1
> But can we have the PR title and PR label be the same, PS?
>
> On Wed, 18 May 2022 at 18:57, Xinrong Meng
> <xi...@databricks.com.invalid> wrote:
>
>> Great!
>>
>> It saves us from always specifying "Pandas API on Spark" in PR titles.
>>
>> Thanks!
>>
>>
>> Xinrong Meng
>>
>> Software Engineer
>>
>> Databricks
>>
>>
>> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>>
>>> Sounds good!
>>>
>>> +1
>>>
>>> On 5/17/22 06:08, Yikun Jiang wrote:
>>> > It's a pretty good idea, +1.
>>> >
>>> > To be clear in Github:
>>> >
>>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr
>>> title
>>> > (*still keep [PYTHON]* and [PS] new added)
>>> >
>>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>>> > `CORE`
>>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>>> > https://github.com/apache/spark/pull/36574
>>> >
>>> > Right?
>>> >
>>> > Regards,
>>> > Yikun
>>> >
>>> >
>>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > What about we introduce a component in JIRA "Pandas API on Spark",
>>> > and use "PS" (pandas-on-Spark) in PR titles? We already use "ps"
>>> in
>>> > many places when we: import pyspark.pandas as ps.
>>> > This is similar to "Structured Streaming" in JIRA, and "SS" in PR
>>> title.
>>> >
>>> > I think it'd be easier to track the changes here with that.
>>> > Currently it's a bit difficult to identify it from pure PySpark
>>> changes.
>>> >
>>>
>>>
>>> --
>>> Best regards,
>>> Maciej Szymkiewicz
>>>
>>> Web: https://zero323.net
>>> PGP: A30CEF0C31A501EC
>>>
>>
>
> --
> Bjørn Jørgensen
> Vestre Aspehaug 4, 6010 Ålesund
> Norge
>
> +47 480 94 297
>
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Bjørn Jørgensen <bj...@gmail.com>.
+1
But can we have the PR title and PR label be the same, PS?
On Wed, 18 May 2022 at 18:57, Xinrong Meng
<xi...@databricks.com.invalid> wrote:
> Great!
>
> It saves us from always specifying "Pandas API on Spark" in PR titles.
>
> Thanks!
>
>
> Xinrong Meng
>
> Software Engineer
>
> Databricks
>
>
> On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
>
>> Sounds good!
>>
>> +1
>>
>> On 5/17/22 06:08, Yikun Jiang wrote:
>> > It's a pretty good idea, +1.
>> >
>> > To be clear in Github:
>> >
>> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr
>> title
>> > (*still keep [PYTHON]* and [PS] new added)
>> >
>> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
>> > `CORE`
>> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
>> > https://github.com/apache/spark/pull/36574
>> >
>> > Right?
>> >
>> > Regards,
>> > Yikun
>> >
>> >
>> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > What about we introduce a component in JIRA "Pandas API on Spark",
>> > and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in
>> > many places when we: import pyspark.pandas as ps.
>> > This is similar to "Structured Streaming" in JIRA, and "SS" in PR
>> title.
>> >
>> > I think it'd be easier to track the changes here with that.
>> > Currently it's a bit difficult to identify it from pure PySpark
>> changes.
>> >
>>
>>
>> --
>> Best regards,
>> Maciej Szymkiewicz
>>
>> Web: https://zero323.net
>> PGP: A30CEF0C31A501EC
>>
>
--
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge
+47 480 94 297
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Xinrong Meng <xi...@databricks.com.INVALID>.
Great!
It saves us from always specifying "Pandas API on Spark" in PR titles.
Thanks!
Xinrong Meng
Software Engineer
Databricks
On Tue, May 17, 2022 at 1:08 AM Maciej <ms...@gmail.com> wrote:
> Sounds good!
>
> +1
>
> On 5/17/22 06:08, Yikun Jiang wrote:
> > It's a pretty good idea, +1.
> >
> > To be clear in Github:
> >
> > - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
> > (*still keep [PYTHON]* and [PS] new added)
> >
> > - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
> > `CORE`
> > (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
> > https://github.com/apache/spark/pull/36574
> >
> > Right?
> >
> > Regards,
> > Yikun
> >
> >
> > On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
> >
> > Hi all,
> >
> > What about we introduce a component in JIRA "Pandas API on Spark",
> > and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in
> > many places when we: import pyspark.pandas as ps.
> > This is similar to "Structured Streaming" in JIRA, and "SS" in PR
> title.
> >
> > I think it'd be easier to track the changes here with that.
> > Currently it's a bit difficult to identify it from pure PySpark
> changes.
> >
>
>
> --
> Best regards,
> Maciej Szymkiewicz
>
> Web: https://zero323.net
> PGP: A30CEF0C31A501EC
>
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Maciej <ms...@gmail.com>.
Sounds good!
+1
On 5/17/22 06:08, Yikun Jiang wrote:
> It's a pretty good idea, +1.
>
> To be clear in Github:
>
> - For each PR Title: [SPARK-XXX][PYTHON][PS] The Pandas on spark pr title
> (*still keep [PYTHON]* and [PS] new added)
>
> - For PR label: new added: `PANDAS API ON Spark`, still keep: `PYTHON`,
> `CORE`
> (*still keep `PYTHON`, `CORE`* and `PANDAS API ON SPARK` new added)
> https://github.com/apache/spark/pull/36574
>
> Right?
>
> Regards,
> Yikun
>
>
> On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>
> Hi all,
>
> What about we introduce a component in JIRA "Pandas API on Spark",
> and use "PS" (pandas-on-Spark) in PR titles? We already use "ps" in
> many places when we: import pyspark.pandas as ps.
> This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>
> I think it'd be easier to track the changes here with that.
> Currently it's a bit difficult to identify it from pure PySpark changes.
>
--
Best regards,
Maciej Szymkiewicz
Web: https://zero323.net
PGP: A30CEF0C31A501EC
Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component
Posted by Yikun Jiang <yi...@gmail.com>.
It's a pretty good idea, +1.
To be clear, on GitHub:
- For each PR title: [SPARK-XXX][PYTHON][PS] The pandas-on-Spark PR title
(*still keep [PYTHON]*, with [PS] newly added)
- For PR labels: newly added: `PANDAS API ON SPARK`; still keep: `PYTHON`,
`CORE`
(*still keep `PYTHON`, `CORE`*, with `PANDAS API ON SPARK` newly added)
https://github.com/apache/spark/pull/36574
Right?
Regards,
Yikun
On Tue, May 17, 2022 at 11:26 AM Hyukjin Kwon <gu...@gmail.com> wrote:
> Hi all,
>
> What about we introduce a component in JIRA "Pandas API on Spark", and use
> "PS" (pandas-on-Spark) in PR titles? We already use "ps" in many places
> when we: import pyspark.pandas as ps.
> This is similar to "Structured Streaming" in JIRA, and "SS" in PR title.
>
> I think it'd be easier to track the changes here with that. Currently it's
> a bit difficult to identify it from pure PySpark changes.
>
>
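As an illustration of the title convention Yikun spells out above
([SPARK-XXX][PYTHON][PS]), here is a minimal Python sketch that splits a PR
title into its JIRA id, component tags, and summary, and checks for the PS
tag. The helper names, the regular expression, and the sample title are
assumptions made for this example; they are not part of Spark's tooling or
of this thread.

```python
import re

# Hypothetical pattern: a leading [SPARK-<digits>] id, then zero or more
# bracketed component tags, then the free-text summary.
TITLE_RE = re.compile(r"^\[(SPARK-\d+)\]((?:\[[^\]]+\])*)\s*(.*)$")


def parse_pr_title(title):
    """Return (jira_id, tags, summary), or None if the title has no JIRA id."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    jira_id, tag_block, summary = match.groups()
    # Pull each tag out of the run of "[TAG]" groups, in order.
    tags = re.findall(r"\[([^\]]+)\]", tag_block)
    return jira_id, tags, summary


def is_pandas_on_spark(title):
    """True when the title carries the PS component tag."""
    parsed = parse_pr_title(title)
    return parsed is not None and "PS" in parsed[1]


if __name__ == "__main__":
    # SPARK-12345 is a placeholder id, not a real ticket reference.
    print(parse_pr_title("[SPARK-12345][PYTHON][PS] Example summary"))
```

A filter like is_pandas_on_spark could, for instance, be used to separate
pandas-on-Spark changes from pure PySpark ones in a list of PR titles, which
is the tracking problem the proposal is meant to ease.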