Posted to dev@spark.apache.org by 郑瑞峰 <ru...@foxmail.com> on 2020/06/24 23:28:33 UTC

Re: [DISCUSS] Apache Spark 3.0.1 Release

I volunteer to be a release manager of 3.0.1, if nobody is working on this.




------------------ Original Message ------------------
From: "Gengliang Wang"<gengliang.wang@databricks.com>;
Sent: Wednesday, June 24, 2020, 4:15 PM
To: "Hyukjin Kwon"<gurwls223@gmail.com>;
Cc: "Dongjoon Hyun"<dongjoon.hyun@gmail.com>;"Jungtaek Lim"<kabhwan.opensource@gmail.com>;"Jules Damji"<dmatrix@comcast.net>;"Holden Karau"<holden@pigscanfly.ca>;"Reynold Xin"<rxin@databricks.com>;"Shivaram Venkataraman"<shivaram@eecs.berkeley.edu>;"Yuanjian Li"<xyliyuanjian@gmail.com>;"Spark dev list"<dev@spark.apache.org>;"Takeshi Yamamuro"<linguin.m.s@gmail.com>;
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release



+1, the issues mentioned are really serious.


On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:

+1.

Just as a note,
- SPARK-31918 is fixed now, and there's no blocker.
- When we build SparkR, we should use the latest R version, at least 4.0.0.

On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:

+1


Bests,
Dongjoon.


On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <kabhwan.opensource@gmail.com> wrote:

+1 on a 3.0.1 soon.


It would be nice if some Scala experts could take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix in 3.0.1 if possible.
It looks like APIs designed to work with Scala 2.11 & Java become ambiguous in Scala 2.12 & Java.


On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dmatrix@comcast.net> wrote:

+1 (non-binding)
Sent from my iPhone
Pardon the dumb thumb typos :)


On Jun 23, 2020, at 11:36 AM, Holden Karau <holden@pigscanfly.ca> wrote:


+1 on a patch release soon


On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rxin@databricks.com> wrote:


+1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.





On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <shivaram@eecs.berkeley.edu> wrote:

+1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon. 

 Shivaram 

 On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <linguin.m.s@gmail.com> wrote:

 Thanks for the heads-up, Yuanjian! 

 I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. 

 wow, the updates are so quick. Anyway, +1 for the release. 

 Bests, 
 Takeshi 

 On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xyliyuanjian@gmail.com> wrote:

 Hi dev-list, 

 I’m writing to open a discussion about the feasibility of Spark 3.0.1, since 4 blocker issues were found after Spark 3.0.0:

 [SPARK-31990] Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by an older Spark version.

 [SPARK-32038] A regression in handling NaN values in COUNT(DISTINCT)

 [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. The issue makes the 3.0 release unavailable on CRAN, and SparkR only supports R [3.5, 4.0)

 [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression 

 I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the critical fixes. 

 Any comments are appreciated. 

 Best, 

 Yuanjian 

 -- 
 --- 
 Takeshi Yamamuro 

 --------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscribe@spark.apache.org

-- 
Twitter: https://twitter.com/holdenkarau

Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Mridul Muralidharan <mr...@gmail.com>.
I agree, that would be a new feature, and unless there is a compelling reason
(like security concerns), it would not qualify.

Regards,
Mridul

On Wed, Jul 15, 2020 at 11:46 AM Wenchen Fan <cl...@gmail.com> wrote:

> Supporting Python 3.8.0 sounds like a new feature, and doesn't qualify for a
> backport. But I'm open to other opinions.
>
> On Wed, Jul 15, 2020 at 11:24 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Any chance that SPARK-29536 PySpark does not work with Python 3.8.0
>> can be backported to 2.4.7 ?
>> This was not done for Spark 2.4.6 because it was too late on the vote
>> process but it makes perfect sense to have this in 2.4.7.
>>
>> On Wed, Jul 15, 2020 at 9:07 AM Wenchen Fan <cl...@gmail.com> wrote:
>> >
>> > Yea I think 2.4.7 is good to go. Let's start!
>> >
>> > On Wed, Jul 15, 2020 at 1:50 PM Prashant Sharma <sc...@gmail.com>
>> wrote:
>> >>
>> >> Hi Folks,
>> >>
>> >> So, I am back, and searched the JIRAS with target version as "2.4.7"
>> and Resolved, found only 2 jiras. So, are we good to go, with just a couple
>> of jiras fixed ? Shall I proceed with making a RC?
>> >>
>> >> Thanks,
>> >> Prashant
>> >>
>> >> On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma <sc...@gmail.com>
>> wrote:
>> >>>
>> >>> Thank you, Holden.
>> >>>
>> >>> Folks, My health has gone down a bit. So, I will start working on
>> this in a few days. If this needs to be published sooner, then maybe
>> someone else has to help out.
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>> >>>>
>> >>>> I’m happy to have Prashant do 2.4.7 :)
>> >>>>
>> >>>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com>
>> wrote:
>> >>>>>
>> >>>>> +1 on releasing both 3.0.1 and 2.4.7
>> >>>>>
>> >>>>> Great! Three committers volunteer to be a release manager. Ruifeng,
>> Prashant and Holden. Holden just helped release Spark 2.4.6. This time,
>> maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7
>> respectively.
>> >>>>>
>> >>>>> Xiao
>> >>>>>
>> >>>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>
>> >>>>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
>> yesterday, and if the report is valid it looks to be a blocker. I'll try to
>> take a look sooner.
>> >>>>>>
>> >>>>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>> >>>>>>>
>> >>>>>>> Thanks Holden -- it would be great to also get 2.4.7 started
>> >>>>>>>
>> >>>>>>> Thanks
>> >>>>>>> Shivaram
>> >>>>>>>
>> >>>>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <
>> holden@pigscanfly.ca> wrote:
>> >>>>>>> >
>> >>>>>>> > I can take care of 2.4.7 unless someone else wants to do it.
>> >>>>>>> >
>> >>>>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>> Jason.Moore@quantium.com.au> wrote:
>> >>>>>>> >>
>> >>>>>>> >> Hi all,
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Could I get some input on the severity of this one that I
>> found yesterday?  If that’s a correctness issue, should it block this
>> patch?  Let me know under the ticket if there’s more info that I can
>> provide to help.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Thanks,
>> >>>>>>> >>
>> >>>>>>> >> Jason.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> From: Jungtaek Lim <ka...@gmail.com>
>> >>>>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>> >>>>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>> >>>>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <
>> ruifengz@foxmail.com>, Gengliang Wang <ge...@databricks.com>,
>> gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>,
>> Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
>> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>,
>> "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
>> linguin.m.s@gmail.com>
>> >>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> SPARK-32130 [1] looks to be a performance regression
>> introduced in Spark 3.0.0, which is ideal to look into before releasing
>> another bugfix version.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>> >>>>>>> >>
>> >>>>>>> >> Hi all
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> I just wanted to ping this thread to see if all the
>> outstanding blockers for 3.0.1 have been fixed. If so, it would be great if
>> we can get the release going. The CRAN team sent us a note that the version
>> SparkR available on CRAN for the current R version (4.0.2) is broken and
>> hence we need to update the package soon --  it will be great to do it with
>> 3.0.1.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Thanks
>> >>>>>>> >>
>> >>>>>>> >> Shivaram
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
>> scrapcodes@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 for 3.0.1 release.
>> >>>>>>> >>
>> >>>>>>> >> I too can help out as release manager.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com>
>> wrote:
>> >>>>>>> >>
>> >>>>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is
>> working on this.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> ------------------ Original Message ------------------
>> >>>>>>> >>
>> >>>>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>> >>>>>>> >>
>> >>>>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>> >>>>>>> >>
>> >>>>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>> >>>>>>> >>
>> >>>>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>> Yamamuro"<li...@gmail.com>;
>> >>>>>>> >>
>> >>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> +1, the issues mentioned are really serious.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <
>> gurwls223@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1.
>> >>>>>>> >>
>> >>>>>>> >> Just as a note,
>> >>>>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>> >>>>>>> >> - When we build SparkR, we should use the latest R version, at least 4.0.0.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <
>> dongjoon.hyun@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Bests,
>> >>>>>>> >>
>> >>>>>>> >> Dongjoon.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 on a 3.0.1 soon.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Probably it would be nice if some Scala experts can take a
>> look at https://issues.apache.org/jira/browse/SPARK-32051 and include
>> the fix into 3.0.1 if possible.
>> >>>>>>> >>
>> >>>>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>> ambiguity in Scala 2.12 & Java.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <
>> dmatrix@comcast.net> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 (non-binding)
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> Sent from my iPhone
>> >>>>>>> >>
>> >>>>>>> >> Pardon the dumb thumb typos :)
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <
>> holden@pigscanfly.ca> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 on a patch release soon
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <
>> rxin@databricks.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 on doing a new patch release soon. I saw some of these
>> issues when preparing the 3.0 release, and some of them are very serious.
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>> >>>>>>> >>
>> >>>>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>> release soon.
>> >>>>>>> >>
>> >>>>>>> >> Shivaram
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>> linguin.m.s@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> Thanks for the heads-up, Yuanjian!
>> >>>>>>> >>
>> >>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark
>> 3.0.0.
>> >>>>>>> >>
>> >>>>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>> >>>>>>> >>
>> >>>>>>> >> Bests,
>> >>>>>>> >> Takeshi
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <
>> xyliyuanjian@gmail.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> Hi dev-list,
>> >>>>>>> >>
>> >>>>>>> >> I’m writing this to raise the discussion about Spark 3.0.1
>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>> >>>>>>> >>
>> >>>>>>> >> [SPARK-31990] The state store compatibility broken will cause
>> a correctness issue when Streaming query with `dropDuplicate` uses the
>> checkpoint written by the old Spark version.
>> >>>>>>> >>
>> >>>>>>> >> [SPARK-32038] The regression bug in handling NaN values in
>> COUNT(DISTINCT)
>> >>>>>>> >>
>> >>>>>>> >> [SPARK-31918][WIP] CRAN requires to make it working with the
>> latest R 4.0. It makes the 3.0 release unavailable on CRAN, and only
>> supports R [3.5, 4.0)
>> >>>>>>> >>
>> >>>>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>> regression
>> >>>>>>> >>
>> >>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark
>> 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the
>> critical fixes.
>> >>>>>>> >>
>> >>>>>>> >> Any comments are appreciated.
>> >>>>>>> >>
>> >>>>>>> >> Best,
>> >>>>>>> >>
>> >>>>>>> >> Yuanjian
>> >>>>>>> >>
>> >>>>>>> >> --
>> >>>>>>> >> ---
>> >>>>>>> >> Takeshi Yamamuro
>> >>>>>>> >>
>> >>>>>>> >>
>> --------------------------------------------------------------------- To
>> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >>
>> >>>>>>> >> --
>> >>>>>>> >>
>> >>>>>>> >> Twitter: https://twitter.com/holdenkarau
>> >>>>>>> >>
>> >>>>>>> >> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> >>>>>>> >>
>> >>>>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>> >>>>>>> >
>> >>>>>>> > --
>> >>>>>>> > Twitter: https://twitter.com/holdenkarau
>> >>>>>>> > Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> >>>>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>
>> >>>> --
>> >>>> Twitter: https://twitter.com/holdenkarau
>> >>>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> >>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Wenchen Fan <cl...@gmail.com>.
Supporting Python 3.8.0 sounds like a new feature, and doesn't qualify for a
backport. But I'm open to other opinions.

On Wed, Jul 15, 2020 at 11:24 PM Ismaël Mejía <ie...@gmail.com> wrote:

> Any chance that SPARK-29536 PySpark does not work with Python 3.8.0
> can be backported to 2.4.7 ?
> This was not done for Spark 2.4.6 because it was too late on the vote
> process but it makes perfect sense to have this in 2.4.7.
>
> On Wed, Jul 15, 2020 at 9:07 AM Wenchen Fan <cl...@gmail.com> wrote:
> >
> > Yea I think 2.4.7 is good to go. Let's start!
> >
> > On Wed, Jul 15, 2020 at 1:50 PM Prashant Sharma <sc...@gmail.com>
> wrote:
> >>
> >> Hi Folks,
> >>
> >> So, I am back, and searched the JIRAS with target version as "2.4.7"
> and Resolved, found only 2 jiras. So, are we good to go, with just a couple
> of jiras fixed ? Shall I proceed with making a RC?
> >>
> >> Thanks,
> >> Prashant
> >>
> >> On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma <sc...@gmail.com>
> wrote:
> >>>
> >>> Thank you, Holden.
> >>>
> >>> Folks, My health has gone down a bit. So, I will start working on this
> in a few days. If this needs to be published sooner, then maybe someone
> else has to help out.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca>
> wrote:
> >>>>
> >>>> I’m happy to have Prashant do 2.4.7 :)
> >>>>
> >>>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:
> >>>>>
> >>>>> +1 on releasing both 3.0.1 and 2.4.7
> >>>>>
> >>>>> Great! Three committers volunteer to be a release manager. Ruifeng,
> Prashant and Holden. Holden just helped release Spark 2.4.6. This time,
> maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7
> respectively.
> >>>>>
> >>>>> Xiao
> >>>>>
> >>>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>>>>>
> >>>>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
> yesterday, and if the report is valid it looks to be a blocker. I'll try to
> take a look sooner.
> >>>>>>
> >>>>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
> >>>>>>>
> >>>>>>> Thanks Holden -- it would be great to also get 2.4.7 started
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Shivaram
> >>>>>>>
> >>>>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <
> holden@pigscanfly.ca> wrote:
> >>>>>>> >
> >>>>>>> > I can take care of 2.4.7 unless someone else wants to do it.
> >>>>>>> >
> >>>>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
> Jason.Moore@quantium.com.au> wrote:
> >>>>>>> >>
> >>>>>>> >> Hi all,
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Could I get some input on the severity of this one that I found
> yesterday?  If that’s a correctness issue, should it block this patch?  Let
> me know under the ticket if there’s more info that I can provide to help.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Thanks,
> >>>>>>> >>
> >>>>>>> >> Jason.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> From: Jungtaek Lim <ka...@gmail.com>
> >>>>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
> >>>>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
> >>>>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <
> ruifengz@foxmail.com>, Gengliang Wang <ge...@databricks.com>,
> gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>,
> Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "
> dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
> linguin.m.s@gmail.com>
> >>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> SPARK-32130 [1] looks to be a performance regression introduced
> in Spark 3.0.0, which is ideal to look into before releasing another bugfix
> version.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
> >>>>>>> >>
> >>>>>>> >> Hi all
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> I just wanted to ping this thread to see if all the outstanding
> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
> the release going. The CRAN team sent us a note that the version SparkR
> available on CRAN for the current R version (4.0.2) is broken and hence we
> need to update the package soon --  it will be great to do it with 3.0.1.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Thanks
> >>>>>>> >>
> >>>>>>> >> Shivaram
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
> scrapcodes@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 for 3.0.1 release.
> >>>>>>> >>
> >>>>>>> >> I too can help out as release manager.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com>
> wrote:
> >>>>>>> >>
> >>>>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is
> working on this.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> ------------------ Original Message ------------------
> >>>>>>> >>
> >>>>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
> >>>>>>> >>
> >>>>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
> >>>>>>> >>
> >>>>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
> >>>>>>> >>
> >>>>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
> Yamamuro"<li...@gmail.com>;
> >>>>>>> >>
> >>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> +1, the issues mentioned are really serious.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <
> gurwls223@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> +1.
> >>>>>>> >>
> >>>>>>> >> Just as a note,
> >>>>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
> >>>>>>> >> - When we build SparkR, we should use the latest R version, at least 4.0.0.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <
> dongjoon.hyun@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> +1
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Bests,
> >>>>>>> >>
> >>>>>>> >> Dongjoon.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 on a 3.0.1 soon.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Probably it would be nice if some Scala experts can take a look
> at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
> into 3.0.1 if possible.
> >>>>>>> >>
> >>>>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
> ambiguity in Scala 2.12 & Java.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <
> dmatrix@comcast.net> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 (non-binding)
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> Sent from my iPhone
> >>>>>>> >>
> >>>>>>> >> Pardon the dumb thumb typos :)
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <
> holden@pigscanfly.ca> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 on a patch release soon
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <
> rxin@databricks.com> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 on doing a new patch release soon. I saw some of these
> issues when preparing the 3.0 release, and some of them are very serious.
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
> >>>>>>> >>
> >>>>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
> release soon.
> >>>>>>> >>
> >>>>>>> >> Shivaram
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
> linguin.m.s@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> Thanks for the heads-up, Yuanjian!
> >>>>>>> >>
> >>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark
> 3.0.0.
> >>>>>>> >>
> >>>>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
> >>>>>>> >>
> >>>>>>> >> Bests,
> >>>>>>> >> Takeshi
> >>>>>>> >>
> >>>>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <
> xyliyuanjian@gmail.com> wrote:
> >>>>>>> >>
> >>>>>>> >> Hi dev-list,
> >>>>>>> >>
> >>>>>>> >> I’m writing this to raise the discussion about Spark 3.0.1
> feasibility since 4 blocker issues were found after Spark 3.0.0:
> >>>>>>> >>
> >>>>>>> >> [SPARK-31990] The state store compatibility broken will cause a
> correctness issue when Streaming query with `dropDuplicate` uses the
> checkpoint written by the old Spark version.
> >>>>>>> >>
> >>>>>>> >> [SPARK-32038] The regression bug in handling NaN values in
> COUNT(DISTINCT)
> >>>>>>> >>
> >>>>>>> >> [SPARK-31918][WIP] CRAN requires to make it working with the
> latest R 4.0. It makes the 3.0 release unavailable on CRAN, and only
> supports R [3.5, 4.0)
> >>>>>>> >>
> >>>>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
> regression
> >>>>>>> >>
> >>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark
> 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the
> critical fixes.
> >>>>>>> >>
> >>>>>>> >> Any comments are appreciated.
> >>>>>>> >>
> >>>>>>> >> Best,
> >>>>>>> >>
> >>>>>>> >> Yuanjian
> >>>>>>> >>
> >>>>>>> >> --
> >>>>>>> >> ---
> >>>>>>> >> Takeshi Yamamuro
> >>>>>>> >>
> >>>>>>> >>
> --------------------------------------------------------------------- To
> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >>
> >>>>>>> >> --
> >>>>>>> >>
> >>>>>>> >> Twitter: https://twitter.com/holdenkarau
> >>>>>>> >>
> >>>>>>> >> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> >>>>>>> >>
> >>>>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> >>>>>>> >
> >>>>>>> > --
> >>>>>>> > Twitter: https://twitter.com/holdenkarau
> >>>>>>> > Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> >>>>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>
> >>>> --
> >>>> Twitter: https://twitter.com/holdenkarau
> >>>> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> >>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Ismaël Mejía <ie...@gmail.com>.
Any chance that the fix for SPARK-29536 (PySpark does not work with Python
3.8.0) can be backported to 2.4.7?
This was not done for Spark 2.4.6 because it came too late in the vote
process, but it makes perfect sense to have this in 2.4.7.

On Wed, Jul 15, 2020 at 9:07 AM Wenchen Fan <cl...@gmail.com> wrote:
>
> Yea I think 2.4.7 is good to go. Let's start!
>
> On Wed, Jul 15, 2020 at 1:50 PM Prashant Sharma <sc...@gmail.com> wrote:
>>
>> Hi Folks,
>>
>> So, I am back. I searched JIRA for issues with target version "2.4.7" and status Resolved, and found only 2. So, are we good to go with just a couple of JIRAs fixed? Shall I proceed with making an RC?
>>
>> Thanks,
>> Prashant
>>
>> On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma <sc...@gmail.com> wrote:
>>>
>>> Thank you, Holden.
>>>
>>> Folks, My health has gone down a bit. So, I will start working on this in a few days. If this needs to be published sooner, then maybe someone else has to help out.
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca> wrote:
>>>>
>>>> I’m happy to have Prashant do 2.4.7 :)
>>>>
>>>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:
>>>>>
>>>>> +1 on releasing both 3.0.1 and 2.4.7
>>>>>
>>>>> Great! Three committers volunteer to be a release manager. Ruifeng, Prashant and Holden. Holden just helped release Spark 2.4.6. This time, maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7 respectively.
>>>>>
>>>>> Xiao
>>>>>
>>>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <ka...@gmail.com> wrote:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported yesterday, and if the report is valid it looks to be a blocker. I'll try to take a look sooner.
>>>>>>
>>>>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
>>>>>>>
>>>>>>> Thanks Holden -- it would be great to also get 2.4.7 started
>>>>>>>
>>>>>>> Thanks
>>>>>>> Shivaram
>>>>>>>
>>>>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca> wrote:
>>>>>>> >
>>>>>>> > I can take care of 2.4.7 unless someone else wants to do it.
>>>>>>> >
>>>>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <Ja...@quantium.com.au> wrote:
>>>>>>> >>
>>>>>>> >> Hi all,
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Could I get some input on the severity of this one that I found yesterday?  If that’s a correctness issue, should it block this patch?  Let me know under the ticket if there’s more info that I can provide to help.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Thanks,
>>>>>>> >>
>>>>>>> >> Jason.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> From: Jungtaek Lim <ka...@gmail.com>
>>>>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>>>>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>>>>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>, Gengliang Wang <ge...@databricks.com>, gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>, Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <li...@gmail.com>
>>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which is ideal to look into before releasing another bugfix version.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
>>>>>>> >>
>>>>>>> >> Hi all
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> I just wanted to ping this thread to see if all the outstanding blockers for 3.0.1 have been fixed. If so, it would be great if we can get the release going. The CRAN team sent us a note that the version SparkR available on CRAN for the current R version (4.0.2) is broken and hence we need to update the package soon --  it will be great to do it with 3.0.1.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Thanks
>>>>>>> >>
>>>>>>> >> Shivaram
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> +1 for 3.0.1 release.
>>>>>>> >>
>>>>>>> >> I too can help out as release manager.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>>>>>> >>
>>>>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is working on this.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> ------------------ Original Message ------------------
>>>>>>> >>
>>>>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>>>>>>> >>
>>>>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>>>>>>> >>
>>>>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>>>>>> >>
>>>>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<ka...@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<xy...@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi Yamamuro"<li...@gmail.com>;
>>>>>>> >>
>>>>>>> >> 主题: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> +1, the issues mentioned are really serious.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> +1.
>>>>>>> >>
>>>>>>> >> Just as a note,
>>>>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>>>>>>> >> - When we build SparkR, we should use the latest R version at least 4.0.0+.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> +1
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Bests,
>>>>>>> >>
>>>>>>> >> Dongjoon.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <ka...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> +1 on a 3.0.1 soon.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Probably it would be nice if some Scala experts can take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix into 3.0.1 if possible.
>>>>>>> >>
>>>>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in Scala 2.12 & Java.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net> wrote:
>>>>>>> >>
>>>>>>> >> +1 (non-binding)
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Sent from my iPhone
>>>>>>> >>
>>>>>>> >> Pardon the dumb thumb typos :)
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca> wrote:
>>>>>>> >>
>>>>>>> >> +1 on a patch release soon
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com> wrote:
>>>>>>> >>
>>>>>>> >> +1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
>>>>>>> >>
>>>>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.
>>>>>>> >>
>>>>>>> >> Shivaram
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> Thanks for the heads-up, Yuanjian!
>>>>>>> >>
>>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>>> >>
>>>>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>>> >>
>>>>>>> >> Bests,
>>>>>>> >> Takeshi
>>>>>>> >>
>>>>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> Hi dev-list,
>>>>>>> >>
>>>>>>> >> I’m writing this to raise the discussion about Spark 3.0.1 feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>>> >>
>>>>>>> >> [SPARK-31990] Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by an older Spark version.
>>>>>>> >>
>>>>>>> >> [SPARK-32038] A regression in handling NaN values in COUNT(DISTINCT)
>>>>>>> >>
>>>>>>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. The 3.0 release supports only R [3.5, 4.0), which makes it unavailable on CRAN.
>>>>>>> >>
>>>>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>>>>>>> >>
>>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the critical fixes.
>>>>>>> >>
>>>>>>> >> Any comments are appreciated.
>>>>>>> >>
>>>>>>> >> Best,
>>>>>>> >>
>>>>>>> >> Yuanjian
>>>>>>> >>
>>>>>>> >> --
>>>>>>> >> ---
>>>>>>> >> Takeshi Yamamuro
>>>>>>> >>
>>>>>>> >> --------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> --
>>>>>>> >>
>>>>>>> >> Twitter: https://twitter.com/holdenkarau
>>>>>>> >>
>>>>>>> >> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>>> >>
>>>>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>> >
>>>>>>> > --
>>>>>>> > Twitter: https://twitter.com/holdenkarau
>>>>>>> > Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Wenchen Fan <cl...@gmail.com>.
Yea I think 2.4.7 is good to go. Let's start!

On Wed, Jul 15, 2020 at 1:50 PM Prashant Sharma <sc...@gmail.com>
wrote:

> Hi Folks,
>
> So, I am back. I searched JIRA for issues with target version "2.4.7" and
> status Resolved, and found only 2. So, are we good to go with just a couple
> of JIRAs fixed? Shall I proceed with making an RC?
>
> Thanks,
> Prashant
>
> On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma <sc...@gmail.com>
> wrote:
>
>> Thank you, Holden.
>>
>> Folks, My health has gone down a bit. So, I will start working on this in
>> a few days. If this needs to be published sooner, then maybe someone else
>> has to help out.
>>
>>
>>
>>
>>
>> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> I’m happy to have Prashant do 2.4.7 :)
>>>
>>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:
>>>
>>>> +1 on releasing both 3.0.1 and 2.4.7
>>>>
>>>> Great! Three committers volunteer to be a release manager. Ruifeng,
>>>> Prashant and Holden. Holden just helped release Spark 2.4.6. This time,
>>>> maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7
>>>> respectively.
>>>>
>>>> Xiao
>>>>
>>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>>
>>>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
>>>>> yesterday, and if the report is valid it looks to be a blocker. I'll try to
>>>>> take a look sooner.
>>>>>
>>>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>
>>>>>> Thanks Holden -- it would be great to also get 2.4.7 started
>>>>>>
>>>>>> Thanks
>>>>>> Shivaram
>>>>>>
>>>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
>>>>>> wrote:
>>>>>> >
>>>>>> > I can take care of 2.4.7 unless someone else wants to do it.
>>>>>> >
>>>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>>>>>> Jason.Moore@quantium.com.au> wrote:
>>>>>> >>
>>>>>> >> Hi all,
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Could I get some input on the severity of this one that I found
>>>>>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>>>>>> me know under the ticket if there’s more info that I can provide to help.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >>
>>>>>> >> Jason.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> From: Jungtaek Lim <ka...@gmail.com>
>>>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>>>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>>>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <
>>>>>> ruifengz@foxmail.com>, Gengliang Wang <ge...@databricks.com>,
>>>>>> gurwls223 <gu...@gmail.com>, Dongjoon Hyun <
>>>>>> dongjoon.hyun@gmail.com>, Jules Damji <dm...@comcast.net>, Holden
>>>>>> Karau <ho...@pigscanfly.ca>, Reynold Xin <rx...@databricks.com>,
>>>>>> Yuanjian Li <xy...@gmail.com>, "dev@spark.apache.org" <
>>>>>> dev@spark.apache.org>, Takeshi Yamamuro <li...@gmail.com>
>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> SPARK-32130 [1] looks to be a performance regression introduced in
>>>>>> Spark 3.0.0, which is ideal to look into before releasing another bugfix
>>>>>> version.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>> >>
>>>>>> >> Hi all
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> I just wanted to ping this thread to see if all the outstanding
>>>>>> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
>>>>>> the release going. The CRAN team sent us a note that the version of SparkR
>>>>>> available on CRAN for the current R version (4.0.2) is broken and hence we
>>>>>> need to update the package soon --  it will be great to do it with 3.0.1.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Thanks
>>>>>> >>
>>>>>> >> Shivaram
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
>>>>>> scrapcodes@gmail.com> wrote:
>>>>>> >>
>>>>>> >> +1 for 3.0.1 release.
>>>>>> >>
>>>>>> >> I too can help out as release manager.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>>>>> >>
>>>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is working
>>>>>> on this.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> ------------------ Original Message ------------------
>>>>>> >>
>>>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>>>>>> >>
>>>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>>>>>> >>
>>>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>>>>> >>
>>>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>>>>>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>>>>>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>>>>>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>>>>>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>>>>>> Yamamuro"<li...@gmail.com>;
>>>>>> >>
>>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> +1, the issues mentioned are really serious.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> +1.
>>>>>> >>
>>>>>> >> Just as a note,
>>>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>>>>>> >> - When we build SparkR, we should use the latest R version at least 4.0.0+.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> +1
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Bests,
>>>>>> >>
>>>>>> >> Dongjoon.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>>> >>
>>>>>> >> +1 on a 3.0.1 soon.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Probably it would be nice if some Scala experts can take a look at
>>>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the
>>>>>> fix into 3.0.1 if possible.
>>>>>> >>
>>>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>>>> ambiguity in Scala 2.12 & Java.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> +1 (non-binding)
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Sent from my iPhone
>>>>>> >>
>>>>>> >> Pardon the dumb thumb typos :)
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> +1 on a patch release soon
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> +1 on doing a new patch release soon. I saw some of these issues
>>>>>> when preparing the 3.0 release, and some of them are very serious.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>> >>
>>>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>>>>>> release soon.
>>>>>> >>
>>>>>> >> Shivaram
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>>>> linguin.m.s@gmail.com> wrote:
>>>>>> >>
>>>>>> >> Thanks for the heads-up, Yuanjian!
>>>>>> >>
>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>> >>
>>>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>> >>
>>>>>> >> Bests,
>>>>>> >> Takeshi
>>>>>> >>
>>>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <
>>>>>> xyliyuanjian@gmail.com> wrote:
>>>>>> >>
>>>>>> >> Hi dev-list,
>>>>>> >>
>>>>>> >> I’m writing this to raise the discussion about Spark 3.0.1
>>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>> >>
>>>>>> >> [SPARK-31990] Broken state store compatibility causes a correctness
>>>>>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>>>>>> written by an older Spark version.
>>>>>> >>
>>>>>> >> [SPARK-32038] A regression in handling NaN values in
>>>>>> COUNT(DISTINCT)
>>>>>> >>
>>>>>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0.
>>>>>> The 3.0 release supports only R [3.5, 4.0), which makes it unavailable
>>>>>> on CRAN.
>>>>>> >>
>>>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>>> regression
>>>>>> >>
>>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark
>>>>>> 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the
>>>>>> critical fixes.
>>>>>> >>
>>>>>> >> Any comments are appreciated.
>>>>>> >>
>>>>>> >> Best,
>>>>>> >>
>>>>>> >> Yuanjian
>>>>>> >>
>>>>>> >> --
>>>>>> >> ---
>>>>>> >> Takeshi Yamamuro
>>>>>> >>
>>>>>> >>
>>>>>> --------------------------------------------------------------------- To
>>>>>> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >>
>>>>>> >> Twitter: https://twitter.com/holdenkarau
>>>>>> >>
>>>>>> >> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9
>>>>>> >>
>>>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>> >
>>>>>> > --
>>>>>> > Twitter: https://twitter.com/holdenkarau
>>>>>> > Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9
>>>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>
>>>>>
>>>>
>>>> --
>>>> <https://databricks.com/sparkaisummit/north-america>
>>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Prashant Sharma <sc...@gmail.com>.
Hi Folks,

So, I am back. I searched JIRA for issues with target version "2.4.7" and
status Resolved, and found only 2. So, are we good to go with just a couple
of JIRAs fixed? Shall I proceed with making an RC?

Thanks,
Prashant

On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma <sc...@gmail.com> wrote:

> Thank you, Holden.
>
> Folks, My health has gone down a bit. So, I will start working on this in
> a few days. If this needs to be published sooner, then maybe someone else
> has to help out.
>
>
>
>
>
> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca> wrote:
>
>> I’m happy to have Prashant do 2.4.7 :)
>>
>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:
>>
>>> +1 on releasing both 3.0.1 and 2.4.7
>>>
>>> Great! Three committers volunteer to be a release manager. Ruifeng,
>>> Prashant and Holden. Holden just helped release Spark 2.4.6. This time,
>>> maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7
>>> respectively.
>>>
>>> Xiao
>>>
>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>>
>>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
>>>> yesterday, and if the report is valid it looks to be a blocker. I'll try to
>>>> take a look sooner.
>>>>
>>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>
>>>>> Thanks Holden -- it would be great to also get 2.4.7 started
>>>>>
>>>>> Thanks
>>>>> Shivaram
>>>>>
>>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
>>>>> wrote:
>>>>> >
>>>>> > I can take care of 2.4.7 unless someone else wants to do it.
>>>>> >
>>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>>>>> Jason.Moore@quantium.com.au> wrote:
>>>>> >>
>>>>> >> Hi all,
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Could I get some input on the severity of this one that I found
>>>>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>>>>> me know under the ticket if there’s more info that I can provide to help.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Thanks,
>>>>> >>
>>>>> >> Jason.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> From: Jungtaek Lim <ka...@gmail.com>
>>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <
>>>>> ruifengz@foxmail.com>, Gengliang Wang <ge...@databricks.com>,
>>>>> gurwls223 <gu...@gmail.com>, Dongjoon Hyun <
>>>>> dongjoon.hyun@gmail.com>, Jules Damji <dm...@comcast.net>, Holden
>>>>> Karau <ho...@pigscanfly.ca>, Reynold Xin <rx...@databricks.com>,
>>>>> Yuanjian Li <xy...@gmail.com>, "dev@spark.apache.org" <
>>>>> dev@spark.apache.org>, Takeshi Yamamuro <li...@gmail.com>
>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> SPARK-32130 [1] looks to be a performance regression introduced in
>>>>> Spark 3.0.0, which is ideal to look into before releasing another bugfix
>>>>> version.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>> >>
>>>>> >> Hi all
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> I just wanted to ping this thread to see if all the outstanding
>>>>> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
>>>>> the release going. The CRAN team sent us a note that the version of SparkR
>>>>> available on CRAN for the current R version (4.0.2) is broken and hence we
>>>>> need to update the package soon --  it will be great to do it with 3.0.1.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Thanks
>>>>> >>
>>>>> >> Shivaram
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
>>>>> scrapcodes@gmail.com> wrote:
>>>>> >>
>>>>> >> +1 for 3.0.1 release.
>>>>> >>
>>>>> >> I too can help out as release manager.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>>>> >>
>>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is working
>>>>> on this.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> ------------------ Original Message ------------------
>>>>> >>
>>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>>>>> >>
>>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>>>>> >>
>>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>>>> >>
>>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>>>>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>>>>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>>>>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>>>>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>>>>> Yamamuro"<li...@gmail.com>;
>>>>> >>
>>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> +1, the issues mentioned are really serious.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> +1.
>>>>> >>
>>>>> >> Just as a note,
>>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>>>>> >> - When we build SparkR, we should use the latest R version at least 4.0.0+.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> +1
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Bests,
>>>>> >>
>>>>> >> Dongjoon.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>> >>
>>>>> >> +1 on a 3.0.1 soon.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Probably it would be nice if some Scala experts can take a look at
>>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>>>>> into 3.0.1 if possible.
>>>>> >>
>>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>>> ambiguity in Scala 2.12 & Java.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>>> wrote:
>>>>> >>
>>>>> >> +1 (non-binding)
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Sent from my iPhone
>>>>> >>
>>>>> >> Pardon the dumb thumb typos :)
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>> wrote:
>>>>> >>
>>>>> >> +1 on a patch release soon
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> >>
>>>>> >> +1 on doing a new patch release soon. I saw some of these issues
>>>>> when preparing the 3.0 release, and some of them are very serious.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>> >>
>>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>>>>> release soon.
>>>>> >>
>>>>> >> Shivaram
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>>> linguin.m.s@gmail.com> wrote:
>>>>> >>
>>>>> >> Thanks for the heads-up, Yuanjian!
>>>>> >>
>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>> >>
>>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>>>>> >>
>>>>> >> Bests,
>>>>> >> Takeshi
>>>>> >>
>>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Hi dev-list,
>>>>> >>
>>>>> >> I’m writing this to raise the discussion about Spark 3.0.1
>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>> >>
>>>>> >> [SPARK-31990] Broken state store compatibility causes a correctness
>>>>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>>>>> written by an older Spark version.
>>>>> >>
>>>>> >> [SPARK-32038] A regression in handling NaN values in
>>>>> COUNT(DISTINCT)
>>>>> >>
>>>>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0.
>>>>> The 3.0 release supports only R [3.5, 4.0), which makes it unavailable
>>>>> on CRAN.
>>>>> >>
>>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>> regression
>>>>> >>
>>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>> I think it would be great if we have Spark 3.0.1 to deliver the critical
>>>>> fixes.
>>>>> >>
>>>>> >> Any comments are appreciated.
>>>>> >>
>>>>> >> Best,
>>>>> >>
>>>>> >> Yuanjian
>>>>> >>
>>>>> >> --
>>>>> >> ---
>>>>> >> Takeshi Yamamuro
>>>>> >>
>>>>> >>
>>>>> --------------------------------------------------------------------- To
>>>>> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >>
>>>>> >> Twitter: https://twitter.com/holdenkarau
>>>>> >>
>>>>> >> Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9
>>>>> >>
>>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>> >
>>>>> > --
>>>>> > Twitter: https://twitter.com/holdenkarau
>>>>> > Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9
>>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>
>>>>
>>>
>>> --
>>> <https://databricks.com/sparkaisummit/north-america>
>>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Prashant Sharma <sc...@gmail.com>.
Thank you, Holden.

Folks, My health has gone down a bit. So, I will start working on this in a
few days. If this needs to be published sooner, then maybe someone else has
to help out.





On Thu, Jul 2, 2020 at 10:11 AM Holden Karau <ho...@pigscanfly.ca> wrote:

> I’m happy to have Prashant do 2.4.7 :)
>
> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:
>
>> +1 on releasing both 3.0.1 and 2.4.7
>>
>> Great! Three committers volunteer to be a release manager. Ruifeng,
>> Prashant and Holden. Holden just helped release Spark 2.4.6. This time,
>> maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7
>> respectively.
>>
>> Xiao
>>
>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <ka...@gmail.com>
>> wrote:
>>
>>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
>>> yesterday, and if the report is valid it looks to be a blocker. I'll try to
>>> take a look sooner.
>>>
>>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
>>> shivaram@eecs.berkeley.edu> wrote:
>>>
>>>> Thanks Holden -- it would be great to also get 2.4.7 started
>>>>
>>>> Thanks
>>>> Shivaram
>>>>
>>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
>>>> wrote:
>>>> >
>>>> > I can take care of 2.4.7 unless someone else wants to do it.
>>>> >
>>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>>>> Jason.Moore@quantium.com.au> wrote:
>>>> >>
>>>> >> Hi all,
>>>> >>
>>>> >>
>>>> >>
>>>> >> Could I get some input on the severity of this one that I found
>>>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>>>> me know under the ticket if there’s more info that I can provide to help.
>>>> >>
>>>> >>
>>>> >>
>>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Jason.
>>>> >>
>>>> >>
>>>> >>
>>>> >> From: Jungtaek Lim <ka...@gmail.com>
>>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <
>>>> ruifengz@foxmail.com>, Gengliang Wang <ge...@databricks.com>,
>>>> gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>,
>>>> Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
>>>> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>,
>>>> "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
>>>> linguin.m.s@gmail.com>
>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>> >>
>>>> >>
>>>> >>
>>>> >> SPARK-32130 [1] looks to be a performance regression introduced in
>>>> Spark 3.0.0, which is ideal to look into before releasing another bugfix
>>>> version.
>>>> >>
>>>> >>
>>>> >>
>>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>>>> shivaram@eecs.berkeley.edu> wrote:
>>>> >>
>>>> >> Hi all
>>>> >>
>>>> >>
>>>> >>
>>>> >> I just wanted to ping this thread to see if all the outstanding
>>>> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
>>>> the release going. The CRAN team sent us a note that the version of SparkR
>>>> available on CRAN for the current R version (4.0.2) is broken and hence we
>>>> need to update the package soon --  it will be great to do it with 3.0.1.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks
>>>> >>
>>>> >> Shivaram
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
>>>> scrapcodes@gmail.com> wrote:
>>>> >>
>>>> >> +1 for 3.0.1 release.
>>>> >>
>>>> >> I too can help out as release manager.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>>> >>
>>>> >> I volunteer to be a release manager of 3.0.1, if nobody is working
>>>> on this.
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> ------------------ Original Message ------------------
>>>> >>
>>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>>>> >>
>>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>>>> >>
>>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>>> >>
>>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>>>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>>>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>>>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>>>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>>>> Yamamuro"<li...@gmail.com>;
>>>> >>
>>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>> >>
>>>> >>
>>>> >>
>>>> >> +1, the issues mentioned are really serious.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> +1.
>>>> >>
>>>> >> Just as a note,
>>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>>>> >> - When we build SparkR, we should use the latest R version at least 4.0.0+.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> +1
>>>> >>
>>>> >>
>>>> >>
>>>> >> Bests,
>>>> >>
>>>> >> Dongjoon.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>> >>
>>>> >> +1 on a 3.0.1 soon.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Probably it would be nice if some Scala experts can take a look at
>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>>>> into 3.0.1 if possible.
>>>> >>
>>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>> ambiguity in Scala 2.12 & Java.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>> wrote:
>>>> >>
>>>> >> +1 (non-binding)
>>>> >>
>>>> >>
>>>> >>
>>>> >> Sent from my iPhone
>>>> >>
>>>> >> Pardon the dumb thumb typos :)
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>> wrote:
>>>> >>
>>>> >> +1 on a patch release soon
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>> wrote:
>>>> >>
>>>> >> +1 on doing a new patch release soon. I saw some of these issues
>>>> when preparing the 3.0 release, and some of them are very serious.
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>> shivaram@eecs.berkeley.edu> wrote:
>>>> >>
>>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
>>>> soon.
>>>> >>
>>>> >> Shivaram
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>> linguin.m.s@gmail.com> wrote:
>>>> >>
>>>> >> Thanks for the heads-up, Yuanjian!
>>>> >>
>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>> >>
>>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>>>> >>
>>>> >> Bests,
>>>> >> Takeshi
>>>> >>
>>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> Hi dev-list,
>>>> >>
>>>> >> I’m writing this to raise the discussion about Spark 3.0.1
>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>> >>
>>>> >> [SPARK-31990] Broken state store compatibility causes a correctness
>>>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>>>> written by an older Spark version.
>>>> >>
>>>> >> [SPARK-32038] A regression in handling NaN values in
>>>> COUNT(DISTINCT)
>>>> >>
>>>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0.
>>>> The 3.0 release supports only R [3.5, 4.0), which makes it unavailable
>>>> on CRAN.
>>>> >>
>>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>>>> >>
>>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>> I think it would be great if we have Spark 3.0.1 to deliver the critical
>>>> fixes.
>>>> >>
>>>> >> Any comments are appreciated.
>>>> >>
>>>> >> Best,
>>>> >>
>>>> >> Yuanjian
>>>> >>
>>>> >> --
>>>> >> ---
>>>> >> Takeshi Yamamuro
>>>> >>
>>>> >>
>>>> --------------------------------------------------------------------- To
>>>> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >>
>>>> >> Twitter: https://twitter.com/holdenkarau
>>>> >>
>>>> >> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> >>
>>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> >
>>>> > --
>>>> > Twitter: https://twitter.com/holdenkarau
>>>> > Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>
>>>
>>
>> --
>> <https://databricks.com/sparkaisummit/north-america>
>>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Holden Karau <ho...@pigscanfly.ca>.
I’m happy to have Prashant do 2.4.7 :)

On Wed, Jul 1, 2020 at 9:40 PM Xiao Li <li...@databricks.com> wrote:

> +1 on releasing both 3.0.1 and 2.4.7
>
> Great! Three committers have volunteered to be release managers: Ruifeng,
> Prashant, and Holden. Holden just helped release Spark 2.4.6, so this time
> maybe Ruifeng and Prashant can be the release managers of 3.0.1 and 2.4.7,
> respectively.
>
> Xiao
>
> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <ka...@gmail.com>
> wrote:
>
>> https://issues.apache.org/jira/browse/SPARK-32148 was reported
>> yesterday, and if the report is valid it looks to be a blocker. I'll try to
>> take a look soon.
>>
>> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>>
>>> Thanks Holden -- it would be great to also get 2.4.7 started
>>>
>>> Thanks
>>> Shivaram
>>>
>>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>> >
>>> > I can take care of 2.4.7 unless someone else wants to do it.
>>> >
>>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>>> Jason.Moore@quantium.com.au> wrote:
>>> >>
>>> >> Hi all,
>>> >>
>>> >>
>>> >>
>>> >> Could I get some input on the severity of this one that I found
>>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>>> me know under the ticket if there’s more info that I can provide to help.
>>> >>
>>> >>
>>> >>
>>> >> https://issues.apache.org/jira/browse/SPARK-32136
>>> >>
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Jason.
>>> >>
>>> >>
>>> >>
>>> >> From: Jungtaek Lim <ka...@gmail.com>
>>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>,
>>> Gengliang Wang <ge...@databricks.com>, gurwls223 <
>>> gurwls223@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules
>>> Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
>>> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>,
>>> "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
>>> linguin.m.s@gmail.com>
>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>> >>
>>> >>
>>> >>
>>> >> SPARK-32130 [1] looks to be a performance regression introduced in
>>> Spark 3.0.0, which is ideal to look into before releasing another bugfix
>>> version.
>>> >>
>>> >>
>>> >>
>>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>>> shivaram@eecs.berkeley.edu> wrote:
>>> >>
>>> >> Hi all
>>> >>
>>> >>
>>> >>
>>> >> I just wanted to ping this thread to see if all the outstanding
>>> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
>>> the release going. The CRAN team sent us a note that the version of SparkR
>>> available on CRAN for the current R version (4.0.2) is broken and hence we
>>> need to update the package soon --  it will be great to do it with 3.0.1.
>>> >>
>>> >>
>>> >>
>>> >> Thanks
>>> >>
>>> >> Shivaram
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
>>> wrote:
>>> >>
>>> >> +1 for 3.0.1 release.
>>> >>
>>> >> I too can help out as release manager.
>>> >>
>>> >>
>>> >>
>>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>> >>
>>> >> I volunteer to be a release manager of 3.0.1, if nobody is working on
>>> this.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> ------------------ Original Message ------------------
>>> >>
>>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>>> >>
>>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>>> >>
>>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>> >>
>>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>>> Yamamuro"<li...@gmail.com>;
>>> >>
>>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>> >>
>>> >>
>>> >>
>>> >> +1, the issues mentioned are really serious.
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>> >>
>>> >> +1.
>>> >>
>>> >> Just as a note,
>>> >> - SPARK-31918 is fixed now, and there's no blocker.
>>> >> - When we build SparkR, we should use R 4.0.0 or later.
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
>>> wrote:
>>> >>
>>> >> +1
>>> >>
>>> >>
>>> >>
>>> >> Bests,
>>> >>
>>> >> Dongjoon.
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>> >>
>>> >> +1 on a 3.0.1 soon.
>>> >>
>>> >>
>>> >>
>>> >> Probably it would be nice if some Scala experts can take a look at
>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>>> into 3.0.1 if possible.
>>> >>
>>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>>> ambiguity in Scala 2.12 & Java.
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>> wrote:
>>> >>
>>> >> +1 (non-binding)
>>> >>
>>> >>
>>> >>
>>> >> Sent from my iPhone
>>> >>
>>> >> Pardon the dumb thumb typos :)
>>> >>
>>> >>
>>> >>
>>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>> >>
>>> >> +1 on a patch release soon
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> >>
>>> >> +1 on doing a new patch release soon. I saw some of these issues when
>>> preparing the 3.0 release, and some of them are very serious.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>> shivaram@eecs.berkeley.edu> wrote:
>>> >>
>>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
>>> soon.
>>> >>
>>> >> Shivaram
>>> >>
>>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>> linguin.m.s@gmail.com> wrote:
>>> >>
>>> >> Thanks for the heads-up, Yuanjian!
>>> >>
>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>> >>
>>> >> wow, the updates are so quick. Anyway, +1 for the release.
>>> >>
>>> >> Bests,
>>> >> Takeshi
>>> >>
>>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>>> wrote:
>>> >>
>>> >> Hi dev-list,
>>> >>
>>> >> I’m writing this to raise the discussion about Spark 3.0.1
>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>> >>
>>> >> [SPARK-31990] Broken state store compatibility causes a correctness
>>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>>> written by an older Spark version.
>>> >>
>>> >> [SPARK-32038] A regression in handling NaN values in
>>> COUNT(DISTINCT)
>>> >>
>>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R
>>> 4.0. The 3.0 release is unavailable on CRAN and only supports R
>>> [3.5, 4.0)
>>> >>
>>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>>> >>
>>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I
>>> think it would be great if we have Spark 3.0.1 to deliver the critical
>>> fixes.
>>> >>
>>> >> Any comments are appreciated.
>>> >>
>>> >> Best,
>>> >>
>>> >> Yuanjian
>>> >>
>>> >> --
>>> >> ---
>>> >> Takeshi Yamamuro
>>> >>
>>> >> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >>
>>> >> Twitter: https://twitter.com/holdenkarau
>>> >>
>>> >> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9
>>> >>
>>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>> >
>>> > --
>>> > Twitter: https://twitter.com/holdenkarau
>>> > Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9
>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>
> --
> <https://databricks.com/sparkaisummit/north-america>
>
-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Xiao Li <li...@databricks.com>.
+1 on releasing both 3.0.1 and 2.4.7

Great! Three committers have volunteered to be release managers: Ruifeng,
Prashant, and Holden. Holden just helped release Spark 2.4.6, so this time
maybe Ruifeng and Prashant can be the release managers of 3.0.1 and 2.4.7,
respectively.

Xiao

On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <ka...@gmail.com>
wrote:

> https://issues.apache.org/jira/browse/SPARK-32148 was reported yesterday,
> and if the report is valid it looks to be a blocker. I'll try to take a
> look soon.
>
> On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
>
>> Thanks Holden -- it would be great to also get 2.4.7 started
>>
>> Thanks
>> Shivaram
>>
>> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>> >
>> > I can take care of 2.4.7 unless someone else wants to do it.
>> >
>> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
>> Jason.Moore@quantium.com.au> wrote:
>> >>
>> >> Hi all,
>> >>
>> >>
>> >>
>> >> Could I get some input on the severity of this one that I found
>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>> me know under the ticket if there’s more info that I can provide to help.
>> >>
>> >>
>> >>
>> >> https://issues.apache.org/jira/browse/SPARK-32136
>> >>
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> Jason.
>> >>
>> >>
>> >>
>> >> From: Jungtaek Lim <ka...@gmail.com>
>> >> Date: Wednesday, 1 July 2020 at 10:20 am
>> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>,
>> Gengliang Wang <ge...@databricks.com>, gurwls223 <
>> gurwls223@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules
>> Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
>> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>,
>> "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
>> linguin.m.s@gmail.com>
>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>> >>
>> >>
>> >>
>> >> SPARK-32130 [1] looks to be a performance regression introduced in
>> Spark 3.0.0, which is ideal to look into before releasing another bugfix
>> version.
>> >>
>> >>
>> >>
>> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
>> >>
>> >>
>> >>
>> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>> >>
>> >> Hi all
>> >>
>> >>
>> >>
>> >> I just wanted to ping this thread to see if all the outstanding
>> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
>> the release going. The CRAN team sent us a note that the version of SparkR
>> available on CRAN for the current R version (4.0.2) is broken and hence we
>> need to update the package soon --  it will be great to do it with 3.0.1.
>> >>
>> >>
>> >>
>> >> Thanks
>> >>
>> >> Shivaram
>> >>
>> >>
>> >>
>> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
>> wrote:
>> >>
>> >> +1 for 3.0.1 release.
>> >>
>> >> I too can help out as release manager.
>> >>
>> >>
>> >>
>> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>> >>
>> >> I volunteer to be a release manager of 3.0.1, if nobody is working on
>> this.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> ------------------ Original Message ------------------
>> >>
>> >> From: "Gengliang Wang"<ge...@databricks.com>;
>> >>
>> >> Sent: Wednesday, June 24, 2020, 4:15 PM
>> >>
>> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
>> >>
>> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>> Yamamuro"<li...@gmail.com>;
>> >>
>> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>> >>
>> >>
>> >>
>> >> +1, the issues mentioned are really serious.
>> >>
>> >>
>> >>
>> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>> >>
>> >> +1.
>> >>
>> >> Just as a note,
>> >> - SPARK-31918 is fixed now, and there's no blocker.
>> >> - When we build SparkR, we should use R 4.0.0 or later.
>> >>
>> >>
>> >>
>> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
>> wrote:
>> >>
>> >> +1
>> >>
>> >>
>> >>
>> >> Bests,
>> >>
>> >> Dongjoon.
>> >>
>> >>
>> >>
>> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>
>> >> +1 on a 3.0.1 soon.
>> >>
>> >>
>> >>
>> >> Probably it would be nice if some Scala experts can take a look at
>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>> into 3.0.1 if possible.
>> >>
>> >> Looks like APIs designed to work with Scala 2.11 & Java bring
>> ambiguity in Scala 2.12 & Java.
>> >>
>> >>
>> >>
>> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>> wrote:
>> >>
>> >> +1 (non-binding)
>> >>
>> >>
>> >>
>> >> Sent from my iPhone
>> >>
>> >> Pardon the dumb thumb typos :)
>> >>
>> >>
>> >>
>> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>> >>
>> >> +1 on a patch release soon
>> >>
>> >>
>> >>
>> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>> wrote:
>> >>
>> >> +1 on doing a new patch release soon. I saw some of these issues when
>> preparing the 3.0 release, and some of them are very serious.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>> >>
>> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
>> soon.
>> >>
>> >> Shivaram
>> >>
>> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>> linguin.m.s@gmail.com> wrote:
>> >>
>> >> Thanks for the heads-up, Yuanjian!
>> >>
>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>> >>
>> >> wow, the updates are so quick. Anyway, +1 for the release.
>> >>
>> >> Bests,
>> >> Takeshi
>> >>
>> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>> wrote:
>> >>
>> >> Hi dev-list,
>> >>
>> >> I’m writing this to raise the discussion about Spark 3.0.1 feasibility
>> since 4 blocker issues were found after Spark 3.0.0:
>> >>
>> >> [SPARK-31990] Broken state store compatibility causes a correctness
>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>> written by an older Spark version.
>> >>
>> >> [SPARK-32038] A regression in handling NaN values in
>> COUNT(DISTINCT)
>> >>
>> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R
>> 4.0. The 3.0 release is unavailable on CRAN and only supports R
>> [3.5, 4.0)
>> >>
>> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>> >>
>> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I
>> think it would be great if we have Spark 3.0.1 to deliver the critical
>> fixes.
>> >>
>> >> Any comments are appreciated.
>> >>
>> >> Best,
>> >>
>> >> Yuanjian
>> >>
>> >> --
>> >> ---
>> >> Takeshi Yamamuro
>> >>
>> >> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Twitter: https://twitter.com/holdenkarau
>> >>
>> >> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> >>
>> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>> >
>> > --
>> > Twitter: https://twitter.com/holdenkarau
>> > Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>

-- 
<https://databricks.com/sparkaisummit/north-america>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Jungtaek Lim <ka...@gmail.com>.
https://issues.apache.org/jira/browse/SPARK-32148 was reported yesterday,
and if the report is valid it looks to be a blocker. I'll try to take a
look soon.

On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> Thanks Holden -- it would be great to also get 2.4.7 started
>
> Thanks
> Shivaram
>
> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca>
> wrote:
> >
> > I can take care of 2.4.7 unless someone else wants to do it.
> >
> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <Ja...@quantium.com.au>
> wrote:
> >>
> >> Hi all,
> >>
> >>
> >>
> >> Could I get some input on the severity of this one that I found
> yesterday?  If that’s a correctness issue, should it block this patch?  Let
> me know under the ticket if there’s more info that I can provide to help.
> >>
> >>
> >>
> >> https://issues.apache.org/jira/browse/SPARK-32136
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Jason.
> >>
> >>
> >>
> >> From: Jungtaek Lim <ka...@gmail.com>
> >> Date: Wednesday, 1 July 2020 at 10:20 am
> >> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
> >> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>,
> Gengliang Wang <ge...@databricks.com>, gurwls223 <
> gurwls223@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules
> Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>, Reynold
> Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "
> dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
> linguin.m.s@gmail.com>
> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>
> >>
> >>
> >> SPARK-32130 [1] looks to be a performance regression introduced in
> Spark 3.0.0, which is ideal to look into before releasing another bugfix
> version.
> >>
> >>
> >>
> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
> >>
> >>
> >>
> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
> >>
> >> Hi all
> >>
> >>
> >>
> >> I just wanted to ping this thread to see if all the outstanding
> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
> the release going. The CRAN team sent us a note that the version of SparkR
> available on CRAN for the current R version (4.0.2) is broken and hence we
> need to update the package soon --  it will be great to do it with 3.0.1.
> >>
> >>
> >>
> >> Thanks
> >>
> >> Shivaram
> >>
> >>
> >>
> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
> wrote:
> >>
> >> +1 for 3.0.1 release.
> >>
> >> I too can help out as release manager.
> >>
> >>
> >>
> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
> >>
> >> I volunteer to be a release manager of 3.0.1, if nobody is working on
> this.
> >>
> >>
> >>
> >>
> >>
> >> ------------------ Original Message ------------------
> >>
> >> From: "Gengliang Wang"<ge...@databricks.com>;
> >>
> >> Sent: Wednesday, June 24, 2020, 4:15 PM
> >>
> >> To: "Hyukjin Kwon"<gu...@gmail.com>;
> >>
> >> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
> Yamamuro"<li...@gmail.com>;
> >>
> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>
> >>
> >>
> >> +1, the issues mentioned are really serious.
> >>
> >>
> >>
> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >>
> >> +1.
> >>
> >> Just as a note,
> >> - SPARK-31918 is fixed now, and there's no blocker.
> >> - When we build SparkR, we should use R 4.0.0 or later.
> >>
> >>
> >>
> >> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com>
> wrote:
> >>
> >> +1
> >>
> >>
> >>
> >> Bests,
> >>
> >> Dongjoon.
> >>
> >>
> >>
> >> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>
> >> +1 on a 3.0.1 soon.
> >>
> >>
> >>
> >> Probably it would be nice if some Scala experts can take a look at
> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
> into 3.0.1 if possible.
> >>
> >> Looks like APIs designed to work with Scala 2.11 & Java bring ambiguity
> in Scala 2.12 & Java.
> >>
> >>
> >>
> >> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
> wrote:
> >>
> >> +1 (non-binding)
> >>
> >>
> >>
> >> Sent from my iPhone
> >>
> >> Pardon the dumb thumb typos :)
> >>
> >>
> >>
> >> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
> >>
> >> +1 on a patch release soon
> >>
> >>
> >>
> >> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
> wrote:
> >>
> >> +1 on doing a new patch release soon. I saw some of these issues when
> preparing the 3.0 release, and some of them are very serious.
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
> >>
> >> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
> soon.
> >>
> >> Shivaram
> >>
> >> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com>
> wrote:
> >>
> >> Thanks for the heads-up, Yuanjian!
> >>
> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
> >>
> >> wow, the updates are so quick. Anyway, +1 for the release.
> >>
> >> Bests,
> >> Takeshi
> >>
> >> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
> wrote:
> >>
> >> Hi dev-list,
> >>
> >> I’m writing this to raise the discussion about Spark 3.0.1 feasibility
> since 4 blocker issues were found after Spark 3.0.0:
> >>
> >> [SPARK-31990] Broken state store compatibility causes a correctness
> issue when a streaming query with `dropDuplicates` uses a checkpoint
> written by an older Spark version.
> >>
> >> [SPARK-32038] A regression in handling NaN values in
> COUNT(DISTINCT)
> >>
> >> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R
> 4.0. The 3.0 release is unavailable on CRAN and only supports R
> [3.5, 4.0)
> >>
> >> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
> >>
> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I
> think it would be great if we have Spark 3.0.1 to deliver the critical
> fixes.
> >>
> >> Any comments are appreciated.
> >>
> >> Best,
> >>
> >> Yuanjian
> >>
> >> --
> >> ---
> >> Takeshi Yamamuro
> >>
> >> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >>
> >> Twitter: https://twitter.com/holdenkarau
> >>
> >> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> >>
> >> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
> > Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
Thanks Holden -- it would be great to also get 2.4.7 started

Thanks
Shivaram

On Tue, Jun 30, 2020 at 10:31 PM Holden Karau <ho...@pigscanfly.ca> wrote:
>
> I can take care of 2.4.7 unless someone else wants to do it.
>
> On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <Ja...@quantium.com.au> wrote:
>>
>> Hi all,
>>
>>
>>
>> Could I get some input on the severity of this one that I found yesterday?  If that’s a correctness issue, should it block this patch?  Let me know under the ticket if there’s more info that I can provide to help.
>>
>>
>>
>> https://issues.apache.org/jira/browse/SPARK-32136
>>
>>
>>
>> Thanks,
>>
>> Jason.
>>
>>
>>
>> From: Jungtaek Lim <ka...@gmail.com>
>> Date: Wednesday, 1 July 2020 at 10:20 am
>> To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>> Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>, Gengliang Wang <ge...@databricks.com>, gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>, Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <li...@gmail.com>
>> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>>
>>
>> SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which is ideal to look into before releasing another bugfix version.
>>
>>
>>
>> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>
>>
>>
>> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
>>
>> Hi all
>>
>>
>>
>> I just wanted to ping this thread to see if all the outstanding blockers for 3.0.1 have been fixed. If so, it would be great if we can get the release going. The CRAN team sent us a note that the version of SparkR available on CRAN for the current R version (4.0.2) is broken and hence we need to update the package soon --  it will be great to do it with 3.0.1.
>>
>>
>>
>> Thanks
>>
>> Shivaram
>>
>>
>>
>> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com> wrote:
>>
>> +1 for 3.0.1 release.
>>
>> I too can help out as release manager.
>>
>>
>>
>> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>
>> I volunteer to be a release manager of 3.0.1, if nobody is working on this.
>>
>>
>>
>>
>>
>> ------------------ Original Message ------------------
>>
>> From: "Gengliang Wang"<ge...@databricks.com>;
>>
>> Sent: Wednesday, June 24, 2020, 4:15 PM
>>
>> To: "Hyukjin Kwon"<gu...@gmail.com>;
>>
>> Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<ka...@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<xy...@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi Yamamuro"<li...@gmail.com>;
>>
>> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>>
>>
>> +1, the issues mentioned are really serious.
>>
>>
>>
>> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>> +1.
>>
>> Just as a note,
>> - SPARK-31918 is fixed now, and there's no blocker.
>> - When we build SparkR, we should use R 4.0.0 or later.
>>
>>
>>
>> On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <do...@gmail.com> wrote:
>>
>> +1
>>
>>
>>
>> Bests,
>>
>> Dongjoon.
>>
>>
>>
>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <ka...@gmail.com> wrote:
>>
>> +1 on a 3.0.1 soon.
>>
>>
>>
>> Probably it would be nice if some Scala experts can take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix into 3.0.1 if possible.
>>
>> Looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in Scala 2.12 & Java.
>>
>>
>>
>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net> wrote:
>>
>> +1 (non-binding)
>>
>>
>>
>> Sent from my iPhone
>>
>> Pardon the dumb thumb typos :)
>>
>>
>>
>> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca> wrote:
>>
>> +1 on a patch release soon
>>
>>
>>
>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com> wrote:
>>
>> +1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.
>>
>>
>>
>>
>>
>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
>>
>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.
>>
>> Shivaram
>>
>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com> wrote:
>>
>> Thanks for the heads-up, Yuanjian!
>>
>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>
>> wow, the updates are so quick. Anyway, +1 for the release.
>>
>> Bests,
>> Takeshi
>>
>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com> wrote:
>>
>> Hi dev-list,
>>
>> I’m writing this to raise the discussion about Spark 3.0.1 feasibility since 4 blocker issues were found after Spark 3.0.0:
>>
>> [SPARK-31990] Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by an older Spark version.
>>
>> [SPARK-32038] A regression in handling NaN values in COUNT(DISTINCT)
>>
>> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. The 3.0 release is unavailable on CRAN and only supports R [3.5, 4.0)
>>
>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>>
>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the critical fixes.
>>
>> Any comments are appreciated.
>>
>> Best,
>>
>> Yuanjian
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>> --------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>
>>
>>
>>
>>
>>
>> --
>>
>> Twitter: https://twitter.com/holdenkarau
>>
>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Wenchen Fan <cl...@gmail.com>.
Hi Jason,

Thanks for reporting! https://issues.apache.org/jira/browse/SPARK-32136 looks
like a breaking change and we should investigate.

On Wed, Jul 1, 2020 at 11:31 AM Holden Karau <ho...@pigscanfly.ca> wrote:

> I can take care of 2.4.7 unless someone else wants to do it.
>
> On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <Ja...@quantium.com.au>
> wrote:
>
>> Hi all,
>>
>>
>>
>> Could I get some input on the severity of this one that I found
>> yesterday?  If that’s a correctness issue, should it block this patch?  Let
>> me know under the ticket if there’s more info that I can provide to help.
>>
>>
>>
>> https://issues.apache.org/jira/browse/SPARK-32136
>>
>>
>>
>> Thanks,
>>
>> Jason.
>>
>>
>>
>> *From: *Jungtaek Lim <ka...@gmail.com>
>> *Date: *Wednesday, 1 July 2020 at 10:20 am
>> *To: *Shivaram Venkataraman <sh...@eecs.berkeley.edu>
>> *Cc: *Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>,
>> Gengliang Wang <ge...@databricks.com>, gurwls223 <
>> gurwls223@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules
>> Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>,
>> Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>,
>> "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
>> linguin.m.s@gmail.com>
>> *Subject: *Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>>
>>
>> SPARK-32130 [1] looks to be a performance regression introduced in Spark
>> 3.0.0, which would be good to look into before releasing another bugfix version.
>>
>>
>>
>> 1. https://issues.apache.org/jira/browse/SPARK-32130
>>
>>
>>
>> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>>
>> Hi all
>>
>>
>>
>> I just wanted to ping this thread to see if all the outstanding blockers
>> for 3.0.1 have been fixed. If so, it would be great if we can get the
>> release going. The CRAN team sent us a note that the version of SparkR
>> available on CRAN for the current R version (4.0.2) is broken, and hence we
>> need to update the package soon -- it would be great to do it with 3.0.1.
>>
>>
>>
>> Thanks
>>
>> Shivaram
>>
>>
>>
>> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
>> wrote:
>>
>> +1 for 3.0.1 release.
>>
>> I too can help out as release manager.
>>
>>
>>
>> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>
>> I volunteer to be a release manager of 3.0.1, if nobody is working on
>> this.
>>
>>
>>
>>
>>
>> ------------------ Original Message ------------------
>>
>> *From:* "Gengliang Wang"<ge...@databricks.com>;
>>
>> *Sent:* Wednesday, June 24, 2020, 4:15 PM
>>
>> *To:* "Hyukjin Kwon"<gu...@gmail.com>;
>>
>> *Cc:* "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>> Yamamuro"<li...@gmail.com>;
>>
>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>>
>>
>> +1, the issues mentioned are really serious.
>>
>>
>>
>> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>> +1.
>>
>> Just as a note,
>> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
>> fixed now, and there's no blocker.
>> - When we build SparkR, we should use the latest R version, at least
>> 4.0.0.
>>
>>
>>
>> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
>>
>> +1
>>
>>
>>
>> Bests,
>>
>> Dongjoon.
>>
>>
>>
>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>>
>> +1 on a 3.0.1 soon.
>>
>>
>>
>> Probably it would be nice if some Scala experts can take a look at
>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>> into 3.0.1 if possible.
>>
>> Looks like APIs designed to work with Scala 2.11 & Java bring
>> ambiguity in Scala 2.12 & Java.
>>
>>
>>
>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net> wrote:
>>
>> +1 (non-binding)
>>
>>
>>
>> Sent from my iPhone
>>
>> Pardon the dumb thumb typos :)
>>
>>
>>
>> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca> wrote:
>>
>> +1 on a patch release soon
>>
>>
>>
>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com> wrote:
>>
>> +1 on doing a new patch release soon. I saw some of these issues when
>> preparing the 3.0 release, and some of them are very serious.
>>
>>
>>
>>
>>
>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>> shivaram@eecs.berkeley.edu> wrote:
>>
>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
>> soon.
>>
>> Shivaram
>>
>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com>
>> wrote:
>>
>> Thanks for the heads-up, Yuanjian!
>>
>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>
>> wow, the updates are so quick. Anyway, +1 for the release.
>>
>> Bests,
>> Takeshi
>>
>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>> wrote:
>>
>> Hi dev-list,
>>
>> I’m writing this to raise the discussion about Spark 3.0.1 feasibility
>> since 4 blocker issues were found after Spark 3.0.0:
>>
>> [SPARK-31990] Broken state store compatibility causes a correctness
>> issue when a streaming query with `dropDuplicates` uses a checkpoint
>> written by an older Spark version.
>>
>> [SPARK-32038] A regression in handling NaN values in
>> COUNT(DISTINCT)
>>
>> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R
>> 4.0. Until this is fixed, the 3.0 release is unavailable on CRAN and
>> only supports R [3.5, 4.0)
>>
>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>>
>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I
>> think it would be great if we have Spark 3.0.1 to deliver the critical
>> fixes.
>>
>> Any comments are appreciated.
>>
>> Best,
>>
>> Yuanjian
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>> --------------------------------------------------------------------- To
>> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>
>>
>>
>>
>>
>>
>> --
>>
>> Twitter: https://twitter.com/holdenkarau
>>
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Holden Karau <ho...@pigscanfly.ca>.
I can take care of 2.4.7 unless someone else wants to do it.

On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <Ja...@quantium.com.au>
wrote:

> Hi all,
>
>
>
> Could I get some input on the severity of this one that I found
> yesterday?  If that’s a correctness issue, should it block this patch?  Let
> me know under the ticket if there’s more info that I can provide to help.
>
>
>
> https://issues.apache.org/jira/browse/SPARK-32136
>
>
>
> Thanks,
>
> Jason.
>
>
>
> *From: *Jungtaek Lim <ka...@gmail.com>
> *Date: *Wednesday, 1 July 2020 at 10:20 am
> *To: *Shivaram Venkataraman <sh...@eecs.berkeley.edu>
> *Cc: *Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>,
> Gengliang Wang <ge...@databricks.com>, gurwls223 <
> gurwls223@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules
> Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>, Reynold
> Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "
> dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <
> linguin.m.s@gmail.com>
> *Subject: *Re: [DISCUSS] Apache Spark 3.0.1 Release
>
>
>
> SPARK-32130 [1] looks to be a performance regression introduced in Spark
> 3.0.0, which would be good to look into before releasing another bugfix version.
>
>
>
> 1. https://issues.apache.org/jira/browse/SPARK-32130
>
>
>
> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
>
> Hi all
>
>
>
> I just wanted to ping this thread to see if all the outstanding blockers
> for 3.0.1 have been fixed. If so, it would be great if we can get the
> release going. The CRAN team sent us a note that the version of SparkR
> available on CRAN for the current R version (4.0.2) is broken, and hence we
> need to update the package soon -- it would be great to do it with 3.0.1.
>
>
>
> Thanks
>
> Shivaram
>
>
>
> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
> wrote:
>
> +1 for 3.0.1 release.
>
> I too can help out as release manager.
>
>
>
> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>
> I volunteer to be a release manager of 3.0.1, if nobody is working on this.
>
>
>
>
>
> ------------------ Original Message ------------------
>
> *From:* "Gengliang Wang"<ge...@databricks.com>;
>
> *Sent:* Wednesday, June 24, 2020, 4:15 PM
>
> *To:* "Hyukjin Kwon"<gu...@gmail.com>;
>
> *Cc:* "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
> Yamamuro"<li...@gmail.com>;
>
> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>
>
>
> +1, the issues mentioned are really serious.
>
>
>
> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> +1.
>
> Just as a note,
> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
> fixed now, and there's no blocker.
> - When we build SparkR, we should use the latest R version, at least
> 4.0.0.
>
>
>
> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
>
> +1
>
>
>
> Bests,
>
> Dongjoon.
>
>
>
> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <ka...@gmail.com>
> wrote:
>
> +1 on a 3.0.1 soon.
>
>
>
> Probably it would be nice if some Scala experts can take a look at
> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
> into 3.0.1 if possible.
>
> Looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in
> Scala 2.12 & Java.
>
>
>
> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net> wrote:
>
> +1 (non-binding)
>
>
>
> Sent from my iPhone
>
> Pardon the dumb thumb typos :)
>
>
>
> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca> wrote:
>
> +1 on a patch release soon
>
>
>
> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com> wrote:
>
> +1 on doing a new patch release soon. I saw some of these issues when
> preparing the 3.0 release, and some of them are very serious.
>
>
>
>
>
> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
>
> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.
>
> Shivaram
>
> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com>
> wrote:
>
> Thanks for the heads-up, Yuanjian!
>
> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>
> wow, the updates are so quick. Anyway, +1 for the release.
>
> Bests,
> Takeshi
>
> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
> wrote:
>
> Hi dev-list,
>
> I’m writing this to raise the discussion about Spark 3.0.1 feasibility
> since 4 blocker issues were found after Spark 3.0.0:
>
> [SPARK-31990] Broken state store compatibility causes a correctness
> issue when a streaming query with `dropDuplicates` uses a checkpoint
> written by an older Spark version.
>
> [SPARK-32038] A regression in handling NaN values in COUNT(DISTINCT)
>
> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0.
> Until this is fixed, the 3.0 release is unavailable on CRAN and only
> supports R [3.5, 4.0)
>
> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression
>
> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I
> think it would be great if we have Spark 3.0.1 to deliver the critical
> fixes.
>
> Any comments are appreciated.
>
> Best,
>
> Yuanjian
>
> --
> ---
> Takeshi Yamamuro
>
> --------------------------------------------------------------------- To
> unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
>
>
>
>
>
> --
>
> Twitter: https://twitter.com/holdenkarau
>
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Jason Moore <Ja...@quantium.com.au.INVALID>.
Hi all,

Could I get some input on the severity of this one that I found yesterday?  If that’s a correctness issue, should it block this patch?  Let me know under the ticket if there’s more info that I can provide to help.

https://issues.apache.org/jira/browse/SPARK-32136

Thanks,
Jason.

From: Jungtaek Lim <ka...@gmail.com>
Date: Wednesday, 1 July 2020 at 10:20 am
To: Shivaram Venkataraman <sh...@eecs.berkeley.edu>
Cc: Prashant Sharma <sc...@gmail.com>, 郑瑞峰 <ru...@foxmail.com>, Gengliang Wang <ge...@databricks.com>, gurwls223 <gu...@gmail.com>, Dongjoon Hyun <do...@gmail.com>, Jules Damji <dm...@comcast.net>, Holden Karau <ho...@pigscanfly.ca>, Reynold Xin <rx...@databricks.com>, Yuanjian Li <xy...@gmail.com>, "dev@spark.apache.org" <de...@spark.apache.org>, Takeshi Yamamuro <li...@gmail.com>
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which would be good to look into before releasing another bugfix version.

1. https://issues.apache.org/jira/browse/SPARK-32130

On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
Hi all

I just wanted to ping this thread to see if all the outstanding blockers for 3.0.1 have been fixed. If so, it would be great if we can get the release going. The CRAN team sent us a note that the version of SparkR available on CRAN for the current R version (4.0.2) is broken, and hence we need to update the package soon -- it would be great to do it with 3.0.1.

Thanks
Shivaram

On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com> wrote:
+1 for 3.0.1 release.
I too can help out as release manager.

On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
I volunteer to be a release manager of 3.0.1, if nobody is working on this.


------------------ Original Message ------------------
From: "Gengliang Wang"<ge...@databricks.com>;
Sent: Wednesday, June 24, 2020, 4:15 PM
To: "Hyukjin Kwon"<gu...@gmail.com>;
Cc: "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<ka...@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<xy...@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi Yamamuro"<li...@gmail.com>;
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

+1, the issues mentioned are really serious.

On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
+1.

Just as a note,
- SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is fixed now, and there's no blocker.
- When we build SparkR, we should use the latest R version, at least 4.0.0.

On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
+1

Bests,
Dongjoon.

On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <ka...@gmail.com> wrote:
+1 on a 3.0.1 soon.

Probably it would be nice if some Scala experts can take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix into 3.0.1 if possible.
Looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in Scala 2.12 & Java.

On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net> wrote:
+1 (non-binding)

Sent from my iPhone
Pardon the dumb thumb typos :)


On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca> wrote:
+1 on a patch release soon

On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com> wrote:
+1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.


On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:

+1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.

Shivaram

On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <li...@gmail.com> wrote:

Thanks for the heads-up, Yuanjian!

I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.

wow, the updates are so quick. Anyway, +1 for the release.

Bests,
Takeshi

On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com> wrote:

Hi dev-list,

I’m writing this to raise the discussion about Spark 3.0.1 feasibility since 4 blocker issues were found after Spark 3.0.0:

[SPARK-31990] Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by an older Spark version.

[SPARK-32038] A regression in handling NaN values in COUNT(DISTINCT)

[SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. Until this is fixed, the 3.0 release is unavailable on CRAN and only supports R [3.5, 4.0)

[SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression

I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the critical fixes.

Any comments are appreciated.

Best,

Yuanjian

--
---
Takeshi Yamamuro

--------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscribe@spark.apache.org



--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Jungtaek Lim <ka...@gmail.com>.
SPARK-32130 [1] looks to be a performance regression introduced in Spark
3.0.0, which would be good to look into before releasing another bugfix version.

1. https://issues.apache.org/jira/browse/SPARK-32130

On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> Hi all
>
> I just wanted to ping this thread to see if all the outstanding blockers
> for 3.0.1 have been fixed. If so, it would be great if we can get the
> release going. The CRAN team sent us a note that the version of SparkR
> available on CRAN for the current R version (4.0.2) is broken, and hence we
> need to update the package soon -- it would be great to do it with 3.0.1.
>
> Thanks
> Shivaram
>
> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
> wrote:
>
>> +1 for 3.0.1 release.
>> I too can help out as release manager.
>>
>> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>>
>>> I volunteer to be a release manager of 3.0.1, if nobody is working on
>>> this.
>>>
>>>
>>> ------------------ Original Message ------------------
>>> *From:* "Gengliang Wang"<ge...@databricks.com>;
>>> *Sent:* Wednesday, June 24, 2020, 4:15 PM
>>> *To:* "Hyukjin Kwon"<gu...@gmail.com>;
>>> *Cc:* "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>>> Yamamuro"<li...@gmail.com>;
>>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>
>>> +1, the issues mentioned are really serious.
>>>
>>> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> +1.
>>>>
>>>> Just as a note,
>>>> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
>>>> fixed now, and there's no blocker.
>>>> - When we build SparkR, we should use the latest R version, at least
>>>> 4.0.0.
>>>>
>>>> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Bests,
>>>>> Dongjoon.
>>>>>
>>>>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>>
>>>>>> +1 on a 3.0.1 soon.
>>>>>>
>>>>>> Probably it would be nice if some Scala experts can take a look at
>>>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the
>>>>>> fix into 3.0.1 if possible.
>>>>>> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>>>> ambiguity in Scala 2.12 & Java.
>>>>>>
>>>>>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>> Pardon the dumb thumb typos :)
>>>>>>>
>>>>>>> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>>>> wrote:
>>>>>>>
>>>>>>> +1 on a patch release soon
>>>>>>>
>>>>>>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1 on doing a new patch release soon. I saw some of these issues
>>>>>>>> when preparing the 3.0 release, and some of them are very serious.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>>>>
>>>>>>>>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>>>>>>>>> release soon.
>>>>>>>>>
>>>>>>>>> Shivaram
>>>>>>>>>
>>>>>>>>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>>>>>>> linguin.m.s@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Thanks for the heads-up, Yuanjian!
>>>>>>>>>
>>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark
>>>>>>>>> 3.0.0.
>>>>>>>>>
>>>>>>>>> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>>>>>
>>>>>>>>> Bests,
>>>>>>>>> Takeshi
>>>>>>>>>
>>>>>>>>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <
>>>>>>>>> xyliyuanjian@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi dev-list,
>>>>>>>>>
>>>>>>>>> I’m writing this to raise the discussion about Spark 3.0.1
>>>>>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>>>>>
>>>>>>>>> [SPARK-31990] Broken state store compatibility causes a
>>>>>>>>> correctness issue when a streaming query with `dropDuplicates` uses a
>>>>>>>>> checkpoint written by an older Spark version.
>>>>>>>>>
>>>>>>>>> [SPARK-32038] A regression in handling NaN values in
>>>>>>>>> COUNT(DISTINCT)
>>>>>>>>>
>>>>>>>>> [SPARK-31918][WIP] CRAN requires SparkR to work with the
>>>>>>>>> latest R 4.0. Until this is fixed, the 3.0 release is unavailable on
>>>>>>>>> CRAN and only supports R [3.5, 4.0)
>>>>>>>>>
>>>>>>>>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>>>>>> regression
>>>>>>>>>
>>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark
>>>>>>>>> 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the
>>>>>>>>> critical fixes.
>>>>>>>>>
>>>>>>>>> Any comments are appreciated.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Yuanjian
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ---
>>>>>>>>> Takeshi Yamamuro
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>
>>>>>>>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
Hi all

I just wanted to ping this thread to see if all the outstanding blockers
for 3.0.1 have been fixed. If so, it would be great if we can get the
release going. The CRAN team sent us a note that the version of SparkR
available on CRAN for the current R version (4.0.2) is broken, and hence we
need to update the package soon -- it would be great to do it with 3.0.1.

Thanks
Shivaram

On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <sc...@gmail.com>
wrote:

> +1 for 3.0.1 release.
> I too can help out as release manager.
>
> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:
>
>> I volunteer to be a release manager of 3.0.1, if nobody is working on
>> this.
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Gengliang Wang"<ge...@databricks.com>;
>> *Sent:* Wednesday, June 24, 2020, 4:15 PM
>> *To:* "Hyukjin Kwon"<gu...@gmail.com>;
>> *Cc:* "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
>> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
>> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
>> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
>> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
>> Yamamuro"<li...@gmail.com>;
>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>
>> +1, the issues mentioned are really serious.
>>
>> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> +1.
>>>
>>> Just as a note,
>>> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
>>> fixed now, and there's no blocker.
>>> - When we build SparkR, we should use the latest R version, at least
>>> 4.0.0.
>>>
>>> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> Bests,
>>>> Dongjoon.
>>>>
>>>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>>
>>>>> +1 on a 3.0.1 soon.
>>>>>
>>>>> Probably it would be nice if some Scala experts can take a look at
>>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>>>>> into 3.0.1 if possible.
>>>>> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>>> ambiguity in Scala 2.12 & Java.
>>>>>
>>>>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>>> wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> Sent from my iPhone
>>>>>> Pardon the dumb thumb typos :)
>>>>>>
>>>>>> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>>> wrote:
>>>>>>
>>>>>> +1 on a patch release soon
>>>>>>
>>>>>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 on doing a new patch release soon. I saw some of these issues
>>>>>>> when preparing the 3.0 release, and some of them are very serious.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>>>
>>>>>>>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>>>>>>>> release soon.
>>>>>>>>
>>>>>>>> Shivaram
>>>>>>>>
>>>>>>>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>>>>>> linguin.m.s@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Thanks for the heads-up, Yuanjian!
>>>>>>>>
>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>>>>
>>>>>>>> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>>>>
>>>>>>>> Bests,
>>>>>>>> Takeshi
>>>>>>>>
>>>>>>>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi dev-list,
>>>>>>>>
>>>>>>>> I’m writing this to raise the discussion about Spark 3.0.1
>>>>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>>>>
>>>>>>>> [SPARK-31990] Broken state store compatibility causes a
>>>>>>>> correctness issue when a streaming query with `dropDuplicates` uses a
>>>>>>>> checkpoint written by an older Spark version.
>>>>>>>>
>>>>>>>> [SPARK-32038] A regression in handling NaN values in
>>>>>>>> COUNT(DISTINCT)
>>>>>>>>
>>>>>>>> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest
>>>>>>>> R 4.0. Until this is fixed, the 3.0 release is unavailable on CRAN and
>>>>>>>> only supports R [3.5, 4.0)
>>>>>>>>
>>>>>>>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>>>>> regression
>>>>>>>>
>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>>>> I think it would be great if we have Spark 3.0.1 to deliver the critical
>>>>>>>> fixes.
>>>>>>>>
>>>>>>>> Any comments are appreciated.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Yuanjian
>>>>>>>>
>>>>>>>> --
>>>>>>>> ---
>>>>>>>> Takeshi Yamamuro
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>
>>>>>>

Re: [DISCUSS] Apache Spark 3.0.1 Release

Posted by Prashant Sharma <sc...@gmail.com>.
+1 for 3.0.1 release.
I too can help out as release manager.

On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ru...@foxmail.com> wrote:

> I volunteer to be a release manager of 3.0.1, if nobody is working on this.
>
>
> ------------------ Original Message ------------------
> *From:* "Gengliang Wang"<ge...@databricks.com>;
> *Sent:* Wednesday, June 24, 2020, 4:15 PM
> *To:* "Hyukjin Kwon"<gu...@gmail.com>;
> *Cc:* "Dongjoon Hyun"<do...@gmail.com>;"Jungtaek Lim"<
> kabhwan.opensource@gmail.com>;"Jules Damji"<dm...@comcast.net>;"Holden
> Karau"<ho...@pigscanfly.ca>;"Reynold Xin"<rx...@databricks.com>;"Shivaram
> Venkataraman"<sh...@eecs.berkeley.edu>;"Yuanjian Li"<
> xyliyuanjian@gmail.com>;"Spark dev list"<de...@spark.apache.org>;"Takeshi
> Yamamuro"<li...@gmail.com>;
> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>
> +1, the issues mentioned are really serious.
>
> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> +1.
>>
>> Just as a note,
>> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
>> fixed now, and there's no blocker.
>> - When we build SparkR, we should use the latest R version, at least
>> 4.0.0.
>>
>> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <do...@gmail.com> wrote:
>>
>>> +1
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>>
>>>> +1 on a 3.0.1 soon.
>>>>
>>>> Probably it would be nice if some Scala experts can take a look at
>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
>>>> into 3.0.1 if possible.
>>>> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>> ambiguity in Scala 2.12 & Java.
>>>>
>>>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dm...@comcast.net>
>>>> wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> Sent from my iPhone
>>>>> Pardon the dumb thumb typos :)
>>>>>
>>>>> On Jun 23, 2020, at 11:36 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>> wrote:
>>>>>
>>>>> +1 on a patch release soon
>>>>>
>>>>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>>
>>>>>> +1 on doing a new patch release soon. I saw some of these issues when
>>>>>> preparing the 3.0 release, and some of them are very serious.
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <
>>>>>> shivaram@eecs.berkeley.edu> wrote:
>>>>>>
>>>>>>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release
>>>>>>> soon.
>>>>>>>
>>>>>>> Shivaram
>>>>>>>
>>>>>>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <
>>>>>>> linguin.m.s@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks for the heads-up, Yuanjian!
>>>>>>>
>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>>>
>>>>>>> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>>>
>>>>>>> Bests,
>>>>>>> Takeshi
>>>>>>>
>>>>>>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xy...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi dev-list,
>>>>>>>
>>>>>>> I’m writing this to raise the discussion about Spark 3.0.1
>>>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>>>
>>>>>>> [SPARK-31990] Broken state store compatibility causes a
>>>>>>> correctness issue when a streaming query with `dropDuplicates` uses a
>>>>>>> checkpoint written by an older Spark version.
>>>>>>>
>>>>>>> [SPARK-32038] A regression in handling NaN values in
>>>>>>> COUNT(DISTINCT)
>>>>>>>
>>>>>>> [SPARK-31918][WIP] CRAN requires SparkR to work with the latest
>>>>>>> R 4.0. Until this is fixed, the 3.0 release is unavailable on CRAN and
>>>>>>> only supports R [3.5, 4.0)
>>>>>>>
>>>>>>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>>>> regression
>>>>>>>
>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
>>>>>>> I think it would be great if we have Spark 3.0.1 to deliver the critical
>>>>>>> fixes.
>>>>>>>
>>>>>>> Any comments are appreciated.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Yuanjian
>>>>>>>
>>>>>>> --
>>>>>>> ---
>>>>>>> Takeshi Yamamuro
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>
>>>>>