Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2022/06/20 01:04:51 UTC

[SPARK-39515] Improve scheduled jobs in GitHub Actions

Hi all,

I am trying to rework the GitHub Actions CI at
https://issues.apache.org/jira/browse/SPARK-39515. Any help would be much
appreciated.

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Yikun Jiang <yi...@gmail.com>.
With help from the community, the switch to cache-based jobs has been
completed!

* About the ghcr images:

You might notice that two images are generated in the apache ghcr, plus a
third one in each contributor's own ghcr:

- Image cache: spark/apache-spark-github-action-image-cache
<https://github.com/orgs/apache/packages/container/package/spark%2Fapache-spark-github-action-image-cache>:
This is the build cache generated from each branch's dev/infra/Dockerfile.

- CI image: apache-spark-ci-image
<https://github.com/orgs/apache/packages/container/package/apache-spark-ci-image>:
This is for scheduled jobs. It builds an image just-in-time from the cache,
and then uses it to run the CI jobs.

- Distributed (User) CI image: such as yikun/apache-spark-ci-image
<https://github.com/Yikun/spark/pkgs/container/apache-spark-ci-image>: This
is for PR-triggered jobs. It is also built just-in-time from the cache and is
used to run the CI job(s) in the user's GitHub Actions space.

* About the job:

For Lint/PySpark/SparkR jobs, the "Base image build" step does a just-in-time
build that generates a CI image for each PR, and the jobs then use that image
as their job container image.
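
As a rough illustration of the just-in-time build described above, here is a
minimal Python sketch that assembles the kind of `docker buildx build` command
such a job could run. It is not the actual workflow code: the function name,
the tag values, the exact layout of the cache ref and the dev/infra/ build
context are assumptions made for illustration. Only the image names come from
this thread.

```python
def ci_image_build_command(owner, branch, tag, context="dev/infra/"):
    """Build a `docker buildx build` command that reuses the registry cache.

    owner:   GitHub account that will hold the CI image (hypothetical input).
    branch:  branch whose cache image is used as the cache source.
    tag:     tag for the just-in-time CI image (hypothetical naming).
    context: build context directory, assumed to contain the Dockerfile.
    """
    # Cache image name taken from the thread; using the branch as the tag
    # is an assumption.
    cache_ref = (
        "ghcr.io/apache/spark/apache-spark-github-action-image-cache:" + branch
    )
    # Registry repository names must be lowercase, so normalize the owner.
    image_ref = "ghcr.io/" + owner.lower() + "/apache-spark-ci-image:" + tag
    return [
        "docker", "buildx", "build",
        "--cache-from", "type=registry,ref=" + cache_ref,
        "--tag", image_ref,
        "--push",
        context,
    ]


# Example: a PR-triggered build in a contributor's fork (values are made up).
print(" ".join(ci_image_build_command("Yikun", "master", "pr-example")))
```

For a PR from a fork, the owner would be the contributor's GitHub account, so
the image lands in their own ghcr namespace rather than in apache's.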

* About how to change the infra deps:

Currently, the CI image is effectively static: it only changes when you
change the Dockerfile.

- If you want to change the version of a dependency of the Lint/PySpark/SparkR
jobs, you can change dev/infra/Dockerfile, as in
https://github.com/apache/spark/pull/37175.

- If you want to trigger a full refresh, you can just change
FULL_REFRESH_DATE in the Dockerfile
<https://github.com/apache/spark/blob/35d00df9bba7238ad4f409999617fae4d04ddbfd/dev/infra/Dockerfile#L21>.
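
To make the full-refresh step concrete, here is a minimal Python sketch that
bumps the refresh date in a Dockerfile's text. It assumes the date is declared
on a single line of the form `ENV FULL_REFRESH_DATE <yyyymmdd>`; the helper
name and the sample Dockerfile content below are made up for illustration and
are not copied from the repository. Changing that line matters because Docker
stops reusing cached layers at the first instruction that changed, so
everything after it is rebuilt.

```python
import re

# Assumed form of the declaration: "ENV FULL_REFRESH_DATE 20220706"
# (an "=" separator is also accepted).
_REFRESH_LINE = re.compile(
    r"^(ENV[ \t]+FULL_REFRESH_DATE[ \t=]+)\S+", flags=re.MULTILINE
)


def bump_full_refresh_date(dockerfile_text, new_date):
    """Return the Dockerfile text with FULL_REFRESH_DATE set to new_date.

    Raises ValueError unless exactly one declaration is found, so a silent
    no-op cannot be mistaken for a successful refresh.
    """
    updated, count = _REFRESH_LINE.subn(
        lambda match: match.group(1) + new_date, dockerfile_text
    )
    if count != 1:
        raise ValueError(
            "expected exactly one FULL_REFRESH_DATE line, found %d" % count
        )
    return updated


# Hypothetical Dockerfile content, for illustration only.
sample = (
    "FROM ubuntu:20.04\n"
    "ENV FULL_REFRESH_DATE 20220706\n"
    "RUN apt-get update\n"
)
print(bump_full_refresh_date(sample, "20220801"))
```

In practice this is a one-line manual edit in a PR; the sketch only shows
which line is expected to change.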

FYI, I also updated the doc at
https://docs.google.com/document/d/1_uiId-U1DODYyYZejAZeyz2OAjxcnA-xfwjynDF6vd0
to help you understand.


Through this work, I can really appreciate the effort that previous
maintainers put in! Even a simple version bump of a dependency can lead to a
lot of investigation! Thanks to Hyukjin Kwon, Dongjoon and the whole community
for keeping the infra deps up to date!

Feel free to ping me if you have any other concerns or ideas!

Regards,
Yikun


On Mon, Jun 27, 2022 at 12:05 AM Yikun Jiang <yi...@gmail.com> wrote:

> > There’s one last task to simply caching the Docker image (
> https://issues.apache.org/jira/browse/SPARK-39522).
> I will have to be less active for this week and next week because of the
> Spark Summit. Would appreciate if somebody
> finds some time to take a stab.
>
> I did some investigations on spark container jobs (pyspark/sparkr/lint)
> using cache, and draft a doc to help you guys understand #36980
> <https://github.com/apache/spark/pull/36980>:
>
> https://docs.google.com/document/d/1_uiId-U1DODYyYZejAZeyz2OAjxcnA-xfwjynDF6vd0
>
>
> > About a quick hallway meetup, I will be there after Holden’s talk at
> least to say hello to her :-).
>
> Something topic I was interesting about and related to build CI:
> - K8S integrations <https://github.com/apache/spark/pull/35830> test on
> GA:
> - To help various OS <https://github.com/apache/spark/pull/35142> and
> multi architecture/hardware (x86/arm64, gpu) integration support, what we
> can do to help improving.
> Please feel free to ping me if necessary. It's a little bit pity I
> couldn't have the opportunity to be there, I hope you guys have a fabulous
> meet on summit!
>
> Regards,
> Yikun
>
>
> On Fri, Jun 24, 2022 at 11:15 AM Dongjoon Hyun <do...@gmail.com>
> wrote:
>
>> Yep, I'll be there too. Thank you for the adjustment. See you soon. :)
>>
>> Dongjoon.
>>
>> On Thu, Jun 23, 2022 at 4:59 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Alright, I'll be there after Holden's talk Thursday
>>> https://databricks.com/dataaisummit/session/tools-assisted-apache-spark-version-migrations-21-32
>>> w/ Dongjoon (since he manages OSS Jenkins too).
>>> Let's have a quickie chat :-).
>>>
>>> On Thu, 23 Jun 2022 at 06:16, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>>> Oops, I was confused about the time and distance in the US. I won't
>>>> make it too.
>>>> Let me find another time slot that works for more ppl.
>>>>
>>>> On Thu, 23 Jun 2022 at 00:19, Dongjoon Hyun <do...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you, Hyukjin! :)
>>>>>
>>>>> BTW, unfortunately, it seems that I cannot join that quick meeting.
>>>>> I have another schedule at South Bay around 7PM and need to leave San
>>>>> Francisco at least 5PM.
>>>>>
>>>>> Dongjoon.
>>>>>
>>>>>
>>>>> On Wed, Jun 22, 2022 at 3:39 AM Hyukjin Kwon <gu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> (cc @Yikun Jiang <yi...@gmail.com> @Gengliang Wang
>>>>>> <ge...@databricks.com> @Maxim Gekk
>>>>>> <ma...@databricks.com> @Yang,Jie(INF) <ya...@baidu.com> FYI)
>>>>>>
>>>>>> On Wed, 22 Jun 2022 at 19:34, Hyukjin Kwon <gu...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Couple of updates:
>>>>>>>
>>>>>>>    -
>>>>>>>
>>>>>>>    All builds passed now with all combinations we defined in the
>>>>>>>    GitHub Actions (e.g., branch-3.2, branch-3.3, JDK 11,
>>>>>>>    JDK 17 and Scala 2.13), see
>>>>>>>    https://github.com/apache/spark/actions cc @Tom Graves
>>>>>>>    <tg...@yahoo.com> @Dongjoon Hyun <do...@gmail.com>
>>>>>>>     FYI
>>>>>>>    -
>>>>>>>
>>>>>>>    except one test that is being failed due to OOM. That’s being
>>>>>>>    fixed at https://github.com/apache/spark/pull/36954, see
>>>>>>>    also
>>>>>>>    https://github.com/apache/spark/pull/36787#discussion_r901190636
>>>>>>>    -
>>>>>>>
>>>>>>>    I am now adding PySpark, SparkR jobs to the scheduled builds at
>>>>>>>    https://github.com/apache/spark/pull/36940
>>>>>>>    and see if they pass. We might need a couple of more fixes there.
>>>>>>>    -
>>>>>>>
>>>>>>>    There’s one last task to simply caching the Docker image (
>>>>>>>    https://issues.apache.org/jira/browse/SPARK-39522).
>>>>>>>    I will have to be less active for this week and next week
>>>>>>>    because of the Spark Summit. Would appreciate if somebody
>>>>>>>    finds some time to take a stab.
>>>>>>>
>>>>>>> About a quick hallway meetup, I will be there after Holden’s talk at
>>>>>>> least to say hello to her :-).
>>>>>>> Let’s have a quick chat about our CI. We still have some general
>>>>>>> problems to cope with like the lack of resources in
>>>>>>> GitHub Actions.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 21 Jun 2022 at 11:49, Hyukjin Kwon <gu...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Just chatted offline - both I and Holden have multiple sessions :-).
>>>>>>>> Probably let's meet up for a quick chat after your talk
>>>>>>>> https://databricks.com/dataaisummit/session/what-do-when-your-job-goes-oom-night-flowcharts
>>>>>>>> ?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 20 Jun 2022 at 22:23, Holden Karau <ho...@pigscanfly.ca>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> How about a hallway meet up at Data AI summit to talk about build
>>>>>>>>> CI if folks are
>>>>>>>>> Interested?
>>>>>>>>>
>>>>>>>>> On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon <gu...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Increased the priority to a blocker - I don't think we can
>>>>>>>>>> release with these build failures and poor CI
>>>>>>>>>>
>>>>>>>>>> On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> There are too many test failures here. I pinged in some PRs I
>>>>>>>>>>> could identify from a cursory look but would be great for you guys to take
>>>>>>>>>>> a look if you guys haven't tested your change against other
>>>>>>>>>>> environments like JDK 11, Scala 2.13.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am trying to rework GitHub Actions CI at
>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-39515. Any help
>>>>>>>>>>>> would be very appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>>
>>>>>>>>

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Yikun Jiang <yi...@gmail.com>.
> There’s one last task to simply caching the Docker image (
https://issues.apache.org/jira/browse/SPARK-39522).
I will have to be less active for this week and next week because of the
Spark Summit. Would appreciate if somebody
finds some time to take a stab.

I did some investigation into the Spark container jobs (pyspark/sparkr/lint)
using the cache, and drafted a doc to help you guys understand #36980
<https://github.com/apache/spark/pull/36980>:
https://docs.google.com/document/d/1_uiId-U1DODYyYZejAZeyz2OAjxcnA-xfwjynDF6vd0


> About a quick hallway meetup, I will be there after Holden’s talk at
least to say hello to her :-).

Some topics I am interested in that relate to build CI:
- K8S integration <https://github.com/apache/spark/pull/35830> tests on GA.
- What we can do to improve integration support for various OSes
<https://github.com/apache/spark/pull/35142> and multiple
architectures/hardware (x86/arm64, GPU).
Please feel free to ping me if necessary. It's a pity I couldn't be there.
I hope you guys have a fabulous meetup at the summit!

Regards,
Yikun



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Dongjoon Hyun <do...@gmail.com>.
Yep, I'll be there too. Thank you for the adjustment. See you soon. :)

Dongjoon.


Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
Alright, I'll be there after Holden's talk Thursday
https://databricks.com/dataaisummit/session/tools-assisted-apache-spark-version-migrations-21-32
w/ Dongjoon (since he manages OSS Jenkins too).
Let's have a quick chat :-).


Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
Oops, I was confused about the time and distance in the US. I won't make it
either.
Let me find another time slot that works for more people.


Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Dongjoon Hyun <do...@gmail.com>.
Thank you, Hyukjin! :)

BTW, unfortunately, it seems that I cannot join that quick meeting.
I have another appointment in the South Bay around 7 PM and need to leave San
Francisco by 5 PM at the latest.

Dongjoon.



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
(cc @Yikun Jiang <yi...@gmail.com> @Gengliang Wang
<ge...@databricks.com> @Maxim Gekk <ma...@databricks.com>
@Yang,Jie(INF) <ya...@baidu.com> FYI)

On Wed, 22 Jun 2022 at 19:34, Hyukjin Kwon <gu...@gmail.com> wrote:

> Couple of updates:
>
>    -
>
>    All builds passed now with all combinations we defined in the GitHub
>    Actions (e.g., branch-3.2, branch-3.3, JDK 11,
>    JDK 17 and Scala 2.13), see https://github.com/apache/spark/actions cc @Tom
>    Graves <tg...@yahoo.com> @Dongjoon Hyun <do...@gmail.com>
>     FYI
>    -
>
>    except one test that is being failed due to OOM. That’s being fixed at
>    https://github.com/apache/spark/pull/36954, see
>    also https://github.com/apache/spark/pull/36787#discussion_r901190636
>    -
>
>    I am now adding PySpark, SparkR jobs to the scheduled builds at
>    https://github.com/apache/spark/pull/36940
>    and see if they pass. We might need a couple of more fixes there.
>    -
>
>    There’s one last task to simply caching the Docker image (
>    https://issues.apache.org/jira/browse/SPARK-39522).
>    I will have to be less active for this week and next week because of
>    the Spark Summit. Would appreciate if somebody
>    finds some time to take a stab.
>
> About a quick hallway meetup, I will be there after Holden’s talk at least
> to say hello to her :-).
> Let’s have a quick chat about our CI. We still have some general problems
> to cope with like the lack of resources in
> GitHub Actions.
>
>
>
> On Tue, 21 Jun 2022 at 11:49, Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Just chatted offline - both Holden and I have multiple sessions :-).
>> Perhaps we could meet up for a quick chat after your talk?
>> https://databricks.com/dataaisummit/session/what-do-when-your-job-goes-oom-night-flowcharts
>>
>>
>> On Mon, 20 Jun 2022 at 22:23, Holden Karau <ho...@pigscanfly.ca> wrote:
>>
>>> How about a hallway meetup at the Data + AI Summit to talk about build CI,
>>> if folks are interested?
>>>
>>> On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> Increased the priority to a blocker - I don't think we can release with
>>>> these build failures and poor CI
>>>>
>>>> On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>
>>>>> There are too many test failures here. I left pings on the PRs I could
>>>>> identify from a cursory look, but it would be great if you could take a
>>>>> look yourselves if you haven't tested your changes against other
>>>>> environments such as JDK 11 and Scala 2.13.
>>>>>
>>>>> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am trying to rework GitHub Actions CI at
>>>>>> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be
>>>>>> very appreciated.
>>>>>>
>>>>>>
>>>>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
A couple of updates:

   - All builds now pass with all the combinations we defined in GitHub
     Actions (e.g., branch-3.2, branch-3.3, JDK 11, JDK 17 and Scala 2.13),
     see https://github.com/apache/spark/actions
     cc @Tom Graves <tg...@yahoo.com> @Dongjoon Hyun <do...@gmail.com> FYI
   - The one exception is a test that is failing due to OOM. That’s being
     fixed at https://github.com/apache/spark/pull/36954, see also
     https://github.com/apache/spark/pull/36787#discussion_r901190636
   - I am now adding PySpark and SparkR jobs to the scheduled builds at
     https://github.com/apache/spark/pull/36940 and will see if they pass.
     We might need a couple more fixes there.
   - There’s one last task: simply caching the Docker image
     (https://issues.apache.org/jira/browse/SPARK-39522).
     I will have to be less active this week and next week because of the
     Spark Summit. I would appreciate it if somebody could find some time
     to take a stab at it.

About a quick hallway meetup, I will be there after Holden’s talk at least
to say hello to her :-).
Let’s have a quick chat about our CI. We still have some general problems
to cope with like the lack of resources in
GitHub Actions.



On Tue, 21 Jun 2022 at 11:49, Hyukjin Kwon <gu...@gmail.com> wrote:

> Just chatted offline - both Holden and I have multiple sessions :-).
> Perhaps we could meet up for a quick chat after your talk?
> https://databricks.com/dataaisummit/session/what-do-when-your-job-goes-oom-night-flowcharts
>
>
> On Mon, 20 Jun 2022 at 22:23, Holden Karau <ho...@pigscanfly.ca> wrote:
>
>> How about a hallway meetup at the Data + AI Summit to talk about build CI,
>> if folks are interested?
>>
>> On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Increased the priority to a blocker - I don't think we can release with
>>> these build failures and poor CI
>>>
>>> On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>>> There are too many test failures here. I left pings on the PRs I could
>>>> identify from a cursory look, but it would be great if you could take a
>>>> look yourselves if you haven't tested your changes against other
>>>> environments such as JDK 11 and Scala 2.13.
>>>>
>>>> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am trying to rework GitHub Actions CI at
>>>>> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be
>>>>> very appreciated.
>>>>>
>>>>>
>>>>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
Just chatted offline - both Holden and I have multiple sessions :-).
Perhaps we could meet up for a quick chat after your talk?
https://databricks.com/dataaisummit/session/what-do-when-your-job-goes-oom-night-flowcharts


On Mon, 20 Jun 2022 at 22:23, Holden Karau <ho...@pigscanfly.ca> wrote:

> How about a hallway meetup at the Data + AI Summit to talk about build CI,
> if folks are interested?
>
> On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Increased the priority to a blocker - I don't think we can release with
>> these build failures and poor CI
>>
>> On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> There are too many test failures here. I left pings on the PRs I could
>>> identify from a cursory look, but it would be great if you could take a
>>> look yourselves if you haven't tested your changes against other
>>> environments such as JDK 11 and Scala 2.13.
>>>
>>> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am trying to rework GitHub Actions CI at
>>>> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be
>>>> very appreciated.
>>>>
>>>>
>>>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Holden Karau <ho...@pigscanfly.ca>.
How about a hallway meetup at the Data + AI Summit to talk about build CI,
if folks are interested?

On Sun, Jun 19, 2022 at 7:50 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> Increased the priority to a blocker - I don't think we can release with
> these build failures and poor CI
>
> On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> There are too many test failures here. I left pings on the PRs I could
>> identify from a cursory look, but it would be great if you could take a
>> look yourselves if you haven't tested your changes against other
>> environments such as JDK 11 and Scala 2.13.
>>
>> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am trying to rework GitHub Actions CI at
>>> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be
>>> very appreciated.
>>>
>>>
>>> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
Increased the priority to a blocker - I don't think we can release with
these build failures and poor CI

On Mon, 20 Jun 2022 at 10:39, Hyukjin Kwon <gu...@gmail.com> wrote:

> There are too many test failures here. I left pings on the PRs I could
> identify from a cursory look, but it would be great if you could take a
> look yourselves if you haven't tested your changes against other
> environments such as JDK 11 and Scala 2.13.
>
> On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am trying to rework GitHub Actions CI at
>> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be
>> very appreciated.
>>
>>
>>

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

Posted by Hyukjin Kwon <gu...@gmail.com>.
There are too many test failures here. I left pings on the PRs I could
identify from a cursory look, but it would be great if you could take a
look yourselves if you haven't tested your changes against other
environments such as JDK 11 and Scala 2.13.

On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> I am trying to rework GitHub Actions CI at
> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be very
> appreciated.
>
>
>