You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/17 07:45:14 UTC

[GitHub] [spark] Yikun opened a new pull request, #37550: [WIP] Add infra image for lint job

Yikun opened a new pull request, #37550:
URL: https://github.com/apache/spark/pull/37550

   ### What changes were proposed in this pull request?
   
   This PR:
   - Adds `ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}-lint` for lint job, it will be used for master/branches/master shedule job.
   - Adds `ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ github.ref_name }}-lint` to speed up lint image build and make lint image as static image.
   
   
   ### Why are the changes needed?
   
   To aovid the issue like https://github.com/apache/spark/pull/37243#issuecomment-1191422150 , we had some initial discussion, we'd better move infra image into lint image.
   
   ### Does this PR introduce _any_ user-facing change?
   No, dev only
   
   ### How was this patch tested?
   CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947610573


##########
.github/workflows/build_and_test.yml:
##########
@@ -300,16 +300,27 @@ jobs:
         uses: docker/setup-qemu-action@v1
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v1
-      - name: Build and push
+      - name: Build and push (Infra)

Review Comment:
   `Infra` -> `main` .. ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947568387


##########
.github/workflows/build_and_test.yml:
##########
@@ -300,16 +300,27 @@ jobs:
         uses: docker/setup-qemu-action@v1
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v1
-      - name: Build and push
+      - name: Build and push (Infra)
         id: docker_build
         uses: docker/build-push-action@v2
         with:
-          context: ./dev/infra/
+          file: ./dev/infra/Dockerfile
           push: true
           tags: |
             ${{ needs.precondition.outputs.image_url }}
           # Use the infra image cache to speed up
           cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}
+      - name: Build and push (Lint)
+        id: lint_docker_build
+        uses: docker/build-push-action@v2
+        with:
+          file: ./dev/infra/lint.Dockerfile
+          push: true
+          tags: |
+            ${{ needs.precondition.outputs.image_url }}-lint
+          build-args: BASE_IMG=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}-static

Review Comment:
   This might bring some complex and issue when first commit merge after cut down branch because base image haven't generated yet. But all in one infra image doesn't have this problem, becuase only cache not working in first commit after cut down branch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947672641


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   BTW,
   - `Dockerfile.<something>` may also bring some misleading in some case: https://github.com/apache/spark/pull/35772#discussion_r822239491
   - We also use [dockerfile.java17](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17) in Spark
   
   But I think `Dockerfile.lint` is ok in here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947672641


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   BTW,
   - `Dockerfile.<something>` may also bring some misleading in some case: https://github.com/apache/spark/pull/35772#discussion_r822239491
   - We also use [dockerfile.java17](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17) in Spark
   
   But I think `Dockerfile.int` is ok in here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] commented on pull request #37550: [SPARK-40125][INFRA] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #37550:
URL: https://github.com/apache/spark/pull/37550#issuecomment-1327938470

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #37550: [WIP] Add infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37550:
URL: https://github.com/apache/spark/pull/37550#issuecomment-1217610001

   Another choice is that we just put lint deps in to base infra image, like https://github.com/Yikun/spark/pull/152, it's more simple, but introduce some other deps in pyspark/sparkr job. From my view, it's also can accept.
   
   cc @HyukjinKwon @dongjoon-hyun @zhengruifeng WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947672641


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   BTW,
   - `Dockerfile.<something>` may also bring some misleading in some case: https://github.com/apache/spark/pull/35772#discussion_r822239491
   - We also use [dockerfile.java17](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17) in Spark
   
   But I think `Dockerfile.lint` is ok in here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947568387


##########
.github/workflows/build_and_test.yml:
##########
@@ -300,16 +300,27 @@ jobs:
         uses: docker/setup-qemu-action@v1
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v1
-      - name: Build and push
+      - name: Build and push (Infra)
         id: docker_build
         uses: docker/build-push-action@v2
         with:
-          context: ./dev/infra/
+          file: ./dev/infra/Dockerfile
           push: true
           tags: |
             ${{ needs.precondition.outputs.image_url }}
           # Use the infra image cache to speed up
           cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}
+      - name: Build and push (Lint)
+        id: lint_docker_build
+        uses: docker/build-push-action@v2
+        with:
+          file: ./dev/infra/lint.Dockerfile
+          push: true
+          tags: |
+            ${{ needs.precondition.outputs.image_url }}-lint
+          build-args: BASE_IMG=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}-static

Review Comment:
   This might bring some complex and issue when first commit merge after cut down branch because base image haven't generated yet. But all in one infra image https://github.com/Yikun/spark/pull/152 doesn't have this problem, becuase only cache not working in first commit after cut down branch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37550:
URL: https://github.com/apache/spark/pull/37550#issuecomment-1217705194

   I dont know infra image well, but have 2 dumb question:


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947605179


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   Do we have a better naming? `lint.Dockerfile` looks a bit odd to me - Dockerfile looks like an extension.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947667755


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   `Dockerfile.<something>` or `<something>.Dockerfile` are recommand by docker officially:
   https://github.com/docker/docker.github.io/blob/master/_includes/guides/create-dockerfile.md
   
   It could also be: `Dockerfile.lint`
   
   I don't have much preference between `Dockerfile.lint` and `lint.Dockerfile`, all is ok for me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947610990


##########
.github/workflows/build_and_test.yml:
##########
@@ -300,16 +300,27 @@ jobs:
         uses: docker/setup-qemu-action@v1
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v1
-      - name: Build and push
+      - name: Build and push (Infra)
         id: docker_build
         uses: docker/build-push-action@v2
         with:
-          context: ./dev/infra/
+          file: ./dev/infra/Dockerfile
           push: true
           tags: |
             ${{ needs.precondition.outputs.image_url }}
           # Use the infra image cache to speed up
           cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ inputs.branch }}
+      - name: Build and push (Lint)

Review Comment:
   maybe Lint -> lint



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #37550:
URL: https://github.com/apache/spark/pull/37550#issuecomment-1217683686

   Looks pretty good


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947724949


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   > Dockerfile looks like an extension.
   
   There seems to be nothing wrong with Dockerfile as an extension, I just saw intelliJ can identify `lint.Dockerfile` and highlight syntax, but `Dockerfile.lint` not. 😂



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] closed pull request #37550: [SPARK-40125][INFRA] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #37550: [SPARK-40125][INFRA] Add separate infra image for lint job
URL: https://github.com/apache/spark/pull/37550


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37550:
URL: https://github.com/apache/spark/pull/37550#discussion_r947667755


##########
.github/workflows/build_infra_images_cache.yml:
##########
@@ -25,6 +25,7 @@ on:
     - 'master'
     paths:
     - 'dev/infra/Dockerfile'
+    - 'dev/infra/lint.Dockerfile'

Review Comment:
   `Dockerfile.<something>` or `<something>.Dockerfile` are remmomand by docker officially:
   https://github.com/docker/docker.github.io/blob/master/_includes/guides/create-dockerfile.md
   
   It could also be: `Dockerfile.lint`
   
   I don't have much preference between `Dockerfile.lint` and `lint.Dockerfile`, all is ok for me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #37550: [WIP] Add separate infra image for lint job

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37550:
URL: https://github.com/apache/spark/pull/37550#issuecomment-1217755142

   > the step to install those dependencies can be saved ?
   
   Yes, because we are using [cache-from](https://github.com/apache/spark/pull/37550/files#diff-48c0ee97c53013d18d6bbae44648f7fab9af2e0bf5b0dc1ca761e18ec5c478f2R323), so if no text changes and base image is not changed, it will be static.
   
   > if a new version of devtools is released some time, we won't worry about that, since this is a static image, right?
   
   Yes!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org