Posted to reviews@spark.apache.org by "vakarisbk (via GitHub)" <gi...@apache.org> on 2023/10/13 13:39:05 UTC

[PR] [WIP] Add support for java 17 and explicit Python versions from 3.5.0 [spark-docker]

vakarisbk opened a new pull request, #56:
URL: https://github.com/apache/spark-docker/pull/56

   
   ### What changes were proposed in this pull request?
   1. Create Java17 base images alongside Java11 images starting from spark 3.5.0
   2. Add the ability to explicitly define Python versions 
   3. Change ubuntu version to 22.04 for `scala2.12-java17-*`
   
   ### Why are the changes needed?
   
   Spark supports multiple versions of Java and Python, and some community members need specific versions of Java and Python for their use cases. Adding this option would simplify workflows for these users and make Spark more accessible.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926824


##########
testing/testing.sh:
##########
@@ -61,7 +61,8 @@ function remove_network() {
 
 # Find and kill any remaining containers attached to the network
 function cleanup() {
-  local containers
+  local containers 
+

Review Comment:
   fixed



##########
versions.json:
##########
@@ -1,9 +1,38 @@
 {
   "versions": [
+    {
+      "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-ubuntu",
+        "3.5.0-java17-python3",
+        "3.5.0-java17",
+        "python3-java17"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-r-ubuntu",
+        "3.5.0-java-17-r"

Review Comment:
   "3.5.0-java17-r" added





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1766504979

   1.  made `fetch-depth: 0` applicable only for 3.3.0 (saves about 400MB for other builds)
   2. borrowed CI runner cleanup scripts from the main spark repo [free_disk_space_container](https://github.com/apache/spark/blob/master/dev/free_disk_space_container) [free_disk_space](https://github.com/apache/spark/blob/master/dev/free_disk_space) and added an action step to execute them.
   These scripts made the CI runner filesystem go from this:
   ```
   Filesystem      Size  Used Avail Use% Mounted on
   /dev/root        84G   66G   18G  80% /
   tmpfs           3.4G  172K  3.4G   1% /dev/shm
   tmpfs           1.4G  1.2M  1.4G   1% /run
   tmpfs           5.0M     0  5.0M   0% /run/lock
   /dev/sdb15      105M  6.1M   99M   6% /boot/efi
   /dev/sda1        14G  4.1G  9.0G  31% /mnt
   tmpfs           693M   12K  693M   1% /run/user/1001
   ```
   
   to this:
   ```
   Filesystem      Size  Used Avail Use% Mounted on
   /dev/root        84G   29G   55G  35% /
   tmpfs           3.4G  172K  3.4G   1% /dev/shm
   tmpfs           1.4G  1.2M  1.4G   1% /run
   tmpfs           5.0M     0  5.0M   0% /run/lock
   /dev/sdb15      105M  6.1M   99M   6% /boot/efi
   /dev/sda1        14G  4.1G  9.0G  31% /mnt
   tmpfs           693M   12K  693M   1% /run/user/1001
   ```
   
   @Yikun could you trigger the builds one more time?




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926834


##########
versions.json:
##########
@@ -1,9 +1,38 @@
 {
   "versions": [
+    {
+      "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-ubuntu",
+        "3.5.0-java17-python3",
+        "3.5.0-java17",
+        "python3-java17"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-r-ubuntu",
+        "3.5.0-java-17-r"

Review Comment:
   fixed to "3.5.0-java17-r"





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun closed pull request #56: Add support for java 17 from spark 3.5.0
URL: https://github.com/apache/spark-docker/pull/56




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "RoeiA (via GitHub)" <gi...@apache.org>.
RoeiA commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1801848290

   I've been waiting for this one for a while now. Thank you @vakarisbk for this PR; I would love to see it merged!




Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359852744


##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
 
 RUN set -ex; \
     apt-get update; \
-    apt-get install -y python3 python3-pip; \
+    apt install -y software-properties-common; \
+    add-apt-repository ppa:deadsnakes/ppa; \
+    apt install python3.10; \

Review Comment:
   Rolled this back. My initial idea was to propose adding images with multiple python versions to the repo (java17-python3.10, java17-python3.9, etc), but now that I think about it - probably not a lot of community members would benefit from this and it would clutter up the repository quite a bit.
   
   And those people who need to have specific python versions (like me) can just take a base image and install whatever python version they want.





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1361615937


##########
3.3.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -24,9 +24,9 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex && \
     apt-get update && \
     ln -s /lib /lib64 && \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
-    apt install -y python3 python3-pip && \
-    apt install -y r-base r-base-dev && \
+    apt-get install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \

Review Comment:
   This shouldn't be changed; we apply this change only after the 3.4 version.
   
   It would be good if you could revert all the 3.3.0 changes. :)





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1766062543

   > I never hit this before, but if it is a storage limit, you could try removing some temporary files to save space, such as:
   > 
   > 1. https://github.com/apache/spark-docker/blob/master/.github/workflows/main.yml#L249
   > 
   > ```
   > sudo install minikube-linux-amd64 /usr/local/bin/minikube
   > rm minikube-linux-amd64
   > ```
   > 
   > This saves about 80MB.
   > 
   > (later update) I also noticed that this change is also applied in the main repo: https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L1045
   > 
   > 2. (if step 1 is ok, we don't need this step) https://github.com/apache/spark-docker/blob/028efd4637fb2cf791d5bd9ea70b2fca472de4b7/.github/workflows/main.yml#L201
   > 
   > Removing `fetch-depth: 0` also seems to save some space?
   
   Added `rm minikube-linux-amd64` and removed the accidental changes in 3.3.0. Now only 3.5.0 will be built.
   
   Removing `fetch-depth: 0` would help with space, but the default is `fetch-depth: 1`, which fetches only a single commit from the main/master branch. That would make the 3.3.0 build fail, as it needs to cherry-pick commits from history.
   I've tested this out on my own repo: https://github.com/vakarisbk/spark-docker/actions/runs/6545230049/job/17773271828
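   
   The difference between a depth-1 checkout and a full one can be reproduced locally with plain git. The following is only a rough sketch, assuming git is installed; the throwaway repository, the commit messages and the function name `demo_fetch_depth` are made up for illustration and are not part of this repo's workflow.
   
   ```shell
   # Build a throwaway repository with three commits, then compare a
   # depth-1 clone (similar to fetch-depth: 1) with a full clone.
   demo_fetch_depth() {
     workdir="$(mktemp -d)"
     git init -q "$workdir/origin"
     for n in 1 2 3; do
       # Identity is passed inline so no global git config is required.
       git -C "$workdir/origin" -c user.name=demo -c user.email=demo@example.com \
         commit -q --allow-empty -m "commit $n"
     done
     # file:// is needed here: --depth is ignored for plain local paths.
     git clone -q --depth 1 "file://$workdir/origin" "$workdir/shallow"
     git clone -q "file://$workdir/origin" "$workdir/full"
     echo "shallow=$(git -C "$workdir/shallow" rev-list --count HEAD)"
     echo "full=$(git -C "$workdir/full" rev-list --count HEAD)"
     rm -rf "$workdir"
   }
   
   demo_fetch_depth
   ```
   
   The shallow clone only knows about the newest commit, so any step that needs older commits (such as a cherry-pick from history) has nothing to work with.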
   




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1765809116

   I never hit this before, but if it is a storage limit, you could try removing some temporary files to save space, such as:
   
   
   1. https://github.com/apache/spark-docker/blob/master/.github/workflows/main.yml#L249
   ```
   sudo install minikube-linux-amd64 /usr/local/bin/minikube
   rm -f ./minikube-linux-amd64
   ```
   This saves about 80MB.
   
   2. https://github.com/apache/spark-docker/blob/028efd4637fb2cf791d5bd9ea70b2fca472de4b7/.github/workflows/main.yml#L201
   
   Removing `fetch-depth: 0` also seems to save some space?




Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805024385

   @vakarisbk Merged to master.
   
   Thanks all!




Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805467380

   Image published on GHCR @vakarisbk, would you mind doing a post validation:
   
   https://github.com/apache/spark-docker/pkgs/container/spark-docker%2Fspark
   
   Then feel free to open a PR on the official image, like:
   https://github.com/docker-library/official-images/pull/15363
   
   The content can be generated by `tools/manifest.py manifest`.




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1768267473

   All tests have passed except the 3.3.2 build.
   
   The 3.3.2 build fails due to an issue with the GPG key on `keys.openpgp.org` (key on `keyserver.ubuntu.com` works fine)
   ```
   - gpg --keyserver hkps://keys.openpgp.org --recv-key "C56349D886F2B01F8CAE794C653C2301FEA493EE" 
   
   gpg: key 653C2301FEA493EE: no user ID
   gpg: Total number processed: 1 
   
   - gpg --batch --verify spark.tgz.asc spark.tgz
   
   gpg: Signature made Pn Vas 10 22:40:58 2023 EET
   gpg:                using RSA key C56349D886F2B01F8CAE794C653C2301FEA493EE
   gpg:                issuer "viirya@apache.org"
   gpg: Can't check signature: No public key
   ```
   
   But that's probably not relevant to this PR?
   
   Apart from that, the PR is ready for review.
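   
   For the keyserver discrepancy described above, one possible workaround is to try several keyservers in order and only accept one once the key is actually present in the local keyring. This is a sketch under assumptions, not what the Dockerfiles do: the function name `fetch_key` and the `GPG` override variable are hypothetical, and it assumes that the receive step can exit successfully even when the key was skipped for having no user ID, which is what the log above suggests.
   
   ```shell
   # Sketch: try keyservers in order until the key is usable locally.
   # GPG can be overridden (for example with a stub); it defaults to gpg.
   fetch_key() {
     key="$1"; shift
     for server in "$@"; do
       # Receiving alone is not treated as proof; the key must also be listable.
       if "${GPG:-gpg}" --batch --keyserver "$server" --recv-keys "$key" >/dev/null 2>&1 &&
          "${GPG:-gpg}" --batch --list-keys "$key" >/dev/null 2>&1; then
         echo "$server"
         return 0
       fi
     done
     echo "no keyserver provided a usable key for $key" >&2
     return 1
   }
   
   # Usage (needs network access):
   # fetch_key "C56349D886F2B01F8CAE794C653C2301FEA493EE" \
   #   hkps://keys.openpgp.org hkps://keyserver.ubuntu.com
   ```
   
   On success the function prints the keyserver that worked, which makes it easy to see in CI logs where the key came from.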
   




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1803034637

   Thanks for your efforts @vakarisbk , I'm going to merge this PR later today or tomorrow.




Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359853632


##########
.github/workflows/test.yml:
##########
@@ -37,12 +37,15 @@ on:
         - 3.3.0
       java:
         description: 'The Java version of Spark image.'
-        default: 11
+        default: "11"

Review Comment:
   Not really. Value is defined as string and my linter was complaining that it's not a string. GH actions don't really care about this.





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1773944548

   @Yikun Just uploaded to openpgp.




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367866420


##########
versions.json:
##########
@@ -1,9 +1,38 @@
 {
   "versions": [
+    {
+      "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-ubuntu",
+        "3.5.0-java17-python3",
+        "3.5.0-java17",
+        "python3-java17"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-r-ubuntu",
+        "3.5.0-java-17-r"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-ubuntu",
+        "3.5.0-java17-scala"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-python3-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-r-ubuntu"
+      ]
+    },
     {
       "path": "3.5.0/scala2.12-java11-python3-ubuntu",
       "tags": [
-        "3.5.0-scala2.12-java11-python3-ubuntu",
+        "3.5.0-scala2.12-java17-python3-ubuntu",

Review Comment:
   This shouldn't be changed.



##########
testing/testing.sh:
##########
@@ -61,7 +61,8 @@ function remove_network() {
 
 # Find and kill any remaining containers attached to the network
 function cleanup() {
-  local containers
+  local containers 
+

Review Comment:
   unrelated change



##########
versions.json:
##########
@@ -1,9 +1,38 @@
 {
   "versions": [
+    {
+      "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-ubuntu",
+        "3.5.0-java17-python3",
+        "3.5.0-java17",
+        "python3-java17"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-r-ubuntu",
+        "3.5.0-java-17-r"

Review Comment:
   ```suggestion
           "3.5.0-java17-r"
   ```
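   
   A small check along these lines could flag hyphenated `java-NN` tags before they reach review. It is only a sketch and not part of the repo: the function name `find_hyphenated_java_tags` is hypothetical, and it assumes one quoted tag per line, as in the diff above.
   
   ```shell
   # Hypothetical lint: print lines whose quoted tag uses "java-<digits>"
   # instead of the "java<digits>" form used by the other tags.
   find_hyphenated_java_tags() {
     grep -nE '"[^"]*java-[0-9]+[^"]*"' "$1"
   }
   
   # Demo on a throwaway file holding two tags from the diff.
   sample="$(mktemp)"
   printf '%s\n' '"3.5.0-scala2.12-java17-r-ubuntu",' '"3.5.0-java-17-r"' > "$sample"
   find_hyphenated_java_tags "$sample"   # prints: 2:"3.5.0-java-17-r"
   rm -f "$sample"
   ```
   
   Note that grep exits with a non-zero status when nothing matches, so in a CI step a clean file would need that status to be treated as success.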
   





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1369408222


##########
add-dockerfiles.sh:
##########
@@ -26,13 +26,17 @@
 # - Add 3.3.1 dockerfiles:
 #   $ ./add-dockerfiles.sh 3.3.1
 
-VERSION=${1:-"3.3.0"}
+VERSION=${1:-"3.5.0"}
 
 TAGS="
 scala2.12-java11-python3-r-ubuntu
 scala2.12-java11-python3-ubuntu
 scala2.12-java11-r-ubuntu
 scala2.12-java11-ubuntu
+scala2.12-java17-python3-r-ubuntu
+scala2.12-java17-python3-ubuntu
+scala2.12-java17-r-ubuntu
+scala2.12-java17-ubuntu

Review Comment:
   We only add these from version 3.5 onwards, so we should skip the 3.3 / 3.4 versions. It seems we need something like the following:
   
   ```shell
   if ! echo "$VERSION" | grep -Eq "^3\.3|^3\.4"; then
      TAGS+="
      scala2.12-java17-python3-r-ubuntu
      scala2.12-java17-python3-ubuntu
      scala2.12-java17-r-ubuntu
      scala2.12-java17-ubuntu
      "
   fi
   ```
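   
   The same gating can also be sketched as a standalone function, which makes it easy to try against a few versions. The names `supports_java17` and `tags_for_version` are hypothetical, the tag list is shortened for brevity, and the sketch assumes that every release line other than 3.3.x and 3.4.x should get the Java 17 variants.
   
   ```shell
   # Hypothetical helper: succeeds when the version should get Java 17 tags.
   # Assumption: only the 3.3.x and 3.4.x release lines are excluded.
   supports_java17() {
     case "$1" in
       3.3|3.3.*|3.4|3.4.*) return 1 ;;
       *) return 0 ;;
     esac
   }
   
   # Print the (shortened) tag list for a version on a single line.
   tags_for_version() {
     tags="scala2.12-java11-python3-ubuntu scala2.12-java11-ubuntu"
     if supports_java17 "$1"; then
       tags="$tags scala2.12-java17-python3-ubuntu scala2.12-java17-ubuntu"
     fi
     echo "$tags"
   }
   
   tags_for_version 3.4.1
   tags_for_version 3.5.0
   ```
   
   The `case` patterns match the whole version string and treat the dots literally, so a version such as 3.30.0 would not be mistaken for a 3.3 release.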





Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358359041


##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
     if echo $TAG | grep -q "r-"; then
         OPTS+=" --sparkr"
     fi
+    
+    if echo $TAG | grep -q "java17"; then
+        OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
+    fi
+    if echo $TAG | grep -q "java11"; then

Review Comment:
   elif?
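   
   For illustration, a single `if`/`elif` chain makes the two Java branches mutually exclusive. This is a sketch only: the function name `java_opts_for_tag` is hypothetical, the Java 17 options are copied from the diff above, and the Java 11 branch uses a placeholder value because its real options are not shown in this thread.
   
   ```shell
   # Sketch: pick Java-related options for a tag with one if/elif chain.
   java_opts_for_tag() {
     tag="$1"
     if echo "$tag" | grep -q "java17"; then
       # Options taken from the diff under review.
       echo "--java-version 17 --image eclipse-temurin:17-jre-jammy"
     elif echo "$tag" | grep -q "java11"; then
       # Placeholder: the real Java 11 options are not shown in this thread.
       echo "--java-version 11"
     fi
   }
   
   java_opts_for_tag scala2.12-java17-r-ubuntu
   java_opts_for_tag scala2.12-java11-ubuntu
   ```
   
   With `elif`, the second `grep` only runs when the first did not match, and a tag can never pick up both sets of options.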



##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
     if echo $TAG | grep -q "r-"; then
         OPTS+=" --sparkr"
     fi
+    
+    if echo $TAG | grep -q "java17"; then
+        OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"

Review Comment:
   Great!



##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
 
 RUN set -ex; \
     apt-get update; \
-    apt-get install -y python3 python3-pip; \
+    apt install -y software-properties-common; \
+    add-apt-repository ppa:deadsnakes/ppa; \
+    apt install python3.10; \

Review Comment:
   Is there any special reason to use Python 3.10? I prefer to use the OS default python3 version from a maintenance-cost point of view.





Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358361784


##########
tools/template.py:
##########
@@ -59,7 +59,7 @@ def parse_opts():
     parser.add_argument(
         "-j",
         "--java-version",
-        help="The Spark version of Dockerfile.",
+        help="Java version of Dockerfile.",

Review Comment:
   Good catch





Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359852798


##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
     if echo $TAG | grep -q "r-"; then
         OPTS+=" --sparkr"
     fi
+    
+    if echo $TAG | grep -q "java17"; then
+        OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
+    fi
+    if echo $TAG | grep -q "java11"; then

Review Comment:
   fixed





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1361816378


##########
3.3.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -24,9 +24,9 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex && \
     apt-get update && \
     ln -s /lib /lib64 && \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
-    apt install -y python3 python3-pip && \
-    apt install -y r-base r-base-dev && \
+    apt-get install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \

Review Comment:
   fixed





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358358604


##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
     if echo $TAG | grep -q "r-"; then
         OPTS+=" --sparkr"
     fi
+    
+    if echo $TAG | grep -q "java17"; then
+        OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"

Review Comment:
   Great!





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1798141705

   @Yikun maybe we can have this PR merged even without @HyukjinKwon and @zhengruifeng approval? This will not impact the existing images




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1773942940

   @viirya Hi, would you mind taking a look at the 3.3.2 release key issue? It might need your help to upload the public key; see [1] for reference.
   
   [1] https://github.com/apache/spark-docker/pull/55#issuecomment-1715173342




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1765778478

   It seems to me that the builds are failing due to insufficient storage on the runners.
   
   ```
   [info] org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite *** ABORTED *** (1 second, 160 milliseconds)
   [info]   java.lang.AssertionError: assertion failed: Failed to execute -- bash -c MINIKUBE_IN_STYLE=true minikube status  --
   [info] minikube
   [info] type: Control Plane
   [info] host: InsufficientStorage
   [info] kubelet: Running
   [info] apiserver: Running
   [info] kubeconfig: Configured
   [info] docker-env: in-use
   ```
   Maybe it's possible to try switching to larger runners?
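
To confirm this diagnosis on a runner, the status text can be checked for the `InsufficientStorage` host state. A minimal sketch, assuming the status output has the shape shown in the log above; `reports_insufficient_storage` is a hypothetical helper name, and the sample text is copied from that log.

```shell
# Hypothetical helper: succeed if status text on stdin reports a host in
# the InsufficientStorage state. An optional "[info] " prefix is tolerated.
reports_insufficient_storage() {
  grep -Eq '^(\[info\] )?host: InsufficientStorage$'
}

# Sample lines copied from the failing build log.
sample='[info] minikube
[info] type: Control Plane
[info] host: InsufficientStorage
[info] kubelet: Running'

if printf '%s\n' "$sample" | reports_insufficient_storage; then
  echo "host reports InsufficientStorage; inspect disk usage with: df -h /"
fi
```

On a live runner the same function could read the output of `minikube status` instead of the sample text.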




Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358357340


##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
 
 RUN set -ex; \
     apt-get update; \
-    apt-get install -y python3 python3-pip; \
+    apt install -y software-properties-common; \
+    add-apt-repository ppa:deadsnakes/ppa; \
+    apt install python3.10; \

Review Comment:
   Is there any special reason to use Python 3.10? I prefer the OS default python3 version from a maintenance cost point of view; the OS default Python version also has more stable quality and security.





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1776247619

   cc @HyukjinKwon @zhengruifeng Would you mind also taking a look?




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1774127165

   > Please also make sure:
   > 
   > 1. All dockerfiles and entrypoint.sh should be generated by `add-dockerfiles.sh`
   
   All Dockerfiles and entrypoints were generated using `add-dockerfiles.sh`.
   To validate, I regenerated the 3.5.0 directory and diffed it against the committed copy:
   
   ```
   mv 3.5.0 3.5.0_copy; \
   ./add-dockerfiles.sh 3.5.0; \
   diff -r 3.5.0 3.5.0_copy;
   ```
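
The same check can be turned into a pass/fail step by relying on the exit status of `diff -r`, which is zero only when the two trees are identical. A minimal sketch; `trees_match` is a hypothetical helper name, and the directory names are the ones used in the commands above.

```shell
# Hypothetical helper: succeed only if two directory trees have identical
# contents. Differences are reported by exit status only.
trees_match() {
  diff -r "$1" "$2" >/dev/null
}

# Runs only when both directories from the commands above are present.
if [ -d 3.5.0 ] && [ -d 3.5.0_copy ]; then
  if trees_match 3.5.0 3.5.0_copy; then
    echo "generated files match the committed copy"
  else
    echo "generated files differ from the committed copy" >&2
  fi
fi
```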
   
   > 2. It would be better if you could publish these images in your local repo to test (by appending a local change line in your local branch's .github/workflows/publish.yml, L50). It's just a test and shouldn't be changed in this PR.
   
   I've published the images in my [forked repository](https://github.com/vakarisbk/spark-docker/pkgs/container/spark-docker%2Fspark).
   Publish job logs can be found [here](https://github.com/vakarisbk/spark-docker/actions/runs/6604314223/job/17938533429).
   
   Let me know if anything else is needed




Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926881


##########
versions.json:
##########
@@ -1,9 +1,38 @@
 {
   "versions": [
+    {
+      "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-ubuntu",
+        "3.5.0-java17-python3",
+        "3.5.0-java17",
+        "python3-java17"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-r-ubuntu",
+        "3.5.0-java-17-r"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-ubuntu",
+        "3.5.0-java17-scala"
+      ]
+    },
+    {
+      "path": "3.5.0/scala2.12-java17-python3-r-ubuntu",
+      "tags": [
+        "3.5.0-scala2.12-java17-python3-r-ubuntu"
+      ]
+    },
     {
       "path": "3.5.0/scala2.12-java11-python3-ubuntu",
       "tags": [
-        "3.5.0-scala2.12-java11-python3-ubuntu",
+        "3.5.0-scala2.12-java17-python3-ubuntu",

Review Comment:
   fixed
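
The last change in this hunk appears to put a java17 tag under the java11 path, which would make the same tag appear in two entries. A minimal sketch of a check for that kind of slip, assuming a pretty-printed `versions.json` with one tag string per line; `duplicate_tags` is a hypothetical helper name.

```shell
# Hypothetical helper: print every tag string that occurs more than once.
# Reads pretty-printed JSON on stdin and assumes one tag per line; lines
# such as "path": "..." hold more than one quoted string and are ignored.
duplicate_tags() {
  grep -E '^[[:space:]]*"[^"]+",?[[:space:]]*$' \
    | sed -E 's/^[[:space:]]*"([^"]+)",?[[:space:]]*$/\1/' \
    | sort | uniq -d
}

# Reduced sample shaped like the hunk above.
printf '%s\n' \
  '"path": "3.5.0/scala2.12-java17-python3-ubuntu",' \
  '"tags": [' \
  '  "3.5.0-scala2.12-java17-python3-ubuntu",' \
  '  "3.5.0-java17"' \
  ']' \
  '"path": "3.5.0/scala2.12-java11-python3-ubuntu",' \
  '"tags": [' \
  '  "3.5.0-scala2.12-java17-python3-ubuntu"' \
  ']' | duplicate_tags
```

On this sample the function prints the tag that appears under both paths. A JSON-aware tool would be more robust than this line-based filter.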





Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1383342025


##########
add-dockerfiles.sh:
##########
@@ -26,13 +26,17 @@
 # - Add 3.3.1 dockerfiles:
 #   $ ./add-dockerfiles.sh 3.3.1
 
-VERSION=${1:-"3.3.0"}
+VERSION=${1:-"3.5.0"}
 
 TAGS="
 scala2.12-java11-python3-r-ubuntu
 scala2.12-java11-python3-ubuntu
 scala2.12-java11-r-ubuntu
 scala2.12-java11-ubuntu
+scala2.12-java17-python3-r-ubuntu
+scala2.12-java17-python3-ubuntu
+scala2.12-java17-r-ubuntu
+scala2.12-java17-ubuntu

Review Comment:
   fixed





Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805025761

   Test publish on: https://github.com/apache/spark-docker/actions/runs/6820460339




Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358361340


##########
.github/workflows/test.yml:
##########
@@ -37,12 +37,15 @@ on:
         - 3.3.0
       java:
         description: 'The Java version of Spark image.'
-        default: 11
+        default: "11"

Review Comment:
   Is it necessary?


