Posted to reviews@spark.apache.org by "vakarisbk (via GitHub)" <gi...@apache.org> on 2023/10/13 13:39:05 UTC
[PR] [WIP] Add support for java 17 and explicit Python versions from 3.5.0 [spark-docker]
vakarisbk opened a new pull request, #56:
URL: https://github.com/apache/spark-docker/pull/56
### What changes were proposed in this pull request?
1. Create Java 17 base images alongside the Java 11 images, starting from Spark 3.5.0
2. Add the ability to explicitly define Python versions
3. Change the Ubuntu version to 22.04 for `scala2.12-java17-*`
### Why are the changes needed?
Spark supports multiple versions of Java and Python, and some community members need specific versions of each for their use cases. Adding this option would simplify workflows for these users and make Spark more accessible.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926824
##########
testing/testing.sh:
##########
@@ -61,7 +61,8 @@ function remove_network() {
# Find and kill any remaining containers attached to the network
function cleanup() {
- local containers
+ local containers
+
Review Comment:
fixed
##########
versions.json:
##########
@@ -1,9 +1,38 @@
{
"versions": [
+ {
+ "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-ubuntu",
+ "3.5.0-java17-python3",
+ "3.5.0-java17",
+ "python3-java17"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-r-ubuntu",
+ "3.5.0-java-17-r"
Review Comment:
"3.5.0-java17-r" added
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1766504979
1. made `fetch-depth: 0` applicable only for 3.3.0 (saves about 400MB for other builds)
2. borrowed CI runner cleanup scripts from the main spark repo [free_disk_space_container](https://github.com/apache/spark/blob/master/dev/free_disk_space_container) [free_disk_space](https://github.com/apache/spark/blob/master/dev/free_disk_space) and added an action step to execute them.
These scripts made the CI runner filesystem go from this:
```
Filesystem Size Used Avail Use% Mounted on
/dev/root 84G 66G 18G 80% /
tmpfs 3.4G 172K 3.4G 1% /dev/shm
tmpfs 1.4G 1.2M 1.4G 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sdb15 105M 6.1M 99M 6% /boot/efi
/dev/sda1 14G 4.1G 9.0G 31% /mnt
tmpfs 693M 12K 693M 1% /run/user/1001
```
to this:
```
Filesystem Size Used Avail Use% Mounted on
/dev/root 84G 29G 55G 35% /
tmpfs 3.4G 172K 3.4G 1% /dev/shm
tmpfs 1.4G 1.2M 1.4G 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sdb15 105M 6.1M 99M 6% /boot/efi
/dev/sda1 14G 4.1G 9.0G 31% /mnt
tmpfs 693M 12K 693M 1% /run/user/1001
```
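To put a number on the difference, the `Used` figure for `/` can be pulled out of each listing and subtracted. The sketch below is illustrative only: the helper name `used_gb_on_root` is made up here, and it assumes `df -h` style output with the used size given in whole gigabytes, as in the two listings above.

```shell
# Hypothetical helper (not part of the PR): read `df -h` style output on
# stdin and print the "Used" figure, in GB, for the filesystem mounted on /.
# Assumes the Used column is a whole number followed by "G".
used_gb_on_root() {
  awk '$NF == "/" { sub(/G$/, "", $3); print $3 }'
}

# The two /dev/root rows from the listings above.
before=$(printf '%s\n' '/dev/root 84G 66G 18G 80% /' | used_gb_on_root)
after=$(printf '%s\n' '/dev/root 84G 29G 55G 35% /' | used_gb_on_root)
echo "freed: $((before - after))G"
```

With the figures above this comes to 66G - 29G = 37G reclaimed on the root filesystem.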
@Yikun could you trigger the builds one more time?
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926834
##########
versions.json:
##########
@@ -1,9 +1,38 @@
{
"versions": [
+ {
+ "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-ubuntu",
+ "3.5.0-java17-python3",
+ "3.5.0-java17",
+ "python3-java17"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-r-ubuntu",
+ "3.5.0-java-17-r"
Review Comment:
fixed to "3.5.0-java17-r"
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun closed pull request #56: Add support for java 17 from spark 3.5.0
URL: https://github.com/apache/spark-docker/pull/56
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "RoeiA (via GitHub)" <gi...@apache.org>.
RoeiA commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1801848290
Waiting for this one for a while now, thank you @vakarisbk for this PR, would love to see it merged!
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359852744
##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
RUN set -ex; \
apt-get update; \
- apt-get install -y python3 python3-pip; \
+ apt install -y software-properties-common; \
+ add-apt-repository ppa:deadsnakes/ppa; \
+ apt install python3.10; \
Review Comment:
Rolled this back. My initial idea was to propose adding images with multiple Python versions to the repo (java17-python3.10, java17-python3.9, etc.), but on reflection few community members would probably benefit from this, and it would clutter up the repository quite a bit.
Those who need a specific Python version (like me) can just take a base image and install whichever Python version they want.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1361615937
##########
3.3.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -24,9 +24,9 @@ RUN groupadd --system --gid=${spark_uid} spark && \
RUN set -ex && \
apt-get update && \
ln -s /lib /lib64 && \
- apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
- apt install -y python3 python3-pip && \
- apt install -y r-base r-base-dev && \
+ apt-get install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
Review Comment:
This should not be changed; we apply this change only from version 3.4 onward.
It would be good if you could revert all the 3.3.0 changes. :)
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1766062543
> I have never hit this before, but if it is a storage limit, you could try removing some temporary files to save space, such as:
>
> 1. https://github.com/apache/spark-docker/blob/master/.github/workflows/main.yml#L249
>
> ```
> sudo install minikube-linux-amd64 /usr/local/bin/minikube
> rm minikube-linux-amd64
> ```
>
> This saves about 80MB.
>
> (Later update) I also noticed that this change has been applied in the main repo as well: https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L1045
>
> 2. (if step 1 is ok, we don't need this step) https://github.com/apache/spark-docker/blob/028efd4637fb2cf791d5bd9ea70b2fca472de4b7/.github/workflows/main.yml#L201
>
> Removing `fetch-depth: 0` also seems to save some space?
Added `rm minikube-linux-amd64` and removed the accidental changes in 3.3.0. Now only 3.5.0 will be built.
Removing `fetch-depth: 0` would help with space, but the default is `fetch-depth: 1`, which fetches only a single commit from the main/master branch. That would make the 3.3.0 build fail, as it needs to cherry-pick commits from history.
I've tested this out on my own repo: https://github.com/vakarisbk/spark-docker/actions/runs/6545230049/job/17773271828
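One way to reconcile the two constraints is to keep the full clone only for the build that cherry-picks from history and to use the shallow default everywhere else. A minimal sketch, assuming 3.3.0 is the only such version, as described above; the function name `fetch_depth_for` is invented for illustration and is not part of the workflow:

```shell
# Hypothetical helper: print the checkout depth to request for a Spark version.
# 0 means full history; 1 is the shallow default mentioned above.
fetch_depth_for() {
  case "$1" in
    3.3.0) echo 0 ;;  # needs history to cherry-pick commits
    *)     echo 1 ;;  # a single commit is enough
  esac
}

for v in 3.3.0 3.4.0 3.5.0; do
  echo "$v -> fetch-depth: $(fetch_depth_for "$v")"
done
```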
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1765809116
I have never hit this before, but if it is a storage limit, you could try removing some temporary files to save space, such as:
1. https://github.com/apache/spark-docker/blob/master/.github/workflows/main.yml#L249
```
sudo install minikube-linux-amd64 /usr/local/bin/minikube
rm -f ./minikube-linux-amd64
```
This saves about 80MB.
2. https://github.com/apache/spark-docker/blob/028efd4637fb2cf791d5bd9ea70b2fca472de4b7/.github/workflows/main.yml#L201
Removing `fetch-depth: 0` also seems to save some space?
Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805024385
@vakarisbk Merged to master.
Thanks all!
Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805467380
The image has been published on GHCR. @vakarisbk, would you mind doing a post-publish validation:
https://github.com/apache/spark-docker/pkgs/container/spark-docker%2Fspark
Then feel free to open a PR against the official images repo, like:
https://github.com/docker-library/official-images/pull/15363
The content can be generated by `tools/manifest.py manifest`.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1768267473
All tests have passed except the 3.3.2 build.
The 3.3.2 build fails due to an issue with the GPG key on `keys.openpgp.org` (key on `keyserver.ubuntu.com` works fine)
```
- gpg --keyserver hkps://keys.openpgp.org --recv-key "C56349D886F2B01F8CAE794C653C2301FEA493EE"
gpg: key 653C2301FEA493EE: no user ID
gpg: Total number processed: 1
- gpg --batch --verify spark.tgz.asc spark.tgz
gpg: Signature made Pn Vas 10 22:40:58 2023 EET
gpg: using RSA key C56349D886F2B01F8CAE794C653C2301FEA493EE
gpg: issuer "viirya@apache.org"
gpg: Can't check signature: No public key
```
But that's probably not relevant to this PR?
Apart from that, the PR is ready for review.
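The log above suggests that the receive step did not fail outright even though no usable key arrived, so a fallback that relies only on the exit status of `--recv-key` might never trigger. The sketch below is one possible workaround, not what the Dockerfiles in this repository do: it tries keyservers in order and accepts a server only once `gpg --list-keys` can find the key. The function name `recv_key_with_fallback` and the `GPG` override variable are assumptions made for this example.

```shell
# Sketch only: GPG defaults to the real binary and can be overridden for a dry run.
GPG="${GPG:-gpg}"

# recv_key_with_fallback KEY SERVER...
# Prints the first server after which KEY is present in the local keyring.
recv_key_with_fallback() {
  local key="$1" server
  shift
  for server in "$@"; do
    # A receive can report success without importing anything usable,
    # so confirm the key is really present before accepting the server.
    if "$GPG" --batch --keyserver "$server" --recv-keys "$key" &&
       "$GPG" --batch --list-keys "$key" >/dev/null 2>&1; then
      echo "$server"
      return 0
    fi
  done
  echo "no keyserver provided $key" >&2
  return 1
}

# Example call (needs network access, so it is left commented out):
# recv_key_with_fallback C56349D886F2B01F8CAE794C653C2301FEA493EE \
#   hkps://keys.openpgp.org hkps://keyserver.ubuntu.com
```

Listing `keys.openpgp.org` first and `keyserver.ubuntu.com` second matches the observation above that the key was usable from the latter.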
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1803034637
Thanks for your efforts, @vakarisbk. I'm going to merge this PR later today or tomorrow.
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359853632
##########
.github/workflows/test.yml:
##########
@@ -37,12 +37,15 @@ on:
- 3.3.0
java:
description: 'The Java version of Spark image.'
- default: 11
+ default: "11"
Review Comment:
Not really. Value is defined as string and my linter was complaining that it's not a string. GH actions don't really care about this.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1773944548
@Yikun Just uploaded to openpgp.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367866420
##########
versions.json:
##########
@@ -1,9 +1,38 @@
{
"versions": [
+ {
+ "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-ubuntu",
+ "3.5.0-java17-python3",
+ "3.5.0-java17",
+ "python3-java17"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-r-ubuntu",
+ "3.5.0-java-17-r"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-ubuntu",
+ "3.5.0-java17-scala"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-python3-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-r-ubuntu"
+ ]
+ },
{
"path": "3.5.0/scala2.12-java11-python3-ubuntu",
"tags": [
- "3.5.0-scala2.12-java11-python3-ubuntu",
+ "3.5.0-scala2.12-java17-python3-ubuntu",
Review Comment:
This shouldn't be changed.
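In every entry shown in this diff, the first tag is the `path` with the `/` replaced by `-`, which is why rewriting the java11 entry's tag to java17 stands out. A small check along those lines, as a sketch: the function names are invented here, and the rule is inferred from the entries above rather than taken from the repository's tooling.

```shell
# Hypothetical helpers inferred from the versions.json entries quoted above.

# Print the full tag expected for a path such as
# 3.5.0/scala2.12-java11-python3-ubuntu.
expected_full_tag() {
  printf '%s\n' "$1" | tr '/' '-'
}

# Succeed only when the second argument is the full tag implied by the first.
tag_matches_path() {
  [ "$(expected_full_tag "$1")" = "$2" ]
}

image_path="3.5.0/scala2.12-java11-python3-ubuntu"
for tag in 3.5.0-scala2.12-java11-python3-ubuntu 3.5.0-scala2.12-java17-python3-ubuntu; do
  if tag_matches_path "$image_path" "$tag"; then
    echo "ok:       $tag"
  else
    echo "mismatch: $tag"
  fi
done
```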
##########
testing/testing.sh:
##########
@@ -61,7 +61,8 @@ function remove_network() {
# Find and kill any remaining containers attached to the network
function cleanup() {
- local containers
+ local containers
+
Review Comment:
unrelated change
##########
versions.json:
##########
@@ -1,9 +1,38 @@
{
"versions": [
+ {
+ "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-ubuntu",
+ "3.5.0-java17-python3",
+ "3.5.0-java17",
+ "python3-java17"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-r-ubuntu",
+ "3.5.0-java-17-r"
Review Comment:
```suggestion
"3.5.0-java17-r"
```
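The stray hyphen is easy to miss by eye. A check like the following could flag it mechanically; this is an illustrative sketch, and the rule it encodes (the Java version follows `java` directly, as in the other tags above) is an assumption drawn from this diff, not a documented convention. The name `java_tag_ok` is made up for the example.

```shell
# Hypothetical lint: fail for tags that separate "java" from its version
# number with a hyphen (for example java-17 instead of java17).
java_tag_ok() {
  case "$1" in
    *java-[0-9]*) return 1 ;;
    *)            return 0 ;;
  esac
}

for tag in 3.5.0-java17-r 3.5.0-java-17-r; do
  if java_tag_ok "$tag"; then
    echo "ok:  $tag"
  else
    echo "bad: $tag"
  fi
done
```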
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1369408222
##########
add-dockerfiles.sh:
##########
@@ -26,13 +26,17 @@
# - Add 3.3.1 dockerfiles:
# $ ./add-dockerfiles.sh 3.3.1
-VERSION=${1:-"3.3.0"}
+VERSION=${1:-"3.5.0"}
TAGS="
scala2.12-java11-python3-r-ubuntu
scala2.12-java11-python3-ubuntu
scala2.12-java11-r-ubuntu
scala2.12-java11-ubuntu
+scala2.12-java17-python3-r-ubuntu
+scala2.12-java17-python3-ubuntu
+scala2.12-java17-r-ubuntu
+scala2.12-java17-ubuntu
Review Comment:
Because we only add these from version 3.5 onward, we should skip the 3.3 and 3.4 versions. It seems we need something like the following:
```shell
if ! echo "$VERSION" | grep -Eq '^3\.3|^3\.4'; then
TAGS+="
scala2.12-java17-python3-r-ubuntu
scala2.12-java17-python3-ubuntu
scala2.12-java17-r-ubuntu
scala2.12-java17-ubuntu
"
fi
```
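The same gate can be wrapped in a function so that it is easy to try against several versions. This is a sketch rather than the code that was merged: `tags_for_version` is a made-up name, and the pattern escapes the dots and requires a trailing `.` so that it matches only the 3.3.x and 3.4.x release lines.

```shell
# Hypothetical wrapper around the check suggested above.
# Prints one tag per line: the java11 tags always, plus the java17 tags
# unless the version belongs to the 3.3.x or 3.4.x release lines.
tags_for_version() {
  local version="$1"
  printf '%s\n' \
    scala2.12-java11-python3-r-ubuntu \
    scala2.12-java11-python3-ubuntu \
    scala2.12-java11-r-ubuntu \
    scala2.12-java11-ubuntu
  if ! printf '%s\n' "$version" | grep -Eq '^3\.(3|4)\.'; then
    printf '%s\n' \
      scala2.12-java17-python3-r-ubuntu \
      scala2.12-java17-python3-ubuntu \
      scala2.12-java17-r-ubuntu \
      scala2.12-java17-ubuntu
  fi
}

tags_for_version 3.4.1 | wc -l   # java11 tags only
tags_for_version 3.5.0 | wc -l   # java11 and java17 tags
```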
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358359041
##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
if echo $TAG | grep -q "r-"; then
OPTS+=" --sparkr"
fi
+
+ if echo $TAG | grep -q "java17"; then
+ OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
+ fi
+ if echo $TAG | grep -q "java11"; then
Review Comment:
elif?
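Because a tag names exactly one Java version, the two checks are mutually exclusive, and a single `if`/`elif` chain expresses that directly. A sketch of the idea follows. The Java 17 options are the ones in the diff above; the body of the Java 11 branch is not visible in this thread, so the options shown for it are a placeholder assumption, and `java_opts_for_tag` is an invented name.

```shell
# Hypothetical helper: print the template options implied by a tag's Java version.
java_opts_for_tag() {
  local tag="$1"
  if printf '%s\n' "$tag" | grep -q "java17"; then
    # Taken from the diff under review.
    echo "--java-version 17 --image eclipse-temurin:17-jre-jammy"
  elif printf '%s\n' "$tag" | grep -q "java11"; then
    # Placeholder: the real Java 11 options are not shown in this thread.
    echo "--java-version 11"
  fi
}

java_opts_for_tag scala2.12-java17-python3-ubuntu
java_opts_for_tag scala2.12-java11-r-ubuntu
```

With `elif`, the second `grep` runs only when the first check did not match, which also documents that at most one branch is expected to apply.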
##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
if echo $TAG | grep -q "r-"; then
OPTS+=" --sparkr"
fi
+
+ if echo $TAG | grep -q "java17"; then
+ OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
Review Comment:
Great!
##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
RUN set -ex; \
apt-get update; \
- apt-get install -y python3 python3-pip; \
+ apt install -y software-properties-common; \
+ add-apt-repository ppa:deadsnakes/ppa; \
+ apt install python3.10; \
Review Comment:
Is there any special reason why we use Python 3.10? I prefer to use the OS default python3 version from a maintenance cost point of view.
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358361784
##########
tools/template.py:
##########
@@ -59,7 +59,7 @@ def parse_opts():
parser.add_argument(
"-j",
"--java-version",
- help="The Spark version of Dockerfile.",
+ help="Java version of Dockerfile.",
Review Comment:
Good catch
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1359852798
##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
if echo $TAG | grep -q "r-"; then
OPTS+=" --sparkr"
fi
+
+ if echo $TAG | grep -q "java17"; then
+ OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
+ fi
+ if echo $TAG | grep -q "java11"; then
Review Comment:
fixed
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1361816378
##########
3.3.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -24,9 +24,9 @@ RUN groupadd --system --gid=${spark_uid} spark && \
RUN set -ex && \
apt-get update && \
ln -s /lib /lib64 && \
- apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
- apt install -y python3 python3-pip && \
- apt install -y r-base r-base-dev && \
+ apt-get install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu && \
Review Comment:
fixed
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358358604
##########
add-dockerfiles.sh:
##########
@@ -44,12 +48,20 @@ for TAG in $TAGS; do
if echo $TAG | grep -q "r-"; then
OPTS+=" --sparkr"
fi
+
+ if echo $TAG | grep -q "java17"; then
+ OPTS+=" --java-version 17 --image eclipse-temurin:17-jre-jammy"
Review Comment:
Great!
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1798141705
@Yikun maybe we can have this PR merged even without @HyukjinKwon and @zhengruifeng approval? This will not impact the existing images
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1773942940
@viirya Hi, would you mind taking a look at the 3.3.2 release key issue? It might need your help to upload the public key; see [1] for reference.
[1] https://github.com/apache/spark-docker/pull/55#issuecomment-1715173342
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1765778478
It seems to me that the builds are failing due to insufficient storage on the runners.
```
[info] org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite *** ABORTED *** (1 second, 160 milliseconds)
[info] java.lang.AssertionError: assertion failed: Failed to execute -- bash -c MINIKUBE_IN_STYLE=true minikube status --
[info] minikube
[info] type: Control Plane
[info] host: InsufficientStorage
[info] kubelet: Running
[info] apiserver: Running
[info] kubeconfig: Configured
[info] docker-env: in-use
```
Maybe it's possible to try switching to larger runners?
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358357340
##########
3.5.0/scala2.12-java11-python3-r-ubuntu/Dockerfile:
##########
@@ -20,7 +20,10 @@ USER root
RUN set -ex; \
apt-get update; \
- apt-get install -y python3 python3-pip; \
+ apt install -y software-properties-common; \
+ add-apt-repository ppa:deadsnakes/ppa; \
+ apt install python3.10; \
Review Comment:
Is there any special reason why we use Python 3.10? I prefer to use the OS default python3 version from a maintenance-cost point of view; the OS default Python version also has more stable quality and security.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1776247619
cc @HyukjinKwon @zhengruifeng Would you mind also taking a look?
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1774127165
> Please also make sure:
>
> 1. All dockerfiles and entrypoint.sh should be generated by `add-dockerfiles.sh`
All Dockerfiles and entrypoints were generated using `add-dockerfiles.sh`.
To validate, I regenerated them and ran this diff:
```
mv 3.5.0 3.5.0_copy; \
./add-dockerfiles.sh 3.5.0; \
diff -r 3.5.0 3.5.0_copy;
```
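For anyone repeating this check, the same steps could be wrapped so that the committed directory is always restored afterwards. This is a minimal sketch, assuming it runs from the repository root and that the generator command recreates the version directory; the function name `verify_generated` is hypothetical.

```shell
# Hypothetical wrapper around the manual check above.
# Usage: verify_generated VERSION GENERATOR_COMMAND...
verify_generated() {
  local version="$1" backup="${1}_copy" status=0
  shift
  # Move the committed tree aside so the generator starts from scratch.
  mv "$version" "$backup" || return 1
  # Regenerate, then compare the fresh output against the committed tree.
  { "$@" "$version" && diff -r "$version" "$backup"; } || status=$?
  # Restore the committed tree regardless of the outcome.
  rm -rf "$version"
  mv "$backup" "$version"
  return "$status"
}
```

With this in place, `verify_generated 3.5.0 ./add-dockerfiles.sh` would return non-zero whenever the regenerated files differ from the committed ones.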
> 2. It would be better if you can publish these images in your local repo to test (by appending a local change line in your local branch .github/workflows/publish.yml L50). It's just a test and shouldn't be changed in this PR.
I've published the images in my [forked repository](https://github.com/vakarisbk/spark-docker/pkgs/container/spark-docker%2Fspark).
Publish job logs can be found [here](https://github.com/vakarisbk/spark-docker/actions/runs/6604314223/job/17938533429).
Let me know if anything else is needed.
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1367926881
##########
versions.json:
##########
@@ -1,9 +1,38 @@
{
"versions": [
+ {
+ "path": "3.5.0/scala2.12-java17-python3-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-ubuntu",
+ "3.5.0-java17-python3",
+ "3.5.0-java17",
+ "python3-java17"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-r-ubuntu",
+ "3.5.0-java-17-r"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-ubuntu",
+ "3.5.0-java17-scala"
+ ]
+ },
+ {
+ "path": "3.5.0/scala2.12-java17-python3-r-ubuntu",
+ "tags": [
+ "3.5.0-scala2.12-java17-python3-r-ubuntu"
+ ]
+ },
{
"path": "3.5.0/scala2.12-java11-python3-ubuntu",
"tags": [
- "3.5.0-scala2.12-java11-python3-ubuntu",
+ "3.5.0-scala2.12-java17-python3-ubuntu",
Review Comment:
fixed
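A mismatch like the one in the hunk above (a `java11` path carrying a `java17` tag) could in principle be caught mechanically. The sketch below assumes the convention visible in this diff, namely that each entry carries a full tag equal to its path with `/` replaced by `-`, and that `versions.json` keeps the pretty-printed one-key-per-line layout; the function name `check_version_tags` is hypothetical.

```shell
# Hypothetical lint for versions.json: every entry must list the tag
# derived from its own path.
check_version_tags() {
  awk '
    # Report the previous entry if its derived tag never appeared.
    function report() {
      if (path != "" && !found) {
        print "missing tag " want " for " path
        bad = 1
      }
    }
    /"path":/ {
      report()
      split($0, parts, "\"")   # parts[4] holds the path value
      path = parts[4]
      want = path
      sub("/", "-", want)      # 3.5.0/scala2.12-... becomes 3.5.0-scala2.12-...
      found = 0
      next
    }
    path != "" && index($0, "\"" want "\"") { found = 1 }
    END { report(); exit bad }
  ' "$1"
}
```

Running `check_version_tags versions.json` exits non-zero and names the offending path when an entry lacks its derived tag.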
Re: [PR] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "vakarisbk (via GitHub)" <gi...@apache.org>.
vakarisbk commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1383342025
##########
add-dockerfiles.sh:
##########
@@ -26,13 +26,17 @@
# - Add 3.3.1 dockerfiles:
# $ ./add-dockerfiles.sh 3.3.1
-VERSION=${1:-"3.3.0"}
+VERSION=${1:-"3.5.0"}
TAGS="
scala2.12-java11-python3-r-ubuntu
scala2.12-java11-python3-ubuntu
scala2.12-java11-r-ubuntu
scala2.12-java11-ubuntu
+scala2.12-java17-python3-r-ubuntu
+scala2.12-java17-python3-ubuntu
+scala2.12-java17-r-ubuntu
+scala2.12-java17-ubuntu
Review Comment:
fixed
Re: [PR] [SPARK-43305] Add support for java 17 from spark 3.5.0 [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #56:
URL: https://github.com/apache/spark-docker/pull/56#issuecomment-1805025761
Test publish on: https://github.com/apache/spark-docker/actions/runs/6820460339
Re: [PR] [WIP] Add support for java 17 and explicit Python versions from spark 3.5.0 onwards [spark-docker]
Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on code in PR #56:
URL: https://github.com/apache/spark-docker/pull/56#discussion_r1358361340
##########
.github/workflows/test.yml:
##########
@@ -37,12 +37,15 @@ on:
- 3.3.0
java:
description: 'The Java version of Spark image.'
- default: 11
+ default: "11"
Review Comment:
Is it necessary?