You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/04/30 10:20:26 UTC

[GitHub] [flink-docker] zentol opened a new pull request #20: [FLINK-17164] Add job-cluster support

zentol opened a new pull request #20:
URL: https://github.com/apache/flink-docker/pull/20


   Adds support for job-clusters.
   
   This PR is also relevant for 1.10, but the commits would be identical.
   
   A job-cluster can be started by passing standalone-job to docker-entrypoint.sh. User artifacts (i.e., the job jar to run) can be passed via the USER_ARTIFACTS environment variable as a list of files/directories, separated with semi-colons (;). This will usually point to some mounted volume.
   The script then copies all files into the usrlib directory of the distribution.
   
   The vast majority of this PR is concerned with testing. Since a job-cluster cannot be started without a job to run I added a maven project containing a trivial streaming job that runs infinitely.
   The jar is built on CI before the tests are run and provided to the jobmanager as described above.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422837552



##########
File path: docker-entrypoint.sh
##########
@@ -56,6 +58,27 @@ copy_plugins_if_required() {
   done
 }
 
+setup_usrlib() {
+  if [ -z "${USER_ARTIFACTS}" ]; then
+    return 0
+  fi
+
+  echo "Setting up usrlib"
+  mkdir -p "${FLINK_HOME}/usrlib"
+  for user_artifact in $(echo "${USER_ARTIFACTS}" | tr ';' ' '); do

Review comment:
       pretty much copied from the plugins equivalent above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] azagrebin commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422275526



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       btw, we already called it differently [here](https://github.com/apache/flink/blob/master/flink-container/docker/docker-entrypoint.sh#L24)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] tillrohrmann commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422829605



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       What about `application` or so? https://ci.apache.org/projects/flink/flink-docs-stable/concepts/glossary.html#flink-application-cluster




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422867843



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       @tillrohrmann I wouldn't use `application`; while the glossary contained the concept of `Application Cluster` for years I don't think I have ever seen it mentioned anywhere (especially not anywhere in flink-dist).
   The current(master) glossary also says that a Flink _application_ can be both submitted to a session or application cluster, so it isn't exactly clear what `application` would do. Maybe this will change when FLIP-85 (Flink Application Mode) is completed, _if_ it fully subsumes per-job clusters, but until then I would stay away from this term.
   Considering the potential future features outlined in the FLIP,
   > Being able to launch one image in Kubernetes and the launched containers will figure out who is the JM and who are the TMs.
   
   the semantics of `application` could change again in the future, and it may be best to reserve this term for that time.
   
   @azagrebin I did not know that. I'd still stick with `standalone-job` though to stay close to flink-dist. I very much want to prevent us from introducing separate terminology into the docker images.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] tillrohrmann commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422894423



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       I see your point @zentol that you want to stick to the shell script names.
   
   I guess my main concern is that I associate standalone with a bare metal deployment and not with Docker. Strictly speaking there is not much of a difference though.
   
   Maybe we should rename the `standalone-job.sh` into something more aligned with the glossary. But I agree that this should/could be a separate effort.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] azagrebin commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422832264



##########
File path: docker-entrypoint.sh
##########
@@ -56,6 +58,27 @@ copy_plugins_if_required() {
   done
 }
 
+setup_usrlib() {
+  if [ -z "${USER_ARTIFACTS}" ]; then

Review comment:
       maybe `JOB_ARTIFACTS` or `USER_JOB_ARTIFACTS`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r420972480



##########
File path: testing/docker-test-job/pom.xml
##########
@@ -0,0 +1,75 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+	<modelVersion>4.0.0</modelVersion>
+
+	<groupId>org.apache.flink</groupId>
+	<artifactId>docker-test-job</artifactId>
+	<version>1.0</version>
+	<packaging>jar</packaging>
+
+	<properties>
+		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+		<flink.version>1.9.0</flink.version>

Review comment:
       it doesn't matter, FWIW it could also be 1.4.0. There is no requirement for this version to be in sync with the Flink version.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422868962



##########
File path: docker-entrypoint.sh
##########
@@ -56,6 +58,27 @@ copy_plugins_if_required() {
   done
 }
 
+setup_usrlib() {
+  if [ -z "${USER_ARTIFACTS}" ]; then

Review comment:
       We could generalize this more and call it `USER_LIB_ARTIFACTS` and use it also in session mode.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] azagrebin commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r422834212



##########
File path: docker-entrypoint.sh
##########
@@ -56,6 +58,27 @@ copy_plugins_if_required() {
   done
 }
 
+setup_usrlib() {
+  if [ -z "${USER_ARTIFACTS}" ]; then
+    return 0
+  fi
+
+  echo "Setting up usrlib"
+  mkdir -p "${FLINK_HOME}/usrlib"
+  for user_artifact in $(echo "${USER_ARTIFACTS}" | tr ';' ' '); do

Review comment:
       any specific reason to use `;` separation? not `,` or `:`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r420971916



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       Is the best term we got at the moment, and I'd prefer to not introduce another one.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] azagrebin commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r420910726



##########
File path: docker-entrypoint.sh
##########
@@ -18,6 +18,8 @@
 # limitations under the License.
 ###############################################################################
 
+COMMAND_STANDALONE="standalone-job"

Review comment:
       Is `standalone-job` already a stable name for the single job mode because of its user-facing bin script?
   I personally find it confusing. What about `single-jobmanager` and add another alias for cluster mode: `cluster-jobmanager`?

##########
File path: testing/run_travis_tests.sh
##########
@@ -11,14 +11,6 @@ fi
 
 BRANCH="$TRAVIS_BRANCH"
 
-if [ -n "$IS_PULL_REQUEST" ]; then
-  changed_files="$(git diff --name-only $BRANCH...HEAD)"

Review comment:
       why does it not work anymore? is `$BRANCH` not `dev-master` now?

##########
File path: docker-entrypoint.sh
##########
@@ -56,6 +58,27 @@ copy_plugins_if_required() {
   done
 }
 
+setup_usrlib() {
+  if [ -z "${USER_ARTIFACTS}" ]; then
+    return 0
+  fi
+
+  echo "Setting up usrlib"
+  for user_artifact in $(echo "${USER_ARTIFACTS}" | tr ';' ' '); do
+    echo "Adding artifact ${user_artifact} to usrlib directory."
+
+    mkdir -p "${FLINK_HOME}/usrlib"

Review comment:
       ```suggestion
     mkdir -p "${FLINK_HOME}/usrlib"
     for user_artifact in $(echo "${USER_ARTIFACTS}" | tr ';' ' '); do
       echo "Adding artifact ${user_artifact} to usrlib directory."
   
   ```
   moving `mkdir -p "${FLINK_HOME}/usrlib"` out of loop

##########
File path: testing/docker-test-job/pom.xml
##########
@@ -0,0 +1,75 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+	<modelVersion>4.0.0</modelVersion>
+
+	<groupId>org.apache.flink</groupId>
+	<artifactId>docker-test-job</artifactId>
+	<version>1.0</version>
+	<packaging>jar</packaging>
+
+	<properties>
+		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+		<flink.version>1.9.0</flink.version>
+		<java.version>1.8</java.version>

Review comment:
       also, there is going to be Java 11

##########
File path: testing/docker-test-job/pom.xml
##########
@@ -0,0 +1,75 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+	<modelVersion>4.0.0</modelVersion>
+
+	<groupId>org.apache.flink</groupId>
+	<artifactId>docker-test-job</artifactId>
+	<version>1.0</version>
+	<packaging>jar</packaging>
+
+	<properties>
+		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+		<flink.version>1.9.0</flink.version>

Review comment:
       is it `1.9` because it does not matter too much?
   maybe one more place speaking for having flink version somewhere in one place
   I suppose it could be rewritten while building it in `build_test_job()`

##########
File path: testing/testing_lib.sh
##########
@@ -178,17 +182,30 @@ function internal_smoke_test_images() {
 
     create_network
     trap cleanup EXIT RETURN
+    build_test_job
 
     local jobmanager_container_id
     local taskmanager_container_id
 
     for dockerfile in $dockerfiles; do
         build_image "$dockerfile"
-        jobmanager_container_id="$(internal_run_jobmanager "$dockerfile" "${docker_run_command_args}")"
+
+        jobmanager_container_id="$(internal_run_jobmanager "$dockerfile" "${docker_run_command_args}" "jobmanager")"
         taskmanager_container_id="$(internal_run_taskmanager "$dockerfile" "${docker_run_command_args}")"
         wait_for_jobmanager "$dockerfile"
         test_image "$dockerfile"
         docker kill "$jobmanager_container_id" "$taskmanager_container_id" > /dev/null
+
+        jobmanager_container_id="$(internal_run_jobmanager \
+            "$dockerfile" \
+            "${docker_run_command_args} --mount type=bind,src=$(pwd)/testing/docker-test-job/target,target=/jars -e USER_ARTIFACTS=/jars" \
+            "standalone-job --job-classname org.apache.flink.StreamingJob")"
+        taskmanager_container_id="$(internal_run_taskmanager \
+            "$dockerfile" \
+            "${docker_run_command_args} --mount type=bind,src=$(pwd)/testing/docker-test-job/target,target=/jars")"
+        wait_for_jobmanager "$dockerfile"
+        test_image "$dockerfile"
+        docker kill "$jobmanager_container_id" "$taskmanager_container_id" > /dev/null

Review comment:
       could it be deduplicated for cluster/job mode?
   ```sh
   docker_run_command_args="${docker_run_command_args} --mount type=bind,src=$(pwd)/testing/docker-test-job/target,target=/jars -e USER_ARTIFACTS=/jars"
   local jm_args="standalone-job --job-classname org.apache.flink.StreamingJob"
   test_jm_tm ${dockerfile} ${docker_run_command_args} ${jm_args}
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r420971244



##########
File path: testing/run_travis_tests.sh
##########
@@ -11,14 +11,6 @@ fi
 
 BRANCH="$TRAVIS_BRANCH"
 
-if [ -n "$IS_PULL_REQUEST" ]; then
-  changed_files="$(git diff --name-only $BRANCH...HEAD)"

Review comment:
       You can't do a diff against an arbitrary branch right after a clone; we'd first need to fetch other branches which I tried but the damn thing still wasn't working.
   
   So I deleted it because this is not something I want to spend time on.
   Note that this also applies to the other dev branches.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] zentol commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
zentol commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r420972950



##########
File path: testing/docker-test-job/pom.xml
##########
@@ -0,0 +1,75 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+	<modelVersion>4.0.0</modelVersion>
+
+	<groupId>org.apache.flink</groupId>
+	<artifactId>docker-test-job</artifactId>
+	<version>1.0</version>
+	<packaging>jar</packaging>
+
+	<properties>
+		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+		<flink.version>1.9.0</flink.version>
+		<java.version>1.8</java.version>

Review comment:
       Yep, we'll handle that when the time comes.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-docker] azagrebin commented on a change in pull request #20: [FLINK-17164] Add job-cluster support

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #20:
URL: https://github.com/apache/flink-docker/pull/20#discussion_r421375300



##########
File path: testing/run_travis_tests.sh
##########
@@ -11,14 +11,6 @@ fi
 
 BRANCH="$TRAVIS_BRANCH"
 
-if [ -n "$IS_PULL_REQUEST" ]; then
-  changed_files="$(git diff --name-only $BRANCH...HEAD)"

Review comment:
       alright, indeed, the value seems to be not so big comparing to dev&maintenance.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org