Posted to commits@flink.apache.org by al...@apache.org on 2019/08/06 15:12:50 UTC
[flink] branch release-1.8 updated (a76b9e9 -> 954f3c0)
This is an automated email from the ASF dual-hosted git repository.
aljoscha pushed a change to branch release-1.8
in repository https://gitbox.apache.org/repos/asf/flink.git.
from a76b9e9 [FLINK-13394][travis] Use fallback unsafe MapR repository
new 9441505 [FLINK-10368] Harden Dockerized Kerberos tests by waiting for NM to be up
new 9881c45 [hotfix] Print Flink logs from YARN in test_yarn_kerberos_docker.sh
new 954f3c0 [FLINK-10368] Increase slot request timeout to harden YARN/Kerberos test
The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.../test-scripts/test_yarn_kerberos_docker.sh | 31 ++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)
[flink] 02/03: [hotfix] Print Flink logs from YARN in test_yarn_kerberos_docker.sh
Posted by al...@apache.org.
aljoscha pushed a commit to branch release-1.8
in repository https://gitbox.apache.org/repos/asf/flink.git
commit 9881c45ea271176d19457b0e2aa98d1b4975f860
Author: Aljoscha Krettek <al...@apache.org>
AuthorDate: Fri Aug 2 14:48:24 2019 +0200
[hotfix] Print Flink logs from YARN in test_yarn_kerberos_docker.sh
---
flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
index 528dfed..8f7d676 100755
--- a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
+++ b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
@@ -172,6 +172,13 @@ else
echo "Docker logs:"
docker logs master
exit 1
+
+ echo "Flink logs:"
+ docker exec -it master bash -c "kinit -kt /home/hadoop-user/hadoop-user.keytab hadoop-user"
+ application_id=`docker exec -it master bash -c "yarn application -list -appStates ALL" | grep "Flink session cluster" | awk '{print \$1}'`
+ echo "Application ID: $application_id"
+ docker exec -it master bash -c "yarn logs -applicationId $application_id"
+ docker exec -it master bash -c "kdestroy"
fi
if [[ ! "$OUTPUT" =~ "consummation,1" ]]; then
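The log-collection step added above picks the YARN application ID out of the `yarn application -list` output with `grep` and `awk`. A minimal sketch of that extraction as a standalone function follows. `extract_application_id` is a hypothetical name, not part of the Flink test scripts, and stripping carriage returns is an assumption based on `docker exec -it` allocating a TTY. Note also that in the hunk as shown the new lines sit after `exit 1`, so a script that wants the logs printed would need them before that statement.

```shell
# Hypothetical helper, not part of test_yarn_kerberos_docker.sh.
# Reads `yarn application -list` style output on stdin and prints the first
# whitespace-separated field of the first line containing the given name.
extract_application_id() {
    app_name=$1
    # Drop carriage returns (a TTY may add them), then match and print.
    tr -d '\r' | grep -F "$app_name" | awk 'NR == 1 { print $1 }'
}

# Usage sketch (assumes the `master` container from the script above):
# application_id=$(docker exec master bash -c "yarn application -list -appStates ALL" \
#     | extract_application_id "Flink session cluster")
```

If no line matches, the function prints nothing, so a caller can check for an empty ID before running `yarn logs`.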
[flink] 03/03: [FLINK-10368] Increase slot request timeout to harden YARN/Kerberos test
Posted by al...@apache.org.
aljoscha pushed a commit to branch release-1.8
in repository https://gitbox.apache.org/repos/asf/flink.git
commit 954f3c0fb33185c34fca485ccf47a2d0de587d72
Author: Aljoscha Krettek <al...@apache.org>
AuthorDate: Mon Aug 5 10:15:34 2019 +0200
[FLINK-10368] Increase slot request timeout to harden YARN/Kerberos test
Before, the tests were sometimes failing with
NoResourceAvailableException. In the logs it was visible that the
requested TaskExecutors (TMs) were connecting after the exception was
thrown. Increasing the timeout therefore fixes the instability.
---
flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
index 8f7d676..f142a37 100755
--- a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
+++ b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
@@ -138,7 +138,7 @@ docker exec -it master bash -c "tar xzf /home/hadoop-user/$FLINK_TARBALL --direc
# minimal Flink config, bebe
docker exec -it master bash -c "echo \"security.kerberos.login.keytab: /home/hadoop-user/hadoop-user.keytab\" > /home/hadoop-user/$FLINK_DIRNAME/conf/flink-conf.yaml"
docker exec -it master bash -c "echo \"security.kerberos.login.principal: hadoop-user\" >> /home/hadoop-user/$FLINK_DIRNAME/conf/flink-conf.yaml"
-docker exec -it master bash -c "echo \"slot.request.timeout: 60000\" >> /home/hadoop-user/$FLINK_DIRNAME/conf/flink-conf.yaml"
+docker exec -it master bash -c "echo \"slot.request.timeout: 120000\" >> /home/hadoop-user/$FLINK_DIRNAME/conf/flink-conf.yaml"
docker exec -it master bash -c "echo \"containerized.heap-cutoff-min: 100\" >> /home/hadoop-user/$FLINK_DIRNAME/conf/flink-conf.yaml"
echo "Flink config:"
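The hunk above appends each setting to `flink-conf.yaml` with its own `echo`. As an illustration only, the same four settings could be written in one step with the timeout as a parameter. `write_minimal_flink_conf` is a hypothetical helper, not part of the Flink test scripts; the keys and values are those visible in the hunk, and `slot.request.timeout` is given in milliseconds, so 120000 is two minutes.

```shell
# Hypothetical helper, not part of test_yarn_kerberos_docker.sh.
# Writes a minimal flink-conf.yaml; keys and values mirror the hunk above,
# with the slot request timeout (milliseconds) passed as a parameter.
write_minimal_flink_conf() {
    conf_file=$1
    slot_timeout_ms=$2
    cat > "$conf_file" <<EOF
security.kerberos.login.keytab: /home/hadoop-user/hadoop-user.keytab
security.kerberos.login.principal: hadoop-user
slot.request.timeout: $slot_timeout_ms
containerized.heap-cutoff-min: 100
EOF
}

# Usage sketch: write_minimal_flink_conf ./flink-conf.yaml 120000
```

The actual script runs its `echo` commands inside the container via `docker exec`; this sketch writes a local file and would need copying or an equivalent wrapper to be used there.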
[flink] 01/03: [FLINK-10368] Harden Dockerized Kerberos tests by waiting for NM to be up
Posted by al...@apache.org.
aljoscha pushed a commit to branch release-1.8
in repository https://gitbox.apache.org/repos/asf/flink.git
commit 94415058a3e71ff53b7d3985fa038fa1c4e4aefa
Author: Aljoscha Krettek <al...@apache.org>
AuthorDate: Thu Aug 1 13:04:24 2019 +0200
[FLINK-10368] Harden Dockerized Kerberos tests by waiting for NM to be up
Before, we didn't wait for Yarn NodeManagers to be up. This meant that
sometimes the Flink Job would not have enough resources to run.
---
.../test-scripts/test_yarn_kerberos_docker.sh | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
index 5f2dea2..528dfed 100755
--- a/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
+++ b/flink-end-to-end-tests/test-scripts/test_yarn_kerberos_docker.sh
@@ -61,7 +61,7 @@ function start_hadoop_cluster() {
return 1
else
echo "Waiting for hadoop cluster to come up. We have been trying for $time_diff seconds, retrying ..."
- sleep 10
+ sleep 5
fi
done
@@ -74,6 +74,26 @@ function start_hadoop_cluster() {
return 1
fi
+ # try and see if NodeManagers are up, otherwise the Flink job will not have enough resources
+ # to run
+ nm_running="0"
+ start_time=$(date +%s)
+ while [ "$nm_running" -lt "2" ]; do
+ current_time=$(date +%s)
+ time_diff=$((current_time - start_time))
+
+ if [ $time_diff -ge $MAX_RETRY_SECONDS ]; then
+ return 1
+ else
+ echo "We only have $nm_running NodeManagers up. We have been trying for $time_diff seconds, retrying ..."
+ sleep 1
+ fi
+
+ docker exec -it master bash -c "kinit -kt /home/hadoop-user/hadoop-user.keytab hadoop-user"
+ nm_running=`docker exec -it master bash -c "yarn node -list" | grep RUNNING | wc -l`
+ docker exec -it master bash -c "kdestroy"
+ done
+
return 0
}
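The NodeManager wait added above is an instance of polling with a deadline: run a probe, compare its count against a threshold, and give up once a time budget is spent. A minimal sketch of that pattern as a reusable function follows. `wait_for_count` and `count_running_nms` are hypothetical names, not part of the Flink test scripts, and treating non-numeric probe output as zero is an assumption of this sketch.

```shell
# Hypothetical helper, not part of test_yarn_kerberos_docker.sh.
# Polls a probe command until it prints a number >= want, or returns 1 once
# max_seconds have elapsed. Usage: wait_for_count WANT MAX_SECONDS CMD [ARGS...]
wait_for_count() {
    want=$1
    max_seconds=$2
    shift 2
    start_time=$(date +%s)
    while true; do
        # Run the probe and strip whitespace (including any carriage returns).
        count=$("$@" 2>/dev/null | tr -d '[:space:]')
        # Treat empty or non-numeric output as zero.
        case $count in
            ''|*[!0-9]*) count=0 ;;
        esac
        if [ "$count" -ge "$want" ]; then
            return 0
        fi
        current_time=$(date +%s)
        if [ $((current_time - start_time)) -ge "$max_seconds" ]; then
            echo "Gave up after $max_seconds seconds with count $count" >&2
            return 1
        fi
        sleep 1
    done
}

# Usage sketch (assumes the `master` container and MAX_RETRY_SECONDS from the
# script above; count_running_nms is a hypothetical wrapper):
# count_running_nms() {
#     docker exec master bash -c "yarn node -list" | grep -c RUNNING
# }
# wait_for_count 2 "$MAX_RETRY_SECONDS" count_running_nms || return 1
```

Unlike the loop in the diff, this sketch runs the probe before checking the deadline, so a cluster that is already up returns without sleeping.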