Posted to commits@spark.apache.org by do...@apache.org on 2021/02/20 04:07:28 UTC

[spark-website] branch asf-site updated: Update Testing K8S section of the Developer Tools page for SPARK-34426 (#304)

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 8f20858  Update Testing K8S section of the Developer Tools page for SPARK-34426 (#304)
8f20858 is described below

commit 8f20858e329014cf3fb4dd8cf6489acdca077bb3
Author: Attila Zsolt Piros <20...@users.noreply.github.com>
AuthorDate: Sat Feb 20 05:07:18 2021 +0100

    Update Testing K8S section of the Developer Tools page for SPARK-34426 (#304)
    
    Updating the `Testing K8S` section of the Developer Tools page now that [[SPARK-34426][K8S][TESTS] Add driver and executors POD logs to integration tests log when the test fails](https://github.com/apache/spark/pull/31561) has been merged.
    
    There are some minor errors I bumped into during testing, and those are also fixed here:
    - change the `ZINC_PORT` calculation, as it was initialized with a Python 2-only expression (only the parentheses were missing for Python 3 compliance)
    - introduce `HADOOP_PROFILE`: before this PR `dev-run-integration-tests.sh` was using `3.2.2`, while `hadoop-2.7` was passed to `make-distribution.sh`, which caused strange errors
    - introduce `TARBALL_TO_TEST`, as the old solution was sensitive to having multiple tgz files in the workspace directory
    - initialize `MOUNT_PID` on the same line where `minikube mount` is started in the background, and use `$!` instead of `jobs -rp` (`-r` lists all running jobs, which could lead to killing something other than the mount); see the sketch after this list
    - get rid of the `WORKSPACE` env var
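    
    For illustration only (not part of the committed diff), a minimal bash sketch of the `$!` capture this change switches to, using `sleep` as a stand-in for the long-running `minikube mount`:
    
    ```bash
    # Start the long-running command in the background and capture its PID
    # immediately via $!, which always refers to the last backgrounded command.
    sleep 300 &
    MOUNT_PID=$!
    
    # The old approach, MOUNT_PID=$(jobs -rp), lists the PIDs of *all* running
    # background jobs, so the later kill could hit an unrelated process.
    
    # Stop only the process we started:
    kill -9 "$MOUNT_PID"
    ```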
---
 developer-tools.md        | 19 ++++++++++---------
 site/developer-tools.html | 19 ++++++++++---------
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/developer-tools.md b/developer-tools.md
index 9d82a25..bf13ed3 100644
--- a/developer-tools.md
+++ b/developer-tools.md
@@ -185,7 +185,7 @@ If you have made changes to the K8S bindings in Apache Spark, it would behoove y
 
 - minikube version v0.34.1 (or greater, but backwards-compatibility between versions is spotty)
 - You must use a VM driver!  Running minikube with the `--vm-driver=none` option requires that the user launching minikube/k8s have root access.  Our Jenkins workers use the [kvm2](https://github.com/kubernetes/minikube/blob/master/docs/drivers.md#kvm2-driver) drivers.  More details [here](https://github.com/kubernetes/minikube/blob/master/docs/drivers.md).
-- kubernetes version v1.13.3 (can be set by executing `minikube config set kubernetes-version v1.13.3`)
+- kubernetes version v1.15.12 (can be set by executing `minikube config set kubernetes-version v1.15.12`)
 
 Once you have minikube properly set up, and have successfully completed the [quick start](https://kubernetes.io/docs/setup/minikube/#quickstart), you can test your changes locally.  All subsequent commands should be run from your root spark/ repo directory:
 
@@ -194,10 +194,12 @@ Once you have minikube properly set up, and have successfully completed the [qui
 ```
 export DATE=`date "+%Y%m%d"`
 export REVISION=`git rev-parse --short HEAD`
-export ZINC_PORT=$(python -S -c "import random; print random.randrange(3030,4030)")
+export ZINC_PORT=$(python -S -c "import random; print(random.randrange(3030,4030))")
+export HADOOP_PROFILE=hadoop-3.2
 
 ./dev/make-distribution.sh --name ${DATE}-${REVISION} --pip --tgz -DzincPort=${ZINC_PORT} \
-     -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
+     -P$HADOOP_PROFILE -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
+export TARBALL_TO_TEST=($(pwd)/spark-*${DATE}-${REVISION}.tgz)
 ```
 
 2) Use that tarball and run the K8S integration tests:
@@ -208,23 +210,22 @@ export PVC_TESTS_HOST_PATH=$PVC_TMP_DIR
 export PVC_TESTS_VM_PATH=$PVC_TMP_DIR
 
 minikube --vm-driver=<YOUR VM DRIVER HERE> start --memory 6000 --cpus 8
+minikube config set kubernetes-version v1.15.12
 
-minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &
-
-MOUNT_PID=$(jobs -rp)
+minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 & MOUNT_PID=$!
 
 kubectl create clusterrolebinding serviceaccounts-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts || true
 
 ./resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
-    --spark-tgz ${WORKSPACE}/spark-*.tgz
+    --spark-tgz $TARBALL_TO_TEST --hadoop-profile $HADOOP_PROFILE --exclude-tags r
 
 kill -9 $MOUNT_PID
 minikube stop
 ```
 
-After the run is completed, the integration test logs are saved here:  `./resource-managers/kubernetes/integration-tests/target/integration-tests.log`
+After the run is completed, the integration test logs are saved here: `./resource-managers/kubernetes/integration-tests/target/integration-tests.log`.
 
-Getting logs from the pods and containers directly is an exercise left to the reader.
+In case of a failure, the POD logs (driver and executors) can be found at the end of the failed test's output (within `integration-tests.log`), in the `EXTRA LOGS FOR THE FAILED TEST` section.
 
 Kubernetes, and more importantly, minikube have rapid release cycles, and point releases have been found to be buggy and/or break older and existing functionality.  If you are having trouble getting tests to pass on Jenkins, but locally things work, don't hesitate to file a Jira issue.
 
diff --git a/site/developer-tools.html b/site/developer-tools.html
index d6dce72..02a556e 100644
--- a/site/developer-tools.html
+++ b/site/developer-tools.html
@@ -364,7 +364,7 @@ Generating HTML files for PySpark coverage under /.../spark/python/test_coverage
 <ul>
   <li>minikube version v0.34.1 (or greater, but backwards-compatibility between versions is spotty)</li>
   <li>You must use a VM driver!  Running minikube with the <code class="highlighter-rouge">--vm-driver=none</code> option requires that the user launching minikube/k8s have root access.  Our Jenkins workers use the <a href="https://github.com/kubernetes/minikube/blob/master/docs/drivers.md#kvm2-driver">kvm2</a> drivers.  More details <a href="https://github.com/kubernetes/minikube/blob/master/docs/drivers.md">here</a>.</li>
-  <li>kubernetes version v1.13.3 (can be set by executing <code class="highlighter-rouge">minikube config set kubernetes-version v1.13.3</code>)</li>
+  <li>kubernetes version v1.15.12 (can be set by executing <code class="highlighter-rouge">minikube config set kubernetes-version v1.15.12</code>)</li>
 </ul>
 
 <p>Once you have minikube properly set up, and have successfully completed the <a href="https://kubernetes.io/docs/setup/minikube/#quickstart">quick start</a>, you can test your changes locally.  All subsequent commands should be run from your root spark/ repo directory:</p>
@@ -373,10 +373,12 @@ Generating HTML files for PySpark coverage under /.../spark/python/test_coverage
 
 <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export DATE=`date "+%Y%m%d"`
 export REVISION=`git rev-parse --short HEAD`
-export ZINC_PORT=$(python -S -c "import random; print random.randrange(3030,4030)")
+export ZINC_PORT=$(python -S -c "import random; print(random.randrange(3030,4030))")
+export HADOOP_PROFILE=hadoop-3.2
 
 ./dev/make-distribution.sh --name ${DATE}-${REVISION} --pip --tgz -DzincPort=${ZINC_PORT} \
-     -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
+     -P$HADOOP_PROFILE -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
+export TARBALL_TO_TEST=($(pwd)/spark-*${DATE}-${REVISION}.tgz)
 </code></pre></div></div>
 
 <p>2) Use that tarball and run the K8S integration tests:</p>
@@ -386,23 +388,22 @@ export PVC_TESTS_HOST_PATH=$PVC_TMP_DIR
 export PVC_TESTS_VM_PATH=$PVC_TMP_DIR
 
 minikube --vm-driver=&lt;YOUR VM DRIVER HERE&gt; start --memory 6000 --cpus 8
+minikube config set kubernetes-version v1.15.12
 
-minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &amp;
-
-MOUNT_PID=$(jobs -rp)
+minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &amp; MOUNT_PID=$!
 
 kubectl create clusterrolebinding serviceaccounts-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts || true
 
 ./resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
-    --spark-tgz ${WORKSPACE}/spark-*.tgz
+    --spark-tgz $TARBALL_TO_TEST --hadoop-profile $HADOOP_PROFILE --exclude-tags r
 
 kill -9 $MOUNT_PID
 minikube stop
 </code></pre></div></div>
 
-<p>After the run is completed, the integration test logs are saved here:  <code class="highlighter-rouge">./resource-managers/kubernetes/integration-tests/target/integration-tests.log</code></p>
+<p>After the run is completed, the integration test logs are saved here: <code class="highlighter-rouge">./resource-managers/kubernetes/integration-tests/target/integration-tests.log</code>.</p>
 
-<p>Getting logs from the pods and containers directly is an exercise left to the reader.</p>
+<p>In case of a failure, the POD logs (driver and executors) can be found at the end of the failed test&#8217;s output (within <code class="highlighter-rouge">integration-tests.log</code>), in the <code class="highlighter-rouge">EXTRA LOGS FOR THE FAILED TEST</code> section.</p>
 
 <p>Kubernetes, and more importantly, minikube have rapid release cycles, and point releases have been found to be buggy and/or break older and existing functionality.  If you are having trouble getting tests to pass on Jenkins, but locally things work, don&#8217;t hesitate to file a Jira issue.</p>
 


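A hedged convenience example (not part of the commit): extracting the failed-test POD logs from `integration-tests.log` using the section marker described in the patch above. The log path comes from the docs being updated; the exact shell commands are just one way to do it.

```bash
# Location of the integration test log, as documented above.
LOG=./resource-managers/kubernetes/integration-tests/target/integration-tests.log

# Print everything from the marker section (driver and executor POD logs
# appended for the failed test) to the end of the file.
sed -n '/EXTRA LOGS FOR THE FAILED TEST/,$p' "$LOG"

# Or just locate where the section starts:
grep -n 'EXTRA LOGS FOR THE FAILED TEST' "$LOG"
```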
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org