You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/17 19:33:53 UTC

[GitHub] [spark] warrenzhu25 opened a new pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

warrenzhu25 opened a new pull request #34035:
URL: https://github.com/apache/spark/pull/34035


   ### What changes were proposed in this pull request?
   Support write container stdout/stderr to file
   
   ### Why are the changes needed?
   If users want to sidecar logging agent to send stdout/stderr to external log storage,  only way is to change entrypoint.sh, which might break compatibility with community version.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. User can enable this feature by spark config.
   
   ### How was this patch tested?
   Added UT in BasicDriverFeatureStepSuite and BasicExecutorFeatureStepSuite
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
warrenzhu25 commented on pull request #34035:
URL: https://github.com/apache/spark/pull/34035#issuecomment-1034588906


   @HyukjinKwon Could you help take a look and remove `stale`  label?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
martin-g commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r803405302



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -616,6 +616,20 @@ private[spark] object Config extends Logging {
       .checkValue(value => value > 0, "Maximum number of pending pods should be a positive integer")
       .createWithDefault(Int.MaxValue)
 
+  val KUBERNETES_LOG_TO_FILE =
+    ConfigBuilder("spark.kubernetes.logToFile.enabled")
+      .doc("Whether to write executor/driver stdout/stderr as log file")
+      .version("3.2.0")

Review comment:
       The version specified in `docs/running-on-kubernetes.md` says `3.3.0`.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -616,6 +616,20 @@ private[spark] object Config extends Logging {
       .checkValue(value => value > 0, "Maximum number of pending pods should be a positive integer")
       .createWithDefault(Int.MaxValue)
 
+  val KUBERNETES_LOG_TO_FILE =
+    ConfigBuilder("spark.kubernetes.logToFile.enabled")
+      .doc("Whether to write executor/driver stdout/stderr as log file")
+      .version("3.2.0")
+      .booleanConf
+      .createWithDefault(false)
+
+  val KUBERNETES_LOG_TO_FILE_PATH =
+    ConfigBuilder("spark.kubernetes.logToFile.path")
+      .doc("The path to write executor/driver stdout/stderr as log file")
+      .version("3.2.0")

Review comment:
       ditto

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -177,7 +177,14 @@ private[spark] class BasicExecutorFeatureStep(
             .withValue(opt)
             .build()
         }
-      }
+      } ++ {
+      if (kubernetesConf.get(KUBERNETES_LOG_TO_FILE)) {
+        Seq(new EnvVarBuilder()
+          .withName(ENV_SPARK_LOG_PATH)
+          .withValue(kubernetesConf.get(KUBERNETES_LOG_TO_FILE_PATH))
+          .build())
+      } else None

Review comment:
       ditto

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
##########
@@ -82,7 +82,14 @@ private[spark] class BasicDriverFeatureStep(conf: KubernetesDriverConf)
           .withName(env._1)
           .withValue(env._2)
           .build()
-      }
+      } ++ {
+      if (conf.get(KUBERNETES_LOG_TO_FILE)) {
+        Seq(new EnvVarBuilder()
+          .withName(ENV_SPARK_LOG_PATH)
+          .withValue(conf.get(KUBERNETES_LOG_TO_FILE_PATH))
+          .build())
+      } else None

Review comment:
       Does this compile ? `None` (Option) is not a collection.
   Maybe `Seq.empty` ?!

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala
##########
@@ -66,6 +66,7 @@ private[spark] object Constants {
   val ENV_SPARK_CONF_DIR = "SPARK_CONF_DIR"
   val ENV_SPARK_USER = "SPARK_USER"
   val ENV_RESOURCE_PROFILE_ID = "SPARK_RESOURCE_PROFILE_ID"
+  val ENV_SPARK_LOG_PATH = "K8S_SPARK_LOG_PATH"

Review comment:
       all other env vars start with `SPARK_`. For consistency, I think it would be better to change this to `SPARK_K8S_LOG_PATH`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
warrenzhu25 commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r803436029



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -177,7 +177,14 @@ private[spark] class BasicExecutorFeatureStep(
             .withValue(opt)
             .build()
         }
-      }
+      } ++ {
+      if (kubernetesConf.get(KUBERNETES_LOG_TO_FILE)) {
+        Seq(new EnvVarBuilder()
+          .withName(ENV_SPARK_LOG_PATH)
+          .withValue(kubernetesConf.get(KUBERNETES_LOG_TO_FILE_PATH))
+          .build())
+      } else None

Review comment:
       I think `None` also worked, one example is https://github.com/warrenzhu25/spark/blob/f9057749f7d7d25d396c03a8041a0a55e97148ab/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L156




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] closed pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #34035:
URL: https://github.com/apache/spark/pull/34035


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
warrenzhu25 commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r712371809



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1120,6 +1120,22 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.logToFile.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+   Whether to write executor/driver stdout/stderr as log file
+  </td>
+  <td>3.2.0</td>

Review comment:
       Updated. Any other comments?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] commented on pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #34035:
URL: https://github.com/apache/spark/pull/34035#issuecomment-1003639967


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
warrenzhu25 commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r715190296



##########
File path: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh
##########
@@ -103,4 +103,14 @@ case "$1" in
 esac
 
 # Execute the container CMD under tini for better hygiene
-exec /usr/bin/tini -s -- "${CMD[@]}"
+if [ -z "$K8S_SPARK_LOG_PATH" ]; then
+  exec /usr/bin/tini -s -- "${CMD[@]}"
+else
+  if [ -n "$K8S_SPARK_LOG_PATH" ]; then
+    mkdir -p "$K8S_SPARK_LOG_PATH"
+  fi
+  # TODO: support custom stdout/stderr file name by spark config
+  K8S_SPARK_LOG_STDOUT=$K8S_SPARK_LOG_PATH/stdout
+  K8S_SPARK_LOG_STDERR=$K8S_SPARK_LOG_PATH/stderr
+  exec /usr/bin/tini -s -- "${CMD[@]}" > >(tee $K8S_SPARK_LOG_STDOUT) 2> >(tee $K8S_SPARK_LOG_STDERR >&2)

Review comment:
       Use tee can make stdout/stderr captured by kubectl logs.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r711844990



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1120,6 +1120,22 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.logToFile.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+   Whether to write executor/driver stdout/stderr as log file
+  </td>
+  <td>3.2.0</td>

Review comment:
       it should be 3.3.0




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34035:
URL: https://github.com/apache/spark/pull/34035#issuecomment-922043991


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] shanyu commented on a change in pull request #34035: [SPARK-36793][K8S] Support write container stdout/stderr to file

Posted by GitBox <gi...@apache.org>.
shanyu commented on a change in pull request #34035:
URL: https://github.com/apache/spark/pull/34035#discussion_r713390953



##########
File path: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh
##########
@@ -103,4 +103,14 @@ case "$1" in
 esac
 
 # Execute the container CMD under tini for better hygiene
-exec /usr/bin/tini -s -- "${CMD[@]}"
+if [ -z "$K8S_SPARK_LOG_PATH" ]; then
+  exec /usr/bin/tini -s -- "${CMD[@]}"
+else
+  if [ -n "$K8S_SPARK_LOG_PATH" ]; then
+    mkdir -p "$K8S_SPARK_LOG_PATH"
+  fi
+  # TODO: support custom stdout/stderr file name by spark config
+  K8S_SPARK_LOG_STDOUT=$K8S_SPARK_LOG_PATH/stdout
+  K8S_SPARK_LOG_STDERR=$K8S_SPARK_LOG_PATH/stderr
+  exec /usr/bin/tini -s -- "${CMD[@]}" > >(tee $K8S_SPARK_LOG_STDOUT) 2> >(tee $K8S_SPARK_LOG_STDERR >&2)

Review comment:
       Would a simple redirect work? e.g.:
   1> $K8S_SPARK_LOG_STDOUT 2> $K8S_SPARK_LOG_STDERR




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org