Posted to user@spark.apache.org by Tianbin Jiang <ji...@gmail.com> on 2021/05/17 17:48:53 UTC

Spark History Server to S3 doesn't show up incomplete jobs

Hi all,
 I am using Spark 2.4.5. I am redirecting the Spark event logs to S3
with the following configuration:

spark.eventLog.enabled = true
spark.history.ui.port = 18080
spark.eventLog.dir = s3://livy-spark-log/spark-history/
spark.history.fs.logDirectory = s3://livy-spark-log/spark-history/
spark.history.fs.update.interval = 5s


Once my application is completed, I can see it show up on the Spark
history server. However, running applications don't show up under
"Incomplete applications". I have also checked the logs; whenever my
application ends, I can see these messages:

21/05/17 06:14:18 INFO k8s.KubernetesClusterSchedulerBackend: Shutting down
all executors
21/05/17 06:14:18 INFO
k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each
executor to shut down
21/05/17 06:14:18 WARN k8s.ExecutorPodsWatchSnapshotSource: Kubernetes
client has been closed (this is expected if the application is shutting
down.)
*21/05/17 06:14:18 INFO s3n.MultipartUploadOutputStream: close closed:false
s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6.inprogress*
*21/05/17 06:14:19 INFO s3n.S3NativeFileSystem: rename
s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6.inprogress
s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6*
21/05/17 06:14:19 INFO spark.MapOutputTrackerMasterEndpoint:
MapOutputTrackerMasterEndpoint stopped!
21/05/17 06:14:19 INFO memory.MemoryStore: MemoryStore cleared
21/05/17 06:14:19 INFO storage.BlockManager: BlockManager stopped


I am not able to see any xx.inprogress file on S3, though. Has anyone had
this problem before?
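For context on my reasoning: as far as I understand it, the history server
decides "complete" vs. "incomplete" purely from the .inprogress filename
suffix, which matches the rename visible in the log above. A minimal sketch
of that check (illustrative names, not Spark's actual internals):

```python
# Sketch of how completed vs. in-progress event logs are told apart,
# based on the .inprogress suffix seen in the rename in the logs above.
# Function and variable names here are illustrative, not Spark's code.

IN_PROGRESS_SUFFIX = ".inprogress"

def classify_event_log(name: str) -> str:
    """Return 'incomplete' for a running app's log, 'complete' otherwise."""
    return "incomplete" if name.endswith(IN_PROGRESS_SUFFIX) else "complete"

logs = [
    "spark-48c3141875fe4c67b5708400134ea3d6.inprogress",
    "spark-48c3141875fe4c67b5708400134ea3d6",
]
for name in logs:
    print(name, "->", classify_event_log(name))
```

If that is right, then my guess is that because the S3 output stream only
makes the object visible once the upload is closed (and the close happens
at shutdown, right before the rename), the .inprogress file may never
actually be visible in the bucket while the application is running, so
there is nothing for the history server to list as incomplete. That is
just my suspicion, though.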

-- 

Sincerely:
 Tianbin Jiang