Posted to dev@spark.apache.org by "Rodriguez Hortala, Juan" <ho...@amazon.com> on 2017/03/13 22:23:14 UTC

Adding the executor ID to Spark logs when launching an executor in a YARN container

Hi Spark developers,

For Spark running on YARN, I would like to be able to find out which YARN container an executor is running in by looking at the logs. I haven't found a way to do this, not even through the Spark UI, as neither the Executors tab nor the stage detail page shows the container ID. I was thinking of modifying the log messages in YarnAllocator so that the executor ID is logged on container start, as follows:

@@ -494,7 +494,8 @@ private[yarn] class YarnAllocator(
       val containerId = container.getId
       val executorId = executorIdCounter.toString
       assert(container.getResource.getMemory >= resource.getMemory)
-      logInfo(s"Launching container $containerId on host $executorHostname")
+      logInfo(s"Launching container $containerId on host $executorHostname " +
+        s"for executor with ID $executorId")

       def updateInternalState(): Unit = synchronized {
         numExecutorsRunning += 1
@@ -528,7 +529,8 @@ private[yarn] class YarnAllocator(
                 updateInternalState()
               } catch {
                 case NonFatal(e) =>
-                  logError(s"Failed to launch executor $executorId on container $containerId", e)
+                  logError(s"Failed to launch executor $executorId on container $containerId " +
+                    s"for executor with ID $executorId", e)
                   // Assigned container should be released immediately to avoid unnecessary resource
                   // occupation.
                   amClient.releaseAssignedContainer(containerId)
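
For comparison, another way to surface this mapping without patching Spark would be a driver-side SparkListener: at least in the versions I have looked at, the executor log URLs reported on YARN through SparkListenerExecutorAdded point at the NodeManager's containerlogs page, so the container ID can be read from them. A rough sketch (the package and class names below are just placeholders):

package com.example

import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded}
import org.slf4j.LoggerFactory

// Register on the driver, e.g. with
//   --conf spark.extraListeners=com.example.ExecutorContainerLogger
class ExecutorContainerLogger extends SparkListener {
  private val log = LoggerFactory.getLogger(getClass)

  override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit = {
    // On YARN each entry of logUrlMap is a URL to the NodeManager's
    // containerlogs page, so it embeds the container ID in its path.
    val urls = event.executorInfo.logUrlMap.values.mkString(", ")
    log.info(s"Executor ${event.executorId} added on host " +
      s"${event.executorInfo.executorHost}, log URLs: $urls")
  }
}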

Do you think this is a good idea, or there is a better way to achieve this?

Thanks in advance,

Juan