You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2016/07/31 08:44:35 UTC

[YARN] YarnAllocator.updateResourceRequests -- could be simpler and faster, too?

Hi,

I've been reviewing YarnAllocator.updateResourceRequests and think the
other branch is...too verbose (and may be deceiving that it's more
complex than it really is). I hope to get corrected.

The source code of the method is
https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L294.

The method is about requesting new or cancelling outstanding YARN
container requests for executors. Right?

It starts by checking the executor container requests using
getPendingAllocate [1] and branches by missing (the other branch uses
numPendingAllocate also which is going to be crucial in my
understanding of the method).

So, in the other branch, when the number of outstanding YARN
containers is too much [2], the method calls [3]

amClient.getMatchingRequests(RM_REQUEST_PRIORITY, ANY_HOST, resource)

which is exactly getPendingAllocate [4] (!) Is that correct?

If my understanding is correct, the code does not need to call
getPendingAllocate in the branch since it had already requested it in
[1] (at the very beginning) and since we're inside the branch the code
does not have to check "matchingRequests.isEmpty" either.

My understanding is that the code should be as simple as the following:

    } else if (numPendingAllocate > 0 && missing < 0) {
      val numToCancel = math.min(numPendingAllocate, -missing)
      logInfo(s"Canceling requests for $numToCancel executor
container(s) to have a new desired " +
        s"total $targetNumExecutors executors.")
      pendingAllocate.take(numToCancel).foreach(amClient.removeContainerRequest)
    }

i.e. just a single pendingAllocate.take...

Is that correct?

THANKS a lot for reading thus far!!! I greatly appreciate your time
and effort to help me understand Spark.

p.s. I'm ready with a PR.

[1] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L295
[2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L355
[3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L360
[4] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L200

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org