Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2020/12/30 14:34:22 UTC

[3.0.1] ExecutorMonitor.onJobStart and StageInfo.shuffleDepId that's never used?

Hi,

It's been a while. Glad to be back, Sparkians!

I've been exploring ExecutorMonitor.onJobStart in 3.0.1 and noticed that it
reads StageInfo.shuffleDepId [1], which is None by default and, according to
IntelliJ IDEA, is never "written to".

Is that really the case, and if so, is it intentional?

I'm wondering how much IDEA knows about codegen, and whether that's where
it's used (?)

I've just stumbled upon it and before I spend more time on this I thought
I'd ask (perhaps it's going to change in 3.1?). Help appreciated.

[1]
https://github.com/apache/spark/blob/78df2caec8c94c31e5c9ddc30ed8acb424084181/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala#L179

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski


Re: [3.0.1] ExecutorMonitor.onJobStart and StageInfo.shuffleDepId that's never used?

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

Sorry, a false alarm. I was misled: what IDEA calls "unused" may not
really be unused. The field is set in StageInfo.fromStage for
a ShuffleMapStage [1] and then read in ExecutorMonitor [2] (since it's a
SparkListener).

[1]
https://github.com/apache/spark/blob/094563384478a402c36415edf04ee7b884a34fc9/core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala#L108
[2]
https://github.com/apache/spark/blob/78df2caec8c94c31e5c9ddc30ed8acb424084181/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala#L179
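
For anyone who runs into the same IDEA hint, here is a minimal sketch of the
pattern. It uses simplified stand-in types, not Spark's actual classes, and
ShuffleTracker and shuffleIdsByStage are hypothetical names made up for the
illustration. It assumes the value is handed over as a constructor argument
inside the factory method, which would explain why a search for writes to
the field comes back empty.

```scala
// Simplified stand-ins for illustration only; NOT Spark's actual classes.
sealed trait Stage { def id: Int }
final case class ShuffleMapStage(id: Int, shuffleId: Int) extends Stage
final case class ResultStage(id: Int) extends Stage

// shuffleDepId defaults to None. It is an immutable constructor parameter,
// so nothing ever assigns to it after construction.
final class StageInfo(val stageId: Int, val shuffleDepId: Option[Int] = None)

object StageInfo {
  // The factory is the only place a non-default value is supplied:
  // it is computed from the stage type and passed to the constructor.
  def fromStage(stage: Stage): StageInfo = {
    val shuffleDepId = stage match {
      case sms: ShuffleMapStage => Some(sms.shuffleId)
      case _                    => None
    }
    new StageInfo(stage.id, shuffleDepId)
  }
}

// Hypothetical listener-style consumer: keeps only the stages that carry
// a shuffle id, keyed by stage id.
object ShuffleTracker {
  def shuffleIdsByStage(infos: Seq[StageInfo]): Map[Int, Int] =
    infos.flatMap(info => info.shuffleDepId.map(sid => info.stageId -> sid)).toMap
}
```

With this shape, a ShuffleMapStage yields a StageInfo with a defined
shuffleDepId, while any other stage keeps the None default.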

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski


