Posted to user@spark.apache.org by Patrick Brown <pa...@gmail.com> on 2018/10/16 16:33:40 UTC

[Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

I recently upgraded to Spark 2.3.1. I have had these same settings in my
spark-submit script, which worked on 2.0.2 and which, according to the
documentation, have not changed:

spark.ui.retainedTasks=1
spark.ui.retainedStages=1
spark.ui.retainedJobs=1
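
For reference, these are passed in my submit script roughly like this (the
class and jar names below are just placeholders, not my actual application):

spark-submit \
  --conf spark.ui.retainedTasks=1 \
  --conf spark.ui.retainedStages=1 \
  --conf spark.ui.retainedJobs=1 \
  --class com.example.MyApp my-app.jar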

However, in 2.3.1 the UI doesn't seem to respect this; it still retains a
huge number of jobs:

[image: Screen Shot 2018-10-16 at 10.31.50 AM.png]


Is this a known issue? Any ideas?

Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Patrick Brown <pa...@gmail.com>.
Done:

https://issues.apache.org/jira/browse/SPARK-25837

On Thu, Oct 25, 2018 at 10:21 AM Marcelo Vanzin <va...@cloudera.com> wrote:

> Ah that makes more sense. Could you file a bug with that information
> so we don't lose track of this?
>
> Thanks
> On Wed, Oct 24, 2018 at 6:13 PM Patrick Brown
> <pa...@gmail.com> wrote:
> >
> > On my production application I am running ~200 jobs at once, but
> continue to submit jobs in this manner for sometimes ~1 hour.
> >
> > The reproduction code above generally only has 4 ish jobs running at
> once, and as you can see runs through 50k jobs in this manner.
> >
> > I guess I should clarify my above statement, the issue seems to appear
> when running multiple jobs at once as well as in sequence for a while and
> may as well have something to do with high master CPU usage (thus the
> collect in the code). My rough guess would be whatever is managing clearing
> out completed jobs gets overwhelmed (my master was a 4 core machine while
> running this, and htop reported almost full CPU usage across all 4 cores).
> >
> > The attached screenshot shows the state of the webui after running the
> repro code, you can see the ui is displaying some 43k completed jobs (takes
> a long time to load) after a few minutes of inactivity this will clear out,
> however as my production application continues to submit jobs every once in
> a while, the issue persists.
> >
> > On Wed, Oct 24, 2018 at 5:05 PM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >>
> >> When you say many jobs at once, what ballpark are you talking about?
> >>
> >> The code in 2.3+ does try to keep data about all running jobs and
> >> stages regardless of the limit. If you're running into issues because
> >> of that we may have to look again at whether that's the right thing to
> >> do.
> >> On Tue, Oct 23, 2018 at 10:02 AM Patrick Brown
> >> <pa...@gmail.com> wrote:
> >> >
> >> > I believe I may be able to reproduce this now, it seems like it may
> be something to do with many jobs at once:
> >> >
> >> > Spark 2.3.1
> >> >
> >> > > spark-shell --conf spark.ui.retainedJobs=1
> >> >
> >> > scala> import scala.concurrent._
> >> > scala> import scala.concurrent.ExecutionContext.Implicits.global
> >> > scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0
> until i).collect.length) } }
> >> >
> >> > On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >> >>
> >> >> Just tried on 2.3.2 and worked fine for me. UI had a single job and a
> >> >> single stage (+ the tasks related to that single stage), same thing
> in
> >> >> memory (checked with jvisualvm).
> >> >>
> >> >> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >> >> >
> >> >> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
> >> >> > <pa...@gmail.com> wrote:
> >> >> > > I recently upgraded to spark 2.3.1 I have had these same
> settings in my spark submit script, which worked on 2.0.2, and according to
> the documentation appear to not have changed:
> >> >> > >
> >> >> > > spark.ui.retainedTasks=1
> >> >> > > spark.ui.retainedStages=1
> >> >> > > spark.ui.retainedJobs=1
> >> >> >
> >> >> > I tried that locally on the current master and it seems to be
> working.
> >> >> > I don't have 2.3 easily in front of me right now, but will take a
> look
> >> >> > Monday.
> >> >> >
> >> >> > --
> >> >> > Marcelo
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Marcelo
> >>
> >>
> >>
> >> --
> >> Marcelo
>
>
>
> --
> Marcelo
>

Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
Ah that makes more sense. Could you file a bug with that information
so we don't lose track of this?

Thanks
On Wed, Oct 24, 2018 at 6:13 PM Patrick Brown
<pa...@gmail.com> wrote:
>
> On my production application I am running ~200 jobs at once, but continue to submit jobs in this manner for sometimes ~1 hour.
>
> The reproduction code above generally only has 4 ish jobs running at once, and as you can see runs through 50k jobs in this manner.
>
> I guess I should clarify my above statement, the issue seems to appear when running multiple jobs at once as well as in sequence for a while and may as well have something to do with high master CPU usage (thus the collect in the code). My rough guess would be whatever is managing clearing out completed jobs gets overwhelmed (my master was a 4 core machine while running this, and htop reported almost full CPU usage across all 4 cores).
>
> The attached screenshot shows the state of the webui after running the repro code, you can see the ui is displaying some 43k completed jobs (takes a long time to load) after a few minutes of inactivity this will clear out, however as my production application continues to submit jobs every once in a while, the issue persists.
>
> On Wed, Oct 24, 2018 at 5:05 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>>
>> When you say many jobs at once, what ballpark are you talking about?
>>
>> The code in 2.3+ does try to keep data about all running jobs and
>> stages regardless of the limit. If you're running into issues because
>> of that we may have to look again at whether that's the right thing to
>> do.
>> On Tue, Oct 23, 2018 at 10:02 AM Patrick Brown
>> <pa...@gmail.com> wrote:
>> >
>> > I believe I may be able to reproduce this now, it seems like it may be something to do with many jobs at once:
>> >
>> > Spark 2.3.1
>> >
>> > > spark-shell --conf spark.ui.retainedJobs=1
>> >
>> > scala> import scala.concurrent._
>> > scala> import scala.concurrent.ExecutionContext.Implicits.global
>> > scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0 until i).collect.length) } }
>> >
>> > On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <va...@cloudera.com> wrote:
>> >>
>> >> Just tried on 2.3.2 and worked fine for me. UI had a single job and a
>> >> single stage (+ the tasks related to that single stage), same thing in
>> >> memory (checked with jvisualvm).
>> >>
>> >> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>> >> >
>> >> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
>> >> > <pa...@gmail.com> wrote:
>> >> > > I recently upgraded to spark 2.3.1 I have had these same settings in my spark submit script, which worked on 2.0.2, and according to the documentation appear to not have changed:
>> >> > >
>> >> > > spark.ui.retainedTasks=1
>> >> > > spark.ui.retainedStages=1
>> >> > > spark.ui.retainedJobs=1
>> >> >
>> >> > I tried that locally on the current master and it seems to be working.
>> >> > I don't have 2.3 easily in front of me right now, but will take a look
>> >> > Monday.
>> >> >
>> >> > --
>> >> > Marcelo
>> >>
>> >>
>> >>
>> >> --
>> >> Marcelo
>>
>>
>>
>> --
>> Marcelo



-- 
Marcelo



Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Patrick Brown <pa...@gmail.com>.
In my production application I am running ~200 jobs at once, and I continue
to submit jobs in this manner for sometimes ~1 hour.

The reproduction code above generally has only about 4 jobs running at once,
but as you can see it runs through 50k jobs in this manner.

I should clarify my statement above: the issue seems to appear when running
multiple jobs at once, as well as in sequence for a while, and may well have
something to do with high master CPU usage (hence the collect in the code).
My rough guess is that whatever is responsible for clearing out completed
jobs gets overwhelmed (my master was a 4-core machine while running this,
and htop reported nearly full CPU usage across all 4 cores).

The attached screenshot shows the state of the web UI after running the
repro code: you can see the UI is displaying some 43k completed jobs (and
takes a long time to load). After a few minutes of inactivity this will
clear out; however, since my production application continues to submit jobs
every once in a while, the issue persists.
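
If that guess is right, one workaround I may try (just a sketch, I haven't
verified it actually helps) is to submit the jobs in bounded batches so that
whatever does the cleanup has a chance to catch up between batches:

scala> import scala.concurrent._
scala> import scala.concurrent.duration._
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> for (batch <- (0 until 50000).grouped(100)) {
     |   // submit 100 jobs, then wait for them all before starting the next batch
     |   val fs = batch.map(i => Future { sc.parallelize(0 until i).collect.length })
     |   Await.ready(Future.sequence(fs), 10.minutes)
     | }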

On Wed, Oct 24, 2018 at 5:05 PM Marcelo Vanzin <va...@cloudera.com> wrote:

> When you say many jobs at once, what ballpark are you talking about?
>
> The code in 2.3+ does try to keep data about all running jobs and
> stages regardless of the limit. If you're running into issues because
> of that we may have to look again at whether that's the right thing to
> do.
> On Tue, Oct 23, 2018 at 10:02 AM Patrick Brown
> <pa...@gmail.com> wrote:
> >
> > I believe I may be able to reproduce this now, it seems like it may be
> something to do with many jobs at once:
> >
> > Spark 2.3.1
> >
> > > spark-shell --conf spark.ui.retainedJobs=1
> >
> > scala> import scala.concurrent._
> > scala> import scala.concurrent.ExecutionContext.Implicits.global
> > scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0
> until i).collect.length) } }
> >
> > On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >>
> >> Just tried on 2.3.2 and worked fine for me. UI had a single job and a
> >> single stage (+ the tasks related to that single stage), same thing in
> >> memory (checked with jvisualvm).
> >>
> >> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >> >
> >> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
> >> > <pa...@gmail.com> wrote:
> >> > > I recently upgraded to spark 2.3.1 I have had these same settings
> in my spark submit script, which worked on 2.0.2, and according to the
> documentation appear to not have changed:
> >> > >
> >> > > spark.ui.retainedTasks=1
> >> > > spark.ui.retainedStages=1
> >> > > spark.ui.retainedJobs=1
> >> >
> >> > I tried that locally on the current master and it seems to be working.
> >> > I don't have 2.3 easily in front of me right now, but will take a look
> >> > Monday.
> >> >
> >> > --
> >> > Marcelo
> >>
> >>
> >>
> >> --
> >> Marcelo
>
>
>
> --
> Marcelo
>

Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
When you say many jobs at once, what ballpark are you talking about?

The code in 2.3+ does try to keep data about all running jobs and
stages regardless of the limit. If you're running into issues because
of that, we may have to look again at whether that's the right thing to do.
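
Roughly speaking (an illustrative sketch only, not the actual listener code,
and the names below are made up), the cleanup only ever evicts jobs that have
already finished, so anything still marked as running stays in the store
regardless of the limit:

// Illustrative sketch of the retention policy, not the real Spark source.
case class JobInfo(jobId: Int, running: Boolean, completionTime: Long)

def jobsToEvict(jobs: Seq[JobInfo], retainedJobs: Int): Seq[JobInfo] = {
  val excess = jobs.size - retainedJobs
  if (excess <= 0) Seq.empty
  else jobs.filterNot(_.running)        // running jobs are never evicted
           .sortBy(_.completionTime)    // oldest completed jobs go first
           .take(excess)
}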
On Tue, Oct 23, 2018 at 10:02 AM Patrick Brown
<pa...@gmail.com> wrote:
>
> I believe I may be able to reproduce this now, it seems like it may be something to do with many jobs at once:
>
> Spark 2.3.1
>
> > spark-shell --conf spark.ui.retainedJobs=1
>
> scala> import scala.concurrent._
> scala> import scala.concurrent.ExecutionContext.Implicits.global
> scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0 until i).collect.length) } }
>
> On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <va...@cloudera.com> wrote:
>>
>> Just tried on 2.3.2 and worked fine for me. UI had a single job and a
>> single stage (+ the tasks related to that single stage), same thing in
>> memory (checked with jvisualvm).
>>
>> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>> >
>> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
>> > <pa...@gmail.com> wrote:
>> > > I recently upgraded to spark 2.3.1 I have had these same settings in my spark submit script, which worked on 2.0.2, and according to the documentation appear to not have changed:
>> > >
>> > > spark.ui.retainedTasks=1
>> > > spark.ui.retainedStages=1
>> > > spark.ui.retainedJobs=1
>> >
>> > I tried that locally on the current master and it seems to be working.
>> > I don't have 2.3 easily in front of me right now, but will take a look
>> > Monday.
>> >
>> > --
>> > Marcelo
>>
>>
>>
>> --
>> Marcelo



-- 
Marcelo



Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Patrick Brown <pa...@gmail.com>.
I believe I may be able to reproduce this now; it seems it may have
something to do with many jobs at once:

Spark 2.3.1

> spark-shell --conf spark.ui.retainedJobs=1

scala> import scala.concurrent._
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0 until i).collect.length) } }
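
To see how many jobs the UI is actually holding on to while this runs, a
rough check (assuming the driver UI is on the default port 4040) is to query
the monitoring REST API from the same shell:

scala> val jobsJson = scala.io.Source.fromURL(
     |   s"http://localhost:4040/api/v1/applications/${sc.applicationId}/jobs").mkString
scala> println(jobsJson.split("\"jobId\"").length - 1)  // rough count of jobs still retained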

On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <va...@cloudera.com> wrote:

> Just tried on 2.3.2 and worked fine for me. UI had a single job and a
> single stage (+ the tasks related to that single stage), same thing in
> memory (checked with jvisualvm).
>
> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com>
> wrote:
> >
> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
> > <pa...@gmail.com> wrote:
> > > I recently upgraded to spark 2.3.1 I have had these same settings in
> my spark submit script, which worked on 2.0.2, and according to the
> documentation appear to not have changed:
> > >
> > > spark.ui.retainedTasks=1
> > > spark.ui.retainedStages=1
> > > spark.ui.retainedJobs=1
> >
> > I tried that locally on the current master and it seems to be working.
> > I don't have 2.3 easily in front of me right now, but will take a look
> > Monday.
> >
> > --
> > Marcelo
>
>
>
> --
> Marcelo
>

Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
Just tried on 2.3.2 and it worked fine for me. The UI had a single job and a
single stage (plus the tasks related to that single stage), and the same
thing in memory (checked with jvisualvm).

On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>
> On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
> <pa...@gmail.com> wrote:
> > I recently upgraded to spark 2.3.1 I have had these same settings in my spark submit script, which worked on 2.0.2, and according to the documentation appear to not have changed:
> >
> > spark.ui.retainedTasks=1
> > spark.ui.retainedStages=1
> > spark.ui.retainedJobs=1
>
> I tried that locally on the current master and it seems to be working.
> I don't have 2.3 easily in front of me right now, but will take a look
> Monday.
>
> --
> Marcelo



-- 
Marcelo



Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown
<pa...@gmail.com> wrote:
> I recently upgraded to spark 2.3.1 I have had these same settings in my spark submit script, which worked on 2.0.2, and according to the documentation appear to not have changed:
>
> spark.ui.retainedTasks=1
> spark.ui.retainedStages=1
> spark.ui.retainedJobs=1

I tried that locally on the current master and it seems to be working.
I don't have 2.3 easily in front of me right now, but will take a look
Monday.

-- 
Marcelo



Re: [Spark UI] Spark 2.3.1 UI no longer respects spark.ui.retainedJobs

Posted by Shing Hing Man <ma...@yahoo.com.INVALID>.
I have the same problem after upgrading my application from Spark 2.2.1 to Spark 2.3.2 and running in YARN client mode.
I also noticed that in my Spark driver, org.apache.spark.status.TaskDataWrapper could take up more than 2 GB of memory.
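
For anyone who wants to check the same thing, a class histogram of the driver
process should show it, something along the lines of:

jmap -histo <driver pid> | grep TaskDataWrapper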

Shing


    On Tuesday, 16 October 2018, 17:34:02 GMT+1, Patrick Brown <pa...@gmail.com> wrote:  
 
I recently upgraded to Spark 2.3.1. I have had these same settings in my spark-submit script, which worked on 2.0.2 and which, according to the documentation, have not changed:

spark.ui.retainedTasks=1
spark.ui.retainedStages=1
spark.ui.retainedJobs=1

However, in 2.3.1 the UI doesn't seem to respect this; it still retains a huge number of jobs:



Is this a known issue? Any ideas?