Posted to user@spark.apache.org by Jack Kolokasis <ko...@ics.forth.gr> on 2019/03/26 12:59:38 UTC
Spark Profiler
Hello all,
I am looking for a Spark profiler to trace my application and find
its bottlenecks. I need to trace CPU usage, memory usage, and I/O usage.
I am looking forward to your reply.
--Iacovos
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: Spark Profiler
Posted by jcdauchy <je...@moodys.com>.
Hello Jack,
You can also have a look at “Babar”; it has a nice “flame graph” feature
too. I haven’t had the time to test it out myself.
https://github.com/criteo/babar
JC
Re: Spark Profiler
Posted by Hariharan <ha...@gmail.com>.
Hi Jack,
You can try sparklens (https://github.com/qubole/sparklens). I think it
won't give details at as low a level as you're looking for, but it can help
you identify and remove performance bottlenecks.
~ Hariharan
On Fri, Mar 29, 2019 at 12:01 AM bo yang <bo...@gmail.com> wrote:
Re: Spark Profiler
Posted by bo yang <bo...@gmail.com>.
Yeah, these options are very valuable. Just to add another option :) We built
a JVM profiler (https://github.com/uber-common/jvm-profiler) to monitor and
profile Spark applications at large scale (e.g. sending metrics to Kafka /
Hive for batch analysis). People could try it as well.
On Wed, Mar 27, 2019 at 1:49 PM Jack Kolokasis <ko...@ics.forth.gr>
wrote:
Re: Spark Profiler
Posted by Jack Kolokasis <ko...@ics.forth.gr>.
Thanks for your reply. Your help is very valuable, and all these links
are helpful (especially your example).
Best Regards
--Iacovos
On 3/27/19 10:42 PM, Luca Canali wrote:
RE: Spark Profiler
Posted by Luca Canali <Lu...@cern.ch>.
I find that the Spark metrics system is quite useful to gather resource utilization metrics of Spark applications, including CPU, memory and I/O.
If you are interested, there is an example of how this works for us at: https://db-blog.web.cern.ch/blog/luca-canali/2019-02-performance-dashboard-apache-spark
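As a rough sketch of what wiring up the metrics system can look like: Spark reads sink and source settings from properties under the `spark.metrics.conf.` prefix, and these can be passed as `--conf` flags to spark-submit. The snippet below only builds those flags for a Graphite sink plus the JVM source. The host name, port, prefix, and `my_app.py` are placeholders made up for illustration, and the exact set of properties you need depends on your Spark version and monitoring backend.

```python
import shlex

# Hypothetical endpoint values; replace with your own Graphite-compatible host.
GRAPHITE_HOST = "graphite.example.org"
GRAPHITE_PORT = 2003


def metrics_sink_conf(host, port, period_seconds=10, prefix="myapp"):
    """Return Spark properties that send all components' metrics to a Graphite sink."""
    base = "spark.metrics.conf.*.sink.graphite"
    return {
        f"{base}.class": "org.apache.spark.metrics.sink.GraphiteSink",
        f"{base}.host": host,
        f"{base}.port": str(port),
        f"{base}.period": str(period_seconds),
        f"{base}.unit": "seconds",
        f"{base}.prefix": prefix,
        # Also expose JVM metrics (heap, GC) from every component.
        "spark.metrics.conf.*.source.jvm.class": "org.apache.spark.metrics.source.JvmSource",
    }


def to_submit_args(conf):
    """Render the properties as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    args = to_submit_args(metrics_sink_conf(GRAPHITE_HOST, GRAPHITE_PORT))
    # my_app.py is a placeholder for the application being profiled.
    print("spark-submit " + " ".join(shlex.quote(a) for a in args) + " my_app.py")
```

The same key/value pairs could instead go into a metrics.properties file (without the `spark.metrics.conf.` prefix); passing them as `--conf` flags just keeps everything in one command.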
If you are instead looking for ways to instrument your Spark code with performance metrics, Spark task metrics and event listeners are quite useful for that. See also https://github.com/apache/spark/blob/master/docs/monitoring.md and https://github.com/LucaCanali/sparkMeasure
Regards,
Luca
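To make the task-metrics idea above a bit more concrete, here is a minimal sketch (not taken from sparkMeasure or any of the linked tools) that sums a few task metrics per stage from a Spark event log. It assumes event logging was enabled with `spark.eventLog.enabled=true`, that the log is uncompressed with one JSON object per line, and that the field names ("Task Metrics", "Executor Run Time", "Executor CPU Time", and so on) match what your Spark version writes. The function name `summarize_event_log` is made up for this example.

```python
import json
from collections import defaultdict


def summarize_event_log(lines):
    """Sum a few task metrics per stage from Spark event-log lines (one JSON object each)."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only task-end events carry per-task metrics.
        if event.get("Event") != "SparkListenerTaskEnd":
            continue
        metrics = event.get("Task Metrics")
        if not metrics:  # some task-end events may carry no metrics
            continue
        stage = totals[event.get("Stage ID")]
        stage["tasks"] += 1
        stage["run_time_ms"] += metrics.get("Executor Run Time", 0)
        # Assumption: CPU time is reported in nanoseconds.
        stage["cpu_time_ns"] += metrics.get("Executor CPU Time", 0)
        stage["gc_time_ms"] += metrics.get("JVM GC Time", 0)
        stage["input_bytes"] += metrics.get("Input Metrics", {}).get("Bytes Read", 0)
        stage["output_bytes"] += metrics.get("Output Metrics", {}).get("Bytes Written", 0)

    summary = {}
    for stage_id, values in totals.items():
        row = dict(values)
        # Convert CPU time to milliseconds so it is comparable with run time.
        row["cpu_time_ms"] = row.pop("cpu_time_ns") / 1e6
        summary[stage_id] = row
    return summary


if __name__ == "__main__":
    # Two hand-written sample events standing in for a real event-log file.
    sample = [
        json.dumps({"Event": "SparkListenerTaskEnd", "Stage ID": 0,
                    "Task Metrics": {"Executor Run Time": 120,
                                     "Executor CPU Time": 80_000_000,
                                     "JVM GC Time": 7,
                                     "Input Metrics": {"Bytes Read": 4096}}}),
        json.dumps({"Event": "SparkListenerJobStart", "Job ID": 0}),
    ]
    for stage_id, row in sorted(summarize_event_log(sample).items()):
        print(stage_id, row)
```

With a real log you would pass an open file instead of the sample list, e.g. `with open(path) as f: summarize_event_log(f)`. A stage whose CPU time is much lower than its run time is a hint that its tasks spend their time waiting (I/O, shuffle, GC) rather than computing.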
From: manish ranjan <cs...@gmail.com>
Sent: Tuesday, March 26, 2019 15:24
To: Jack Kolokasis <ko...@ics.forth.gr>
Cc: user <us...@spark.apache.org>
Subject: Re: Spark Profiler
Re: Spark Profiler
Posted by manish ranjan <cs...@gmail.com>.
I have found Ganglia very helpful in understanding network I/O, CPU, and
memory usage for a given Spark cluster.
I have not used it, but I have heard good things about Dr. Elephant (which I
think was contributed by LinkedIn, but I am not 100% sure).
On Tue, Mar 26, 2019, 5:59 AM Jack Kolokasis <ko...@ics.forth.gr> wrote: