Posted to user@spark.apache.org by Something Something <ma...@gmail.com> on 2020/05/07 18:26:10 UTC

No. of active states?

Is there a way to get the total no. of active states in memory at any given
point in a Stateful Spark Structured Streaming job? We are thinking of
using this metric for 'Auto Scaling' our Spark cluster.

Re: No. of active states?

Posted by Jungtaek Lim <ka...@gmail.com>.
Have you looked through the metrics for state operators?

It has been providing the "total rows" of state, and starting from Spark 2.4 it
also provides additional metrics specific to HDFSBackedStateStoreProvider,
including the overall estimated memory usage.

https://github.com/apache/spark/blob/24fac1e0c70a783b4d240607639ff20d7dd24191/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala#L168-L179
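
For example, a minimal sketch (Scala) of reading these numbers from a running
query; here, query is assumed to be the StreamingQuery handle returned by start():

  // lastProgress is null until the first micro-batch has completed
  val progress = query.lastProgress
  if (progress != null) {
    progress.stateOperators.foreach { op =>
      println(s"state rows: ${op.numRowsTotal}, " +
        s"updated this batch: ${op.numRowsUpdated}, " +
        s"memory used (bytes): ${op.memoryUsedBytes}")
      // provider-specific metrics (Spark 2.4+), e.g. from
      // HDFSBackedStateStoreProvider, show up under customMetrics
      println(op.customMetrics)
    }
  }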


On Fri, May 8, 2020 at 11:30 AM Something Something <
mailinglists19@gmail.com> wrote:

> No. We are already capturing these metrics (e.g. numInputRows,
> inputRowsPerSecond).
>
> I am talking about "No. of States" in memory at any given time.
>
> On Thu, May 7, 2020 at 4:31 PM Jungtaek Lim <ka...@gmail.com>
> wrote:
>
>> If you're referring to the total "entries" across all states in the SS job,
>> it's provided via StreamingQueryListener.
>>
>>
>> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#monitoring-streaming-queries
>>
>> Hope this helps.
>>
>> On Fri, May 8, 2020 at 3:26 AM Something Something <
>> mailinglists19@gmail.com> wrote:
>>
>>> Is there a way to get the total no. of active states in memory at any
>>> given point in a Stateful Spark Structured Streaming job? We are thinking
>>> of using this metric for 'Auto Scaling' our Spark cluster.
>>>
>>

Re: No. of active states?

Posted by Edgardo Szrajber <sz...@yahoo.com.INVALID>.
This should open a new world of real-time metrics for you: How to get Spark
Metrics as JSON using Spark REST API in YARN Cluster mode

(linked article by Anbu Cheeralan: "Spark provides the metrics in UI. You can
access the UI using either port 4040 (Standalone) or using a proxy thr...")

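A minimal sketch (Scala) of pulling those metrics as JSON, assuming the driver
UI (or the YARN proxy in cluster mode) is reachable at a hypothetical
driver-host:4040; adjust the host, port and application id for your setup:

  import scala.io.Source

  // the monitoring REST API lives under /api/v1 on the running application's UI
  val base = "http://driver-host:4040/api/v1"
  val appsJson = Source.fromURL(s"$base/applications").mkString
  println(appsJson)  // raw JSON describing the running application(s)
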
Bentzi

    On Friday, May 8, 2020, 05:30:56 AM GMT+3, Something Something <ma...@gmail.com> wrote:  
 
 No. We are already capturing these metrics (e.g. numInputRows, inputRowsPerSecond).
I am talking about "No. of States" in memory at any given time.
On Thu, May 7, 2020 at 4:31 PM Jungtaek Lim <ka...@gmail.com> wrote:

If you're referring to the total "entries" across all states in the SS job, it's provided via StreamingQueryListener.
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#monitoring-streaming-queries

Hope this helps.
On Fri, May 8, 2020 at 3:26 AM Something Something <ma...@gmail.com> wrote:

Is there a way to get the total no. of active states in memory at any given point in a Stateful Spark Structured Streaming job? We are thinking of using this metric for 'Auto Scaling' our Spark cluster.

  

Re: No. of active states?

Posted by Something Something <ma...@gmail.com>.
No. We are already capturing these metrics (e.g. numInputRows,
inputRowsPerSecond).

I am talking about "No. of States" in memory at any given time.

On Thu, May 7, 2020 at 4:31 PM Jungtaek Lim <ka...@gmail.com>
wrote:

> If you're referring to the total "entries" across all states in the SS job,
> it's provided via StreamingQueryListener.
>
>
> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#monitoring-streaming-queries
>
> Hope this helps.
>
> On Fri, May 8, 2020 at 3:26 AM Something Something <
> mailinglists19@gmail.com> wrote:
>
>> Is there a way to get the total no. of active states in memory at any
>> given point in a Stateful Spark Structured Streaming job? We are thinking
>> of using this metric for 'Auto Scaling' our Spark cluster.
>>
>

Re: No. of active states?

Posted by Jungtaek Lim <ka...@gmail.com>.
If you're referring to the total "entries" across all states in the SS job, it's
provided via StreamingQueryListener.

http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#monitoring-streaming-queries
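
For example, a minimal sketch (Scala) of a listener that sums the state entries
reported for every micro-batch; spark is assumed to be the active SparkSession:

  import org.apache.spark.sql.streaming.StreamingQueryListener
  import org.apache.spark.sql.streaming.StreamingQueryListener._

  class StateSizeListener extends StreamingQueryListener {
    override def onQueryStarted(event: QueryStartedEvent): Unit = {}
    override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {}
    override def onQueryProgress(event: QueryProgressEvent): Unit = {
      // numRowsTotal is reported per stateful operator; sum them for the job
      val totalStateRows = event.progress.stateOperators.map(_.numRowsTotal).sum
      println(s"batch ${event.progress.batchId}: $totalStateRows state entries")
    }
  }

  spark.streams.addListener(new StateSizeListener)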

Hope this helps.

On Fri, May 8, 2020 at 3:26 AM Something Something <ma...@gmail.com>
wrote:

> Is there a way to get the total no. of active states in memory at any
> given point in a Stateful Spark Structured Streaming job? We are thinking
> of using this metric for 'Auto Scaling' our Spark cluster.
>