Posted to issues@spark.apache.org by "bjkonglu (JIRA)" <ji...@apache.org> on 2018/11/21 06:09:00 UTC

[jira] [Updated] (SPARK-26135) Structured Streaming reporting metrics programmatically using asynchronous APIs can't get all queries metrics

     [ https://issues.apache.org/jira/browse/SPARK-26135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bjkonglu updated SPARK-26135:
-----------------------------
    Environment: 
h3.  

 

 

  was:
h3.  

 


> Structured Streaming reporting metrics programmatically using asynchronous APIs can't get all queries metrics
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26135
>                 URL: https://issues.apache.org/jira/browse/SPARK-26135
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.3.1
>         Environment: h3.  
>  
>  
>            Reporter: bjkonglu
>            Priority: Major
>
> h3. Background
>  When I use Structured Streaming to process real-time data, I also want to monitor the streaming application's metrics, for example processedRowsPerSecond and inputRowsPerSecond. So I report metrics programmatically using the asynchronous APIs.
> {code:java}
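> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.streaming.StreamingQueryListener
> import org.apache.spark.sql.streaming.StreamingQueryListener._
>
> val spark: SparkSession = ...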
> spark.streams.addListener(new StreamingQueryListener() {
>     override def onQueryStarted(queryStarted: QueryStartedEvent): Unit = {
>         println("Query started: " + queryStarted.id)
>     }
>     override def onQueryTerminated(queryTerminated: QueryTerminatedEvent): Unit = {
>         println("Query terminated: " + queryTerminated.id)
>     }
>     override def onQueryProgress(queryProgress: QueryProgressEvent): Unit = {
>         println("Query made progress: " + queryProgress.progress)
>     }
> })
> {code}
> h3. Questions
>   When the streaming application has a single query, the asynchronous APIs work well. But when the application runs many queries, the asynchronous APIs do not report metrics accurately: some queries report correctly, while others report late and with lower metric values.
>  
>  
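
One way to see which queries report late or with low numbers is to tag every progress event with the query's id and name and keep the latest progress per query. The following is only a sketch, assuming Spark 2.3.x on Scala 2.11; PerQueryProgressListener, QueryMetrics and logLastProgress are hypothetical names that do not appear in the report. It also polls the synchronous lastProgress API as a cross-check. It does not explain or fix the delay described above.

```scala
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

import scala.collection.JavaConverters._

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{StreamingQueryListener, StreamingQueryProgress}
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Hypothetical helper: remembers the most recent progress of each query,
// keyed by the query id, so that lagging queries can be told apart.
class PerQueryProgressListener extends StreamingQueryListener {
  private val latest = new ConcurrentHashMap[UUID, StreamingQueryProgress]()

  override def onQueryStarted(event: QueryStartedEvent): Unit =
    println(s"Query started: id=${event.id} name=${event.name}")

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    latest.put(p.id, p)
    // The trigger timestamp and batch id make delayed reports visible.
    println(s"id=${p.id} name=${p.name} batch=${p.batchId} ts=${p.timestamp} " +
      s"inputRowsPerSecond=${p.inputRowsPerSecond} " +
      s"processedRowsPerSecond=${p.processedRowsPerSecond}")
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    latest.remove(event.id)
    println(s"Query terminated: id=${event.id} exception=${event.exception}")
  }

  // Immutable copy of the latest progress per query id.
  def snapshot: Map[UUID, StreamingQueryProgress] = latest.asScala.toMap
}

object QueryMetrics {
  // Synchronous cross-check: read lastProgress from every active query.
  // lastProgress is null until a query has reported its first progress.
  def logLastProgress(spark: SparkSession): Unit =
    spark.streams.active.foreach { q =>
      Option(q.lastProgress).foreach { p =>
        println(s"[poll] id=${q.id} name=${q.name} batch=${p.batchId} " +
          s"processedRowsPerSecond=${p.processedRowsPerSecond}")
      }
    }
}
```

The listener would be registered once with spark.streams.addListener(new PerQueryProgressListener), before the queries are started. Giving each query a name through queryName(...) on the stream writer makes the output easier to read. Comparing the listener output with QueryMetrics.logLastProgress(spark) for the same batch id may help show whether the numbers differ or only arrive late.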



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
