Posted to user@spark.apache.org by nilmish <ni...@gmail.com> on 2014/06/10 11:56:57 UTC

Problem in Spark Streaming

I am running a Spark Streaming job to count the top 10 hashtags over the last 5-minute
window, querying every 1 sec.
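For reference, the windowed top-N computation the job performs can be sketched in plain Python (illustrative only; this is not the Spark code, and the function name and structure are mine):

```python
from collections import Counter, deque

def top_hashtags(events, window_s=300, top_n=10):
    """Top-N hashtag counts over a sliding time window.

    Plain-Python sketch of what the streaming job computes; not Spark API.
    `events` is an iterable of (timestamp_s, hashtag) in timestamp order.
    """
    window = deque()      # events currently inside the window
    counts = Counter()    # live hashtag counts for the window
    answers = []          # one top-N answer per incoming event ("query")
    for ts, tag in events:
        window.append((ts, tag))
        counts[tag] += 1
        # Evict everything that has fallen out of the 5-minute window.
        while window and window[0][0] <= ts - window_s:
            _, old = window.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        answers.append(counts.most_common(top_n))
    return answers

events = [(0, "#spark"), (1, "#spark"), (2, "#flink"), (301, "#spark")]
print(top_hashtags(events)[-1])  # the two oldest "#spark" events have expired
```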

It takes under 1.4 sec (end-to-end delay) to answer most queries, but there are a
few instances in between when it takes considerably more time (around 15 sec),
which also increases the response time of subsequent queries. I am not able to
find the reason for these spikes. The data rate is nearly constant, so the
spikes are not due to a sudden increase in the data rate.

Also, is there any way to put a bound on the time taken by a particular query?
For example, if a query takes more than, say, 2 sec, it should be killed and the
job should move on to the next query, so that one slow query does not affect
future queries.
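As far as I know, Spark Streaming has no supported way to kill a batch that overruns a deadline, so any bound would have to be imposed outside the framework. A plain-Python sketch of the idea (the helper `run_with_deadline` is hypothetical, not a Spark API, and the abandoned work keeps running in the background):

```python
import concurrent.futures

def run_with_deadline(query, timeout_s=2.0):
    """Stop waiting on `query` after `timeout_s` seconds.

    Illustration only: this abandons a slow query rather than killing it,
    so the underlying work still runs to completion in its worker thread.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(query)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None            # give up and move on to the next query
    finally:
        pool.shutdown(wait=False)

print(run_with_deadline(lambda: "top-10 result"))  # fast query returns normally
```

Because the slow query is only abandoned, not cancelled, the real fix in a streaming job is still to make the batch cheaper or the interval longer.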

Thanks,
Nilesh



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Problem in Spark Streaming

Posted by nilmish <ni...@gmail.com>.
You can measure the latency from the logs. Search for the phrase "Total delay"
in the logs; it denotes the total end-to-end delay for a particular query.
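A rough way to pull those numbers out of the logs and flag the slow batches (the line format is taken from this thread; the regex and the `spikes` helper are my own):

```python
import re

# Extract the end-to-end delay per batch from Spark Streaming log lines like:
#   Total delay: 1.350 s for time 1402409329000 ms (execution: 1.100 s)
DELAY_RE = re.compile(
    r"Total delay: ([\d.]+) s for time (\d+) ms \(execution: ([\d.]+)\s*s\)")

def spikes(log_lines, threshold_s=2.0):
    """Return (batch_time_ms, total_delay_s) for every batch over the bound."""
    found = []
    for line in log_lines:
        m = DELAY_RE.search(line)
        if m and float(m.group(1)) > threshold_s:
            found.append((int(m.group(2)), float(m.group(1))))
    return found

logs = [
    "Total delay: 1.350 s for time 1402409329000 ms (execution: 1.100 s)",
    "Total delay: 15.020 s for time 1402409331000 ms (execution: 1.711 s)",
]
print(spikes(logs))  # only the 15-second batch is reported
```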



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7312.html

Re: Problem in Spark Streaming

Posted by Yingjun Wu <wu...@gmail.com>.
Hi Nilmish,

I am facing the same problem. I am wondering: how do you measure the latency?

Regards,
Yingjun



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7311.html

Re: Problem in Spark Streaming

Posted by vinay Bajaj <vb...@gmail.com>.
http://stackoverflow.com/questions/895444/java-garbage-collection-log-messages

http://stackoverflow.com/questions/16794783/how-to-read-a-verbosegc-output

I think these will help in understanding the logs.
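For the ParNew lines pasted in this thread, the fields are: young-generation occupancy before->after (young capacity), whole-heap occupancy before->after (heap capacity), and the pause times, where `real` is the wall-clock time the application was actually stopped. A small parser tuned to exactly these lines (the regex is illustrative, not a general GC-log parser):

```python
import re

# Fields in a ParNew line such as:
#   4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs]
#     274752K->27199K(995776K), 0.0422720 secs] [Times: ... real=0.04 secs]
# first triple:  young generation used before->after (young capacity)
# second triple: whole heap used before->after (heap capacity)
GC_RE = re.compile(
    r"(?P<ts>[\d.]+): \[GC [\d.]+: \[ParNew: "
    r"(?P<young_before>\d+)K->(?P<young_after>\d+)K\((?P<young_cap>\d+)K\).*?\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\((?P<heap_cap>\d+)K\).*?"
    r"real=(?P<real>[\d.]+) secs")

def parse_gc(text):
    """Return one dict of numeric fields per ParNew collection found."""
    return [{k: float(v) for k, v in m.groupdict().items()}
            for m in GC_RE.finditer(text)]

sample = ("4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs] "
          "274752K->27199K(995776K), 0.0422720 secs] "
          "[Times: user=0.33 sys=0.11, real=0.04 secs]")
gc = parse_gc(sample)[0]
print("pause %.2fs, heap %dK -> %dK"
      % (gc["real"], gc["heap_before"], gc["heap_after"]))
```

Pauses of 0.02-0.07 s like these are normal minor collections; a latency spike of 15 s would need a much longer (e.g. full GC) pause to explain it.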


On Wed, Jun 11, 2014 at 12:53 PM, nilmish <ni...@gmail.com> wrote:

>
> I used these JVM flags to show the GC timings: -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>
> [GC log output snipped; it is quoted in full in the original message below]
>
> Can anyone help me understand these messages related to GC?

Re: Problem in Spark Streaming

Posted by nilmish <ni...@gmail.com>.
I used these JVM flags to show the GC timings: -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Following is the output I got on the standard output :

4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs]
274752K->27199K(995776K), 0.0422720 secs] [Times: user=0.33 sys=0.11,
real=0.04 secs] 

16.630: [GC 16.630: [ParNew: 301951K->17854K(309056K), 0.0686940 secs]
301951K->23624K(995776K), 0.0689110 secs] [Times: user=0.36 sys=0.05,
real=0.07 secs]
 
32.440: [GC 32.441: [ParNew: 292606K->14985K(309056K), 0.0206040 secs]
298376K->20755K(995776K), 0.0208320 secs] [Times: user=0.20 sys=0.00,
real=0.02 secs]
 
42.626: [GC 42.626: [ParNew: 289737K->15467K(309056K), 0.0138100 secs]
295507K->21237K(995776K), 0.0139830 secs] [Times: user=0.10 sys=0.00,
real=0.01 secs]
 
56.633: [GC 56.633: [ParNew: 290219K->17334K(309056K), 0.0170930 secs]
295989K->23105K(995776K), 0.0173130 secs] [Times: user=0.12 sys=0.01,
real=0.02 secs] 

Can anyone help me understand these messages related to GC?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7384.html

Re: Problem in Spark Streaming

Posted by Ashish Rangole <ar...@gmail.com>.
Have you considered garbage collection impact and whether it coincides with
your latency spikes? You can enable GC logging by changing the Spark
configuration for your job.
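For example (assuming Spark 1.0-era configuration; check the configuration docs for your version), executor GC logging can be enabled in spark-defaults.conf via `spark.executor.extraJavaOptions`:

```
# spark-defaults.conf -- pass GC-logging flags to every executor JVM
spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
```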

Re: Problem in Spark Streaming

Posted by Yingjun Wu <wu...@gmail.com>.
Hi, when I search for the keyword "Total delay" in the console log, the delay
keeps increasing. I am not sure what this "total delay" means. For example, if
I perform a windowed wordcount with windowSize=10000ms and slidingStep=2000ms,
is the delay measured from the 10th second?

A sample log is shown as follows:
Total delay: 136.983 s for time 1402409331000 ms (execution: 1.711s) <-- what
is the execution time?
Finished TID 490 in 14 ms on .... (progress: 1/6) <-- what is TID? And what
is the progress?
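For context, a batch's total delay is its scheduling delay (time spent waiting for earlier batches to finish) plus its execution time; when execution regularly exceeds the batch interval, the backlog, and hence the total delay, grows without bound. A plain-Python simulation of that effect (the naming is mine):

```python
def total_delays(exec_times_s, batch_interval_s=2.0):
    """Simulate how per-batch total delay evolves.

    total delay = scheduling delay (waiting behind earlier batches)
                + execution time.
    Batch i becomes ready at i * batch_interval_s but can only start once
    the previous batch has finished (one batch runs at a time).
    """
    delays, free_at = [], 0.0
    for i, exec_s in enumerate(exec_times_s):
        ready = i * batch_interval_s
        start = max(ready, free_at)    # wait until the executor is free
        finish = start + exec_s
        delays.append(finish - ready)  # end-to-end ("Total") delay
        free_at = finish
    return delays

# Execution ~1.7 s per batch on a 2 s interval: delay stays near 1.7 s.
print(total_delays([1.7] * 5))
# Execution 3 s per batch on a 2 s interval: the backlog grows every batch.
print(total_delays([3.0] * 5))
```

This matches the log line quoted above: with execution around 1.7 s but total delay at 137 s, almost all of the delay is batches queueing behind one another.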



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7329.html

Re: Problem in Spark Streaming

Posted by Boduo Li <on...@gmail.com>.
Oh, I mean the average data rate/node.

But when I want to know the input activity on each node (I use a custom
receiver instead of Kafka), I usually search for these records in the logs to
get a sense: "BlockManagerInfo: Added input ... on [hostname:port] (size: xxx KB)"
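Aggregating those records per host gives a rough per-node input volume. A sketch (the sample lines below are hypothetical instances of the format above, and the helper is mine):

```python
import re
from collections import defaultdict

# Sum received-block sizes per node from lines of the form:
#   BlockManagerInfo: Added input... on [hostname:port] (size: xxx KB)
BLOCK_RE = re.compile(r"Added input\S* .*? on (\S+?):\d+ \(size: ([\d.]+) KB\)")

def kb_per_node(log_lines):
    """Total KB of received input blocks, keyed by hostname."""
    totals = defaultdict(float)
    for line in log_lines:
        m = BLOCK_RE.search(line)
        if m:
            totals[m.group(1)] += float(m.group(2))
    return dict(totals)

logs = [
    "BlockManagerInfo: Added input-0-1 in memory on host1:45123 (size: 312.5 KB)",
    "BlockManagerInfo: Added input-0-2 in memory on host2:45123 (size: 150.0 KB)",
    "BlockManagerInfo: Added input-0-3 in memory on host1:45123 (size: 100.0 KB)",
]
print(kb_per_node(logs))
```

Dividing each host's total by the time span of the log gives an approximate per-node data rate.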

I also see spikes in latency, as I posted earlier:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-achieve-reasonable-performance-on-Spark-Streaming-tp7262.html
In my case it is even worse: the spikes cause the latency to increase without
bound when the data rate is a little high, even though the machines are
underutilized. I can't explain it either, and I'm not sure if the cause is the
same as yours.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7327.html

Re: Problem in Spark Streaming

Posted by nilmish <ni...@gmail.com>.
How can I measure the data rate per node?

I am feeding the data through the Kafka API. I only know the total inflow data
rate, which remains almost constant. How can I figure out how much of the data
is distributed to each node in my cluster?

The latency does not keep increasing indefinitely. It goes up for some
instants and then drops back down to the normal level. I want to get rid of
these spikes.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7325.html

Re: Problem in Spark Streaming

Posted by Boduo Li <on...@gmail.com>.
Hi Nilmish,

What's the data rate/node when you see the high latency? (It seems the
latency keeps increasing.) Do you still see it if you lower the data rate or
the frequency of the windowed query?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7321.html