Posted to user@spark.apache.org by Chan Chor Pang <ch...@indetail.co.jp> on 2016/08/05 01:51:24 UTC

Spark 1.6 Streaming delay after long run

After upgrading from Spark 1.5 to 1.6 (CDH 5.6.0 -> 5.7.1),
some of our streaming jobs fall behind after running for a long time.

After a little investigation, here is what I found:
     - the same program has no problem with Spark 1.5
     - we have two kinds of streaming jobs, and only those using
"updateStateByKey" are affected
     - CPU usage gets higher and higher over time (one core at 5% at
start, one core at 100% after a week)
     - the data rate is around 100 events/s, so there is no reason for
the CPU to work that hard
     - processing time per batch grows from 100 ms at start to 3 s after
a week
     - even when running the same program (on different input data), not
all processes slow down to the same degree
     - there are no warning or error messages until the delay grows too
large and the job runs out of memory
     - the processing time of our custom code seems to have no problem
     - memory/heap usage looks normal to me

I suspect the problem comes from updateStateByKey, but I can't
track it down.
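
For reference, here is a minimal sketch of the updateStateByKey pattern in question. It is not the job described above: the running-count logic and the names update_count and run_batch are illustrative assumptions. The local simulation only mirrors the documented behaviour that the update function runs for every key that already has state, whether or not the batch contains new data for it, so the number of calls per batch follows the number of retained keys rather than the input rate. Whether that explains the slowdown reported here is not established.

```python
from typing import Dict, List, Optional


def update_count(new_values: List[int], last_state: Optional[int]) -> Optional[int]:
    """Running count per key, in the (new_values, last_state) shape that
    PySpark's DStream.updateStateByKey expects. Returning None would drop
    the key from the state; this sketch never does."""
    return (last_state or 0) + sum(new_values)


def run_batch(state: Dict[str, int], batch: Dict[str, List[int]]) -> int:
    """Simulate one batch locally (hypothetical helper, no Spark involved).

    The update function is applied to every key that already has state,
    plus every key that is new in this batch. Returns the number of calls.
    """
    keys = set(state) | set(batch)
    for key in keys:
        new_state = update_count(batch.get(key, []), state.get(key))
        if new_state is None:
            state.pop(key, None)  # None means: forget this key
        else:
            state[key] = new_state
    return len(keys)


state: Dict[str, int] = {}
calls_first = run_batch(state, {"a": [1, 1], "b": [1]})  # two new keys
calls_second = run_batch(state, {"c": [1]})              # one new key, two old ones
```

In a real PySpark job the same function would be passed as pairs.updateStateByKey(update_count) on a DStream of (key, value) pairs, with checkpointing enabled on the StreamingContext.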

Has anyone experienced the same problem?


--
BR
Peter Chan
