Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/20 04:36:30 UTC

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30427: [SPARK-33224][SS] Add watermark gap information into SS UI page

HeartSaVioR edited a comment on pull request #30427:
URL: https://github.com/apache/spark/pull/30427#issuecomment-730844687


   The `complicated case` in the manual test demonstrates the "event time processing" use case. Please take a look at the code to see how I randomize the event timestamp in the input rows.
   
   Technically, the graph is almost meaningless with processing time, because the event timestamp would be nearly the same as the batch timestamp. Even if the query is lagging, once the next batch is launched, the event timestamps of its inputs will match the batch timestamp.
   
   The graph is helpful when users rely on either "ingest time" (timestamped not by Spark, but when the data is ingested into the input storage), which can show the processing lag, or "event time", which is the best case for showing the gap.
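
To make the difference concrete, here is a minimal sketch in plain Python (not Spark code). It assumes the gap is the batch timestamp minus the watermark, and that the watermark is the maximum event time seen so far minus the delay threshold. The function name `watermark_gaps`, the batch interval, and the lag range are all hypothetical, and the model is simplified: it applies the updated watermark within the same batch.

```python
import random


def watermark_gaps(batches, delay_ms):
    """Return the (batch timestamp - watermark) gap for each batch.

    `batches` is a list of (batch_ts_ms, [event_ts_ms, ...]) pairs.
    Simplified model: watermark = max event time seen so far - delay_ms.
    """
    max_event_ts = None
    gaps = []
    for batch_ts, event_timestamps in batches:
        if event_timestamps:
            newest = max(event_timestamps)
            max_event_ts = newest if max_event_ts is None else max(max_event_ts, newest)
        if max_event_ts is None:
            gaps.append(None)  # no watermark yet, so no gap to report
        else:
            gaps.append(batch_ts - (max_event_ts - delay_ms))
    return gaps


rng = random.Random(42)
delay_ms = 5_000
batch_times = [100_000 + 10_000 * i for i in range(5)]

# Processing time: every row is stamped with the batch timestamp itself.
processing_time = [(ts, [ts]) for ts in batch_times]

# Event time: rows arrive with a random lag of up to 60 seconds (hypothetical).
event_time = [
    (ts, [ts - rng.randint(0, 60_000) for _ in range(3)]) for ts in batch_times
]

processing_gaps = watermark_gaps(processing_time, delay_ms)
event_gaps = watermark_gaps(event_time, delay_ms)

print("processing time gaps:", processing_gaps)
print("event time gaps:     ", event_gaps)
```

Under these assumptions the processing-time gaps stay flat at the delay threshold, so the graph carries no information, while the event-time gaps move with how late the input data is.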
   
   ![Figure_05_-_Event_Time_vs_Processing_Time](https://user-images.githubusercontent.com/1317309/99758506-3a852f00-2b35-11eb-9f40-5f7c5aba7ec2.png)
   
   The graph is borrowed from the excellent articles below. If you haven't read them, I strongly recommend doing so, or reading the book "Streaming Systems".
   
   https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/
   https://www.oreilly.com/radar/the-world-beyond-batch-streaming-102/

