Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/05 04:45:59 UTC

[GitHub] [spark] HeartSaVioR commented on pull request #31700: [SPARK-34183][SS] DataSource V2: Support required distribution and ordering in SS

HeartSaVioR commented on pull request #31700:
URL: https://github.com/apache/spark/pull/31700#issuecomment-791149996


   Actually, that's one of the few advantages of micro-batch over record-to-record processing, and we already leverage it in some public APIs (e.g. flatMapGroupsWithState, which "sorts" the inputs so that values from the same group can be served sequentially).
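
To illustrate the "sort, then serve each group sequentially" idea, here is a minimal plain-Python sketch. It is not Spark's implementation and does not use the Spark API; `process_micro_batch`, `running_sum`, and the `(key, value)` record shape are hypothetical names chosen only for this example.

```python
from itertools import groupby
from operator import itemgetter


def process_micro_batch(records, state, update):
    """Sort one micro-batch by key, then hand each group's values to `update`.

    Because the whole batch is available up front, sorting makes all values
    of a key adjacent, so each group is visited exactly once per batch.
    """
    # sorted() is stable, so arrival order is preserved within a key.
    ordered = sorted(records, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        values = [value for _, value in group]
        state[key] = update(key, values, state.get(key))
    return state


def running_sum(key, values, previous):
    # Hypothetical per-group state update: keep a running total per key.
    return (previous or 0) + sum(values)


state = {}
process_micro_batch([("b", 1), ("a", 2), ("b", 3)], state, running_sum)
process_micro_batch([("a", 5), ("c", 7)], state, running_sum)
```

The sort is only possible because a micro-batch is a bounded set of records; a record-at-a-time engine never has the whole set in hand.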
   
   That said, I'm supportive of the concept of ordering, but only for micro-batch. Dealing with sort in continuous mode is quite tricky: despite the record-to-record nature of the processing, sort requires buffering inputs into state or somewhere in memory until the epoch has finished (though we can keep the state or buffer sorted as inputs arrive), and downstream operations can only continue their work after that point, which contradicts the fact that the epoch has finished.
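
The buffering problem can be sketched in plain Python as follows. This is only an illustration under assumed names (`EpochSortBuffer`, `on_record`, `on_epoch_end` are hypothetical and not part of Spark's continuous processing API): a record-at-a-time operator that must emit sorted output cannot release anything until it is told the epoch is over.

```python
import bisect


class EpochSortBuffer:
    """Hypothetical record-at-a-time operator that must emit sorted output."""

    def __init__(self):
        self._buffer = []

    def on_record(self, record):
        # The buffer is kept sorted as records arrive, but nothing can be
        # emitted yet: a smaller record may still show up in this epoch.
        bisect.insort(self._buffer, record)
        return []

    def on_epoch_end(self):
        # Only now is the ordering final, so downstream sees the records
        # of this epoch only after the epoch has already ended.
        out, self._buffer = self._buffer, []
        return out


buf = EpochSortBuffer()
emitted_early = []
for record in [3, 1, 2]:
    emitted_early.extend(buf.on_record(record))
emitted_at_epoch_end = buf.on_epoch_end()
```

In this sketch `emitted_early` stays empty and all three records come out together from `on_epoch_end`, so the operator effectively behaves like a batch stage inside a pipeline that is supposed to process record by record.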
   
   My 2 cents on continuous mode is that we'd be better off admitting the architectural differences between the batch-oriented and streaming-oriented approaches, and trying to find a safe way to isolate the two. Integrating the two naturally sounds very hard to achieve, and it has even been a roadblock to improving functionality in micro-batch mode as well.




