Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/22 00:36:01 UTC

[GitHub] arunmahadevan commented on issue #23859: [SPARK-26956][SQL] remove streaming output mode from data source v2 APIs

URL: https://github.com/apache/spark/pull/23859#issuecomment-466226075

Is there a design doc describing what we plan to support in the new model? If we are redesigning, it might be worth re-evaluating the need for these different modes before proceeding with the implementation.

For instance, streaming sinks cannot correctly handle updates without support for retractions.

I am also not sure how truncate is going to be used. Do we expect the sink to completely truncate its output and refill it with the new values for each micro batch? I am not sure how that is very useful, other than maybe in the console sink.
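
To make the question concrete, here is a rough sketch of the behaviour I am asking about, assuming truncate means "drop everything and rewrite the full result on every micro batch". `TruncatingSink` and `write_batch` are hypothetical names for illustration only, not Spark APIs.

```python
class TruncatingSink:
    """Hypothetical sink that replaces its whole output on every micro batch."""

    def __init__(self):
        self.rows = []

    def write_batch(self, batch_id, rows):
        # Truncate: everything written by earlier batches is discarded,
        # then the complete result for this batch is written.
        self.rows = list(rows)


sink = TruncatingSink()
sink.write_batch(0, [("a", 1), ("b", 1)])
sink.write_batch(1, [("a", 2), ("b", 1), ("c", 1)])
print(sink.rows)
```

Under this reading every batch has to carry the complete result, which is why it seems to fit a console-style sink better than a large external store.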

I think ideally what we want is for an operator to emit a stream of values (and retractions) to the sink, so that the sink can correctly process and update the results. So maybe the different modes are not needed, other than for backward compatibility.
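
A minimal sketch of the kind of model I have in mind, assuming the operator emits `("+", row)` for a new value and `("-", row)` for a retraction of a previously emitted one. `RetractingSink`, `process` and `snapshot` are hypothetical names, not part of any existing Spark API.

```python
from collections import Counter


class RetractingSink:
    """Hypothetical sink that applies a changelog of values and retractions."""

    def __init__(self):
        # Multiset of rows currently visible in the output.
        self.counts = Counter()

    def process(self, changes):
        for op, row in changes:
            if op == "+":
                self.counts[row] += 1
            elif op == "-":
                if self.counts[row] <= 0:
                    raise ValueError(f"retraction for unknown row: {row!r}")
                self.counts[row] -= 1
                if self.counts[row] == 0:
                    del self.counts[row]
            else:
                raise ValueError(f"unknown change type: {op!r}")

    def snapshot(self):
        # Current output, sorted for a stable view.
        return sorted(self.counts.elements())


sink = RetractingSink()
# Batch 0: a running count emits its first result.
sink.process([("+", ("a", 1))])
# Batch 1: the count for "a" changes, so the old value is retracted first.
sink.process([("-", ("a", 1)), ("+", ("a", 2)), ("+", ("b", 1))])
print(sink.snapshot())
```

With only the new values and no retraction, an append-style sink would end up holding both `("a", 1)` and `("a", 2)`, which is the incorrect update handling mentioned above.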
