Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/17 04:12:31 UTC

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24631: [MINOR][CORE][DOC]Avoid hardcoded configs and fix kafka sink write semantics in document

URL: https://github.com/apache/spark/pull/24631#discussion_r284972996
 
 

 ##########
 File path: docs/structured-streaming-kafka-integration.md
 ##########
 @@ -441,7 +441,7 @@ Apache Kafka only supports at least once write semantics. Consequently, when wri
 or Batch Queries---to Kafka, some records may be duplicated; this can happen, for example, if Kafka needs
 to retry a message that was not acknowledged by a Broker, even though that Broker received and wrote the message record.
 Structured Streaming cannot prevent such duplicates from occurring due to these Kafka write semantics. However,
-if writing the query is successful, then you can assume that the query output was written at least once. A possible
+if writing the query is successful, then you can assume that the query output was written exactly once. A possible
 
 Review comment:
  Hi, @wenxuanguan. This looks wrong in this context. Could you explain your reasoning?
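
The quoted documentation describes why Kafka's write path gives at-least-once rather than exactly-once guarantees: if a broker persists a record but the acknowledgment is lost, the producer retries and the record is written twice. A minimal sketch of that failure mode, in plain Python rather than real Kafka/Spark code (the function and scenario here are hypothetical, purely for illustration):

```python
# Hypothetical sketch (not Kafka or Spark API code): simulates why a sink
# that retries unacknowledged writes yields at-least-once semantics.
# The broker persists every send, but the ack back to the producer may be lost.

def send_with_retry(broker_log, record, ack_lost):
    """Append a record to the broker log; retry once if the ack is lost."""
    broker_log.append(record)      # broker receives and writes the record
    if ack_lost:
        # The producer never sees the ack, so it resends the same record;
        # the broker writes it again, producing a duplicate.
        broker_log.append(record)
    return broker_log

log = []
send_with_retry(log, "event-1", ack_lost=False)
send_with_retry(log, "event-2", ack_lost=True)   # ack dropped once

# "event-2" now appears twice in the log: every record was written at
# least once, but not exactly once.
print(log)
```

This is why the diff's change from "at least once" to "exactly once" reads as incorrect: a successful query only guarantees the output reached Kafka at least once, and downstream readers must deduplicate if they need exactly-once results.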

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org