Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/03/10 05:34:35 UTC

[GitHub] [spark] hehuiyuan commented on a change in pull request #23999: [docs]Add additional explanation for "Setting the max receiving rate" in streaming-programming-guide.md

hehuiyuan commented on a change in pull request #23999: [docs]Add additional explanation for "Setting the max receiving rate" in streaming-programming-guide.md
URL: https://github.com/apache/spark/pull/23999#discussion_r264024237
 
 

 ##########
 File path: docs/streaming-programming-guide.md
 ##########
 @@ -2036,7 +2036,7 @@ To run a Spark Streaming applications, you need to have the following.
   `spark.streaming.receiver.maxRate` for receivers and `spark.streaming.kafka.maxRatePerPartition`
   for Direct Kafka approach. In Spark 1.5, we have introduced a feature called *backpressure* that
   eliminate the need to set this rate limit, as Spark Streaming automatically figures out the
-  rate limits and dynamically adjusts them if the processing conditions change. This backpressure
+  rate limits and dynamically adjusts them if the processing conditions change.If the first batch of data is very large which causes the first batch is processing all the time and the task can not work normally , using a maximum rate limit can solve the problem .This backpressure
 
 Review comment:
   First of all, thank you for your reply.
   
   The original document implies that once backpressure is enabled, there is no need to set this rate limit. However, in real usage scenarios, such as Spark Streaming consuming from Kafka, the first batch of data is often very large. As a result, the first batch keeps processing for a long time, which affects the normal operation of the job. Even if the first batch eventually finishes, it takes much longer than the batch interval, and this is less efficient than capping the first batch so that it completes within the batch interval and then continuing with the subsequent batches.
   
   In short, my point is that the statement "enabling backpressure removes the need to set a rate limit" is not rigorous.
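
As a rough illustration of the point above, the sketch below shows how an explicit cap bounds the size of a batch for the direct Kafka approach, including the first one. The partition count, batch interval, and rate are made-up numbers for illustration only, and `max_records_per_batch` is a hypothetical helper, not part of Spark; only the two configuration keys come from the Spark Streaming documentation.

```python
# Hypothetical helper and numbers, for illustration only.
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_sec):
    """Upper bound on the records in one batch when
    spark.streaming.kafka.maxRatePerPartition (records/sec per partition) is set."""
    return max_rate_per_partition * num_partitions * batch_interval_sec


# Settings that combine backpressure with an explicit maximum rate.
conf = {
    "spark.streaming.backpressure.enabled": "true",
    "spark.streaming.kafka.maxRatePerPartition": "1000",  # assumed value
}

cap = max_records_per_batch(
    int(conf["spark.streaming.kafka.maxRatePerPartition"]),
    num_partitions=10,      # assumed topic layout
    batch_interval_sec=5,   # assumed batch interval
)
print(cap)  # 50000
```

With these assumed values, no batch can hold more than 50000 records no matter how large the Kafka backlog is at startup. Settings like these would typically be passed with `--conf` on `spark-submit`.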
   
