Posted to commits@spark.apache.org by do...@apache.org on 2019/07/21 20:24:59 UTC

[spark] branch branch-2.4 updated: [SPARK-28464][DOC][SS] Document Kafka source minPartitions option

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new a7e2de8  [SPARK-28464][DOC][SS] Document Kafka source minPartitions option
a7e2de8 is described below

commit a7e2de86e0f133a2c76b221baefc1a590b2b1a39
Author: Arun Pandian <ap...@groupon.com>
AuthorDate: Sun Jul 21 13:07:22 2019 -0700

    [SPARK-28464][DOC][SS] Document Kafka source minPartitions option
    
    Add documentation for the Kafka source minPartitions option to the "Structured Streaming + Kafka Integration Guide".
    
    The text is based on the content in https://docs.databricks.com/spark/latest/structured-streaming/kafka.html#configuration
    
    Closes #25219 from arunpandianp/SPARK-28464.
    
    Authored-by: Arun Pandian <ap...@groupon.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
    (cherry picked from commit a0a58cf2effc4f4fb17ef3b1ca3def2d4022c970)
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 docs/structured-streaming-kafka-integration.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/structured-streaming-kafka-integration.md b/docs/structured-streaming-kafka-integration.md
index 71fd5b1..680fe78 100644
--- a/docs/structured-streaming-kafka-integration.md
+++ b/docs/structured-streaming-kafka-integration.md
@@ -374,6 +374,16 @@ The following configurations are optional:
   <td>streaming and batch</td>
   <td>Rate limit on maximum number of offsets processed per trigger interval. The specified total number of offsets will be proportionally split across topicPartitions of different volume.</td>
 </tr>
+<tr>
+  <td>minPartitions</td>
+  <td>int</td>
+  <td>none</td>
+  <td>streaming and batch</td>
+  <td>Minimum number of partitions to read from Kafka.
+  By default, Spark has a 1-1 mapping of topicPartitions to Spark partitions consuming from Kafka.
+  If you set this option to a value greater than the number of topicPartitions, Spark will divvy
+  up large Kafka partitions into smaller pieces.</td>
+</tr>
 </table>
 
 ## Writing Data to Kafka
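For intuition about the behavior the new table row describes, here is a small illustrative sketch in plain Python. This is not Spark code and not Spark's actual implementation; the function `split_offset_ranges` and its signature are invented for illustration. It shows the general idea of splitting per-topic-partition offset ranges proportionally to their size until a minimum partition count is reached:

```python
# Illustrative sketch only: NOT Spark's implementation. Shows how a
# minPartitions-style setting could split large Kafka offset ranges
# into more, smaller pieces, proportionally to each range's volume.

def split_offset_ranges(ranges, min_partitions):
    """ranges: list of (topic_partition, start_offset, end_offset).

    With the default 1-1 mapping, each topicPartition becomes one Spark
    partition. If min_partitions exceeds the number of ranges, split the
    larger ranges into pieces proportional to their share of all offsets.
    """
    if len(ranges) >= min_partitions or min_partitions <= 0:
        return list(ranges)  # 1-1 mapping already satisfies the minimum
    total = sum(end - start for _, start, end in ranges)
    if total == 0:
        return list(ranges)
    out = []
    for tp, start, end in ranges:
        size = end - start
        # pieces proportional to this range's share of total offsets
        pieces = max(1, round(min_partitions * size / total))
        pieces = min(pieces, size) if size > 0 else 1
        base, rem = divmod(size, pieces)
        cursor = start
        for i in range(pieces):
            width = base + (1 if i < rem else 0)
            out.append((tp, cursor, cursor + width))
            cursor += width
    return out
```

For example, with two topicPartitions holding 100 and 300 pending offsets and a minimum of 4 partitions, the sketch leaves the small range intact and splits the large one into three equal pieces, yielding 4 partitions in total.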


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org