Posted to commits@kafka.apache.org by bb...@apache.org on 2019/12/16 16:30:22 UTC
[kafka] branch 2.1 updated: port paragrpah from CP docs (#7808)

This is an automated email from the ASF dual-hosted git repository.

bbejeck pushed a commit to branch 2.1
in repository https://gitbox.apache.org/repos/asf/kafka.git


The following commit(s) were added to refs/heads/2.1 by this push:
     new e82a67b  port paragrpah from CP docs (#7808)
e82a67b is described below

commit e82a67b29adbff67fdc8052aa9e4396e0364ccd3
Author: A. Sophie Blee-Goldman <so...@confluent.io>
AuthorDate: Mon Dec 9 13:35:17 2019 -0800

    port paragrpah from CP docs (#7808)
    
    The AK Streams architecture docs should explain how the maximum parallelism is determined
    Reviewers: Bill Bejeck <bb...@gmail.com>
---
 docs/streams/architecture.html | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/streams/architecture.html b/docs/streams/architecture.html
index 8bc3156..7efd7ea 100644
--- a/docs/streams/architecture.html
+++ b/docs/streams/architecture.html
@@ -66,6 +66,14 @@
     </p>
 
     <p>
+        Slightly simplified, the maximum parallelism at which your application may run is bounded by the maximum number of stream tasks, which itself is determined by
+        the maximum number of partitions of the input topic(s) the application is reading from. For example, if your input topic has 5 partitions, then you can run up to 5
+        application instances. These instances will collaboratively process the topic’s data. If you run a larger number of app instances than partitions of the input
+        topic, the “excess” app instances will launch but remain idle; however, if one of the busy instances goes down, one of the idle instances will take over the former’s
+        work.
+    </p>
+
+    <p>
         It is important to understand that Kafka Streams is not a resource manager, but a library that "runs" anywhere its stream processing application runs.
         Multiple instances of the application are executed either on the same machine, or spread across multiple machines and tasks can be distributed automatically
         by the library to those running application instances. The assignment of partitions to tasks never changes; if an application instance fails, all its assigned