You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by pravin kumar <pk...@gmail.com> on 2018/03/01 07:28:59 UTC

Tasks across MultipleJVM

I have just did wikifeed example and given the output of wikifeed example
to another topologyProcessor to find the even numbers.while testing for
multiple Consumers in three JVM, the output topic is revoked and rebalanced
across three JVMs. i have got 10 tasks (max no of partitions).

i have three topics:

wikifeedInputtopic1 - 10 partitions
wikifeedOutputtopic1 - 10 partitions
sumoutputeventopicC1 - 5 partitions

I have tried to Spread the task across multiple JVM,

in JVM1:First i have this much partitions
[0_0, 0_1, 1_0, 0_2, 1_1, 0_3, 1_2, 0_4, 1_3, 1_4, 0_5, 1_5, 0_6, 1_6, 0_7,
1_7, 0_8, 1_8, 0_9, 1_9]

then i started with second JVM i have got
JVM1:
current active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 0_3, 1_3, 0_4, 1_4]
    current standby tasks: []
    previous active tasks: [0_0, 0_1, 1_0, 0_2, 1_1, 0_3, 1_2, 0_4, 1_3,
1_4, 0_5, 1_5, 0_6, 1_6, 0_7, 1_7, 0_8, 1_8, 0_9, 1_9]
 (org.apache.kafka.streams.processor.internals.StreamThread)

JVM2:
current active tasks: [0_5, 1_5, 0_6, 1_6, 0_7, 1_7, 0_8, 1_8, 0_9, 1_9]
    current standby tasks: []
    previous active tasks: []

while i started third JVM :
JVM1:
current active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 1_9]
    current standby tasks: []
    previous active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 0_3, 1_3, 0_4,
1_4]

JVM2:
    current active tasks: [0_5, 1_5, 0_6, 1_6, 0_7, 0_8, 1_8]
    current standby tasks: []
    previous active tasks: [0_5, 0_6, 1_5, 0_7, 1_6, 0_8, 1_7, 0_9, 1_8,
1_9]

JVM3:

current active tasks: [0_3, 1_3, 0_4, 1_4, 1_7, 0_9]
    current standby tasks: []
    previous active tasks: []

i have aslo updated the statedirectory while starting three JVMs
but i have not got the latest task list in statedirectory:

FirstJVM stateDirectory::
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_0  0_1  0_2  0_3  0_4  0_5  0_6  0_7  0_8  0_9  1_0  1_1  1_2  1_3  1_4
1_5  1_6  1_7  1_8  1_9

SecondJVM stateDirectory:
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_5  0_6  0_7  0_8  0_9  1_5  1_6  1_7  1_8  1_9

ThirdJVM stateDirectory:
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_3  0_4  0_9  1_3  1_4  1_7

Doubts:
1.while runnning in multiple JVM with multiple Consumers ,the task also
gets spread across multiple JVM??

2.why  third stateDirectory has the lastest task list,others dont have the
latest taskList ???


i have attached the my codes below: