Posted to user@flink.apache.org by shkob1 <sh...@gmail.com> on 2018/11/12 18:28:23 UTC

Kinesis Shards and Parallelism

The Kinesis connector docs say:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html
"When the number of shards is larger than the parallelism of the consumer,
then each consumer subtask can subscribe to multiple shards; otherwise if
the number of shards is smaller than the parallelism of the consumer, then
some consumer subtasks will simply be idle and wait until it gets assigned
new shards"

I have the *same number of shards as the configured parallelism*. It seems,
though, that one task is grabbing multiple shards while others are idle. Is
that expected behavior?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Kinesis Shards and Parallelism

Posted by shkob1 <sh...@gmail.com>.
Actually, I was looking at this at the task manager level. I had more slots
than shards, so it makes sense that one task manager was idle while the
other task managers split the shards between their slots.

Thanks!




Re: Kinesis Shards and Parallelism

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi,

One detail that is not obvious from that description: the assignment is
only evenly distributed if the open Kinesis shards have consecutive shard
ids and belong to the same Kinesis stream.
Once you reshard a Kinesis stream, the shard ids may no longer be
consecutive.
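
To illustrate why gaps in the shard ids matter, here is a minimal sketch. It
assumes a simplified model in which a shard goes to the subtask given by its
numeric shard id modulo the parallelism; this is not the connector's actual
code, and the class name, method name, and the post-reshard ids are made up
for the example.

```java
import java.util.Arrays;

// Simplified model, NOT the connector's implementation: assume each shard is
// mapped to a subtask by its numeric shard id modulo the parallelism.
class ShardSkewSketch {

    // Counts how many shards each subtask receives under the modulo model.
    static int[] shardsPerSubtask(int[] shardNumbers, int parallelism) {
        int[] counts = new int[parallelism];
        for (int shardNumber : shardNumbers) {
            counts[Math.floorMod(shardNumber, parallelism)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int parallelism = 4;

        // Fresh stream: four open shards with consecutive ids 0..3.
        int[] consecutive = {0, 1, 2, 3};
        // Hypothetical ids after resharding: still four open shards, with gaps.
        int[] afterReshard = {4, 8, 9, 12};

        // prints [1, 1, 1, 1]
        System.out.println(Arrays.toString(shardsPerSubtask(consecutive, parallelism)));
        // prints [3, 1, 0, 0]
        System.out.println(Arrays.toString(shardsPerSubtask(afterReshard, parallelism)));
    }
}
```

In the second case the number of open shards still equals the parallelism,
yet two subtasks are idle and one reads three shards.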

To overcome the resulting skew after several reshard operations, the Flink
Kinesis Consumer accepts a custom `KinesisShardAssigner`, which lets you
decide how shards are assigned to subtasks.
Please let me know if this helps!
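
As a rough idea of what such an assigner could compute, the sketch below
picks a subtask from the shard's starting hash key instead of its shard id.
Kinesis hash keys span 0 to 2^128 - 1, so scaling the starting hash key to
the parallelism spreads evenly sized open shards across subtasks regardless
of their ids. This is a stand-alone sketch, not connector code:
`HashKeyAssignerSketch`, `assign`, and `HASH_KEY_SPACE` are hypothetical
names, and wiring the logic into a real `KinesisShardAssigner` (including how
to read the hash key range from the shard handle) should be checked against
the connector's Javadoc for your Flink version.

```java
import java.math.BigInteger;

// Stand-alone sketch of one possible assignment strategy. It does not use
// Flink classes; a real assigner would implement KinesisShardAssigner.
class HashKeyAssignerSketch {

    // Size of the Kinesis hash key space: keys range over [0, 2^128).
    static final BigInteger HASH_KEY_SPACE = BigInteger.ONE.shiftLeft(128);

    // Maps a shard's starting hash key to a subtask index in
    // [0, numParallelSubtasks) by scaling the key to the parallelism.
    static int assign(BigInteger startingHashKey, int numParallelSubtasks) {
        return startingHashKey
                .multiply(BigInteger.valueOf(numParallelSubtasks))
                .divide(HASH_KEY_SPACE)
                .intValueExact();
    }

    public static void main(String[] args) {
        int parallelism = 4;
        BigInteger quarter = BigInteger.ONE.shiftLeft(126);

        // Four open shards that split the hash key space evenly. Their shard
        // ids do not matter here, only where their key ranges start.
        for (int i = 0; i < 4; i++) {
            BigInteger start = quarter.multiply(BigInteger.valueOf(i));
            System.out.println("shard starting at quarter " + i
                    + " -> subtask " + assign(start, parallelism));
        }
    }
}
```

With four equally sized shards and a parallelism of 4, each subtask gets
exactly one shard. Shards with uneven hash key ranges would still be
distributed unevenly, so this is one option rather than a general fix.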

Cheers,
Gordon
