
Posted to user@storm.apache.org by Peter Neubauer <ne...@gmail.com> on 2020/04/07 07:07:33 UTC

Is anyone available for some in-depth storm 2.1 explaining?

Hi there,
we are trying to upgrade our Storm 1.2.3 installation (with Trident
Kafka spouts) to 2.1. In our tests with 2 topic partitions, we cannot
see a clear pattern in how the different partitions are polled:
messages are randomly left in one or both partitions without the spout
moving. Sometimes it moves after a couple of minutes, sometimes after
we send more messages, and sometimes after we remote-debug and insert
a breakpoint into the KafkaConsumer. The new per-poll record limit,
ConsumerConfig.MAX_POLL_RECORDS_CONFIG, is very welcome btw!
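
For reference, ConsumerConfig.MAX_POLL_RECORDS_CONFIG is the Kafka
consumer property "max.poll.records", which caps how many records a
single poll() call returns. A minimal sketch of setting it with plain
java.util.Properties follows. The class name, broker address and group
id are placeholders for illustration, and how the properties are handed
to the Storm spout configuration is not shown here.

```java
import java.util.Properties;

public class ConsumerProps {
    /** Builds consumer properties with a cap on records returned per poll(). */
    static Properties build(int maxPollRecords) {
        Properties props = new Properties();
        // Placeholder values (assumptions), replace with your own cluster settings.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "example-group");
        // Same key as ConsumerConfig.MAX_POLL_RECORDS_CONFIG.
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(50);
        System.out.println("max.poll.records = " + props.getProperty("max.poll.records"));
    }
}
```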

Is there anyone who can help us (even on a consulting basis) get to
the bottom of the spout fetch behavior, so we are not in the dark
when upgrading?

Just let me know, or maybe point to relevant documentation in case we
missed that - thanks a lot!

/peter

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

Mapillary - Join the greatest expedition of our time.

Re: Is anyone available for some in-depth storm 2.1 explaining?

Posted by Peter Neubauer <pe...@mapillary.com>.

Thank you Ethan for looking into this!

However, I don't think the partition assignment is really the problem
here. The issue is more about understanding in what order partitions
are drained, and why tuples might be left in the Kafka topic without
being polled into the topology.

We tried to remote debug this and see that one partition at a time is
polled, while the others are disabled. However, we cannot see why
there would be tuples left unread in the partitions.
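
To make the observation concrete, here is a toy model of "poll one
partition at a time, with a per-poll record cap". It is only an
in-memory sketch under my own assumptions (the class, method names and
sample data are hypothetical). It is not Storm's Trident spout code and
it does not explain the stall; it only shows that with a cap, a single
pass over the partitions can leave records behind until a later pass
picks them up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OnePartitionAtATime {
    /** Takes at most maxPollRecords records from one partition; all others are left untouched. */
    static List<String> pollOnly(Map<Integer, Deque<String>> backlog, int partition, int maxPollRecords) {
        List<String> batch = new ArrayList<>();
        Deque<String> queue = backlog.get(partition);
        while (batch.size() < maxPollRecords && !queue.isEmpty()) {
            batch.add(queue.poll());
        }
        return batch;
    }

    /** Counts the records still waiting across all partitions. */
    static int remaining(Map<Integer, Deque<String>> backlog) {
        int total = 0;
        for (Deque<String> queue : backlog.values()) {
            total += queue.size();
        }
        return total;
    }

    /** Two partitions with a small, made-up backlog. */
    static Map<Integer, Deque<String>> sampleBacklog() {
        Map<Integer, Deque<String>> backlog = new LinkedHashMap<>();
        backlog.put(0, new ArrayDeque<>(Arrays.asList("a0", "a1", "a2")));
        backlog.put(1, new ArrayDeque<>(Arrays.asList("b0", "b1")));
        return backlog;
    }

    public static void main(String[] args) {
        Map<Integer, Deque<String>> backlog = sampleBacklog();
        int maxPollRecords = 2;
        int round = 0;
        // Keep cycling over the partitions until every backlog is empty.
        while (remaining(backlog) > 0) {
            for (int partition : backlog.keySet()) {
                List<String> batch = pollOnly(backlog, partition, maxPollRecords);
                System.out.println("round " + round + ", partition " + partition
                        + " -> " + batch + ", left: " + remaining(backlog));
            }
            round++;
        }
    }
}
```

In this model, records are left unread only between passes; anything
that stays unread for minutes would have to come from the passes
themselves not happening.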

Any thoughts?

/peter


On Thu, Apr 9, 2020 at 6:15 AM Ethan Li <et...@gmail.com> wrote:
>
> Hi Peter,
>
> I am not very familiar with trident kafka spout. I found something https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md#manual-partition-assignment-advanced related. See if it helps.
>
> Best
> Ethan

Re: Is anyone available for some in-depth storm 2.1 explaining?

Posted by Ethan Li <et...@gmail.com>.

Hi Peter,

I am not very familiar with the Trident Kafka spout, but I found
something that may be related:
https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md#manual-partition-assignment-advanced
See if it helps.

Best
Ethan
