Posted to users@kafka.apache.org by Siva Ram <si...@gmail.com> on 2018/08/06 03:45:44 UTC

Re: Kafka stream application load balancing

Hi Guozhang,

Thanks for the suggestions below.  I believe we have gotten past the
REBALANCING issue.  However, we are now running into a significant memory
usage issue.  I will open a separate thread for that.

1) During punctuate we need to perform certain tasks, and this was
exceeding the consumer request timeout.  After setting the timeout to an
appropriate level, we no longer see the REBALANCING event.

2) The application contains multiple stores (7 stores, 2 of them being the
major ones that persist the JSON POJO), and logging has been enabled for
all of the stores.

3) The commit interval is set to 60000 ms (1 min).
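
For reference, here is a minimal sketch of how the two settings mentioned in points 1 and 3 might be expressed as Streams properties. It uses only java.util.Properties and the plain config key strings. The "consumer." prefix scopes a setting to the consumer embedded in Kafka Streams. The class name, method name, and the timeout value are assumptions for illustration; the thread does not state the timeout value that was actually chosen.

```java
import java.util.Properties;

// Hypothetical helper class; the names here are for illustration only.
class StreamsConfigSketch {

    // Placeholder value: the actual timeout used is not given in this thread.
    static final int ASSUMED_REQUEST_TIMEOUT_MS = 300_000;

    static Properties buildConfig() {
        Properties props = new Properties();
        // Point 3: commit interval of 60000 ms (1 min).
        props.setProperty("commit.interval.ms", "60000");
        // Point 1: consumer request timeout, raised so that long-running
        // punctuate work does not exceed it. The "consumer." prefix applies
        // the setting to the consumer used by the Streams application.
        props.setProperty("consumer.request.timeout.ms",
                String.valueOf(ASSUMED_REQUEST_TIMEOUT_MS));
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting key=value pairs.
        buildConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would be merged with the rest of the application's configuration (application id, bootstrap servers, and so on) before constructing the KafkaStreams instance.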


Regards,
Ashok

On Mon, Jul 30, 2018 at 12:27 PM, Guozhang Wang <wa...@gmail.com> wrote:

> Hello Siva,
>
> To better understand your situation, I'd need to ask a few more questions:
>
> 1) What triggers your REBALANCING event?
>
> 2) Does your application contain any states? If yes, how are they
> configured (persistent or in-memory, is logging enabled, etc)?
>
> 3) What is your commit interval configured via "commit.interval.ms"?
>
>
> For better insight into what's happening, you can 1) set the
> StateRestoreListener via KafkaStreams#setGlobalStateRestoreListener
> (details can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+interface+for+the+state+store+restoration+process),
> to see how much data is being restored during the task resuming process;
> 2) monitor the state store restoration metrics (
> https://kafka.apache.org/documentation/#kafka_streams_store_monitoring)
> such as "restore-latency-avg" and "restore-rate"; 3) look into your log4j
> output for "partition revocation took" and "partition assignment took"
> entries and check their time difference.
>
>
> Guozhang
>
>
>
> On Sun, Jul 29, 2018 at 10:37 AM, Siva Ram <si...@gmail.com> wrote:
>
> >  Hi,
> >
> > Kafka version 1.0.0 (can't upgrade to another version yet due to legacy
> > dependency)
> >
> > The stream application uses the low-level Processor API and maintains
> > state.  A topic is set up with 30 partitions, and I split the work across
> > 2 stream application instances consuming the same topic, each with 15
> > threads.  The application starts fine and runs well until a REBALANCING
> > occurs.  When it does, the application takes a long time to return to
> > RUNNING status by itself.  During this time no exception and no
> > additional logging occurs in the application.
> >
> > 1) Could this behavior be due to an issue on the Kafka broker, or is it
> > related to the stream application?
> >
> > 2) What logging can we increase to get additional insight into what
> > causes this paused state for a significant period of time (this is
> > impacting throughput)?
> >
> > FYI, we have set the REQUEST TIMEOUT to the max integer value to avoid
> > timeouts.  With a single application instance running 30 threads, I
> > don't see this long pause, but that means we need to increase the number
> > of threads and memory, which is vertical scaling and not feasible for
> > handling a topic with significant volume.
> >
> > *Instance 1:*
> >
> > 2018-07-29 01:45:43 INFO  StreamStateListener22 - Stream application moved from RUNNING to REBALANCING
> > 2018-07-29 02:15:59 INFO  StreamStateListener22 - Stream application moved from REBALANCING to RUNNING
> >
> > 2018-07-29 05:19:18 INFO  StreamStateListener22 - Stream application moved from RUNNING to REBALANCING
> > 2018-07-29 05:54:00 INFO  StreamStateListener22 - Stream application moved from REBALANCING to RUNNING
> >
> > *Instance 2:*
> >
> > 2018-07-29 01:45:58 INFO  StreamStateListener22 - Stream application moved from RUNNING to REBALANCING
> > 2018-07-29 02:41:22 INFO  StreamStateListener22 - Stream application moved from REBALANCING to RUNNING
> >
> > 2018-07-29 05:19:33 INFO  StreamStateListener22 - Stream application moved from RUNNING to REBALANCING
> > 2018-07-29 05:54:14 INFO  StreamStateListener22 - Stream application moved from REBALANCING to RUNNING
> >
> >
> > Thanks,
> > Siva
> >
>
>
>
> --
> -- Guozhang
>
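
To put numbers on the "long time" described in the original message, the REBALANCING windows can be computed directly from the quoted log timestamps. The following is a small sketch using only java.time. The class and method names are hypothetical, and the timestamps are copied from the Instance 1 and Instance 2 log lines in the quoted message.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring how long each REBALANCING phase lasted.
class RebalanceDurations {

    // Matches the timestamp prefix of the quoted log lines.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Time between "RUNNING to REBALANCING" and "REBALANCING to RUNNING".
    static Duration between(String start, String end) {
        return Duration.between(
                LocalDateTime.parse(start, FMT),
                LocalDateTime.parse(end, FMT));
    }

    public static void main(String[] args) {
        // {label, rebalance start, rebalance end} taken from the quoted logs.
        String[][] windows = {
            {"Instance 1, first",  "2018-07-29 01:45:43", "2018-07-29 02:15:59"},
            {"Instance 1, second", "2018-07-29 05:19:18", "2018-07-29 05:54:00"},
            {"Instance 2, first",  "2018-07-29 01:45:58", "2018-07-29 02:41:22"},
            {"Instance 2, second", "2018-07-29 05:19:33", "2018-07-29 05:54:14"},
        };
        for (String[] w : windows) {
            Duration d = between(w[1], w[2]);
            long minutes = d.getSeconds() / 60;
            long seconds = d.getSeconds() % 60;
            System.out.println(w[0] + ": " + minutes + " min " + seconds + " s");
        }
    }
}
```

Comparing these windows against the "partition revocation took" and "partition assignment took" log entries would show how much of each pause is spent outside the rebalance callbacks themselves.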