Posted to users@kafka.apache.org by Sadhan Sood <sa...@gmail.com> on 2014/04/24 21:57:34 UTC

Brokers throwing warning messages after change in retention policy and multiple produce failures

We are seeing some strange behavior from brokers after we had to change
our log retention policy on the brokers yesterday. We had a huge spike in
producer data for a short period, which caused the brokers to get very
close to the max disk space. Normally our retention policy is good for 6-7
days, but since our consumers were caught up, we changed the retention
policy from hour-based to size-based and cut the size down to a safe
number (half of our max disk space; normal usage is around 30%). After the
restart, we started seeing multiple producer-side failures, with the
FailedSends metric showing almost 10% failures and
FailedProduceRequestsPerSec on the broker side reporting a non-zero
number. The traces from one of the brokers looked like this:

[KafkaApi-8] Produce request with correlation id 2050686 from client xxx on
partition [TOPIC_NAME,18] failed due to Partition [TOPIC_NAME,18] doesn't
exist on 8 (kafka.server.KafkaApis)
 [KafkaApi-8] Produce request with correlation id 2102325 from client xxx
on partition [TOPIC_NAME,28] failed due to Partition [TOPIC_NAME,28]
doesn't exist on 8 (kafka.server.KafkaApis)
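
The partition assignments the trace complains about can be double-checked against the cluster metadata. A sketch using the 0.8.1-era admin tool (the ZooKeeper address and topic name are placeholders, not from the thread):

```
# Show partition assignments, leaders, and replicas for the topic; the
# output should list broker 8 among the replicas for partitions 18 and 28.
bin/kafka-topics.sh --zookeeper zk1:2181 --describe --topic TOPIC_NAME
```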

We checked and made sure those partitions were present on the broker.
Any help is appreciated. Also, is there a recommended way to quickly purge
log data from the brokers?
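
For reference, one way to push data out quickly without touching broker-wide settings is a temporary per-topic retention override. This is only a sketch, assuming a 0.8.1-era kafka-topics.sh (topic-level overrides are not available on 0.8.0 brokers, and the flag spelling was later changed to --delete-config); all names and values are placeholders:

```
# Temporarily cap the topic at ~1 GB per partition so the cleanup thread
# deletes older segments on its next pass:
bin/kafka-topics.sh --zookeeper zk1:2181 --alter --topic TOPIC_NAME \
  --config retention.bytes=1073741824

# Once the old segments are gone, remove the override so the broker-level
# retention settings apply again:
bin/kafka-topics.sh --zookeeper zk1:2181 --alter --topic TOPIC_NAME \
  --deleteConfig retention.bytes
```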

Thanks,
Sadhan
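
For context, the hour-based to size-based switch described above corresponds to something like the following in server.properties (values are illustrative, not the poster's actual configuration). One caveat: log.retention.bytes is enforced per partition, not per broker, so the safe total has to be divided by the number of partitions a broker hosts:

```
# Before: time-based retention, roughly a week of data.
log.retention.hours=168

# After: size-based retention. This limit applies to EACH partition's log,
# so (partitions hosted per broker) x (log.retention.bytes) should stay
# well under the disk-space target.
log.retention.bytes=1073741824
```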

Re: Brokers throwing warning messages after change in retention policy and multiple produce failures

Posted by Guozhang Wang <wa...@gmail.com>.
Let us know if you still see this issue in 0.8.1.1.


On Mon, Apr 28, 2014 at 3:35 PM, Sadhan Sood <sa...@gmail.com> wrote:

-- 
-- Guozhang

Re: Brokers throwing warning messages after change in retention policy and multiple produce failures

Posted by Sadhan Sood <sa...@gmail.com>.
Thank you @Guozhang and @Drew for your responses. We don't see any errors
in the server logs, but we're not sure whether we could have run into
KAFKA-1311. We are using 0.8.0 on the brokers and 0.8.1 on the producers
to get the extra partition key and the option of not storing it on the
brokers. We will upgrade our producers to 0.8.1.1, though, and see if the
problems go away.


On Sat, Apr 26, 2014 at 1:35 PM, Drew Goya <dr...@gradientx.com> wrote:


Re: Brokers throwing warning messages after change in retention policy and multiple produce failures

Posted by Drew Goya <dr...@gradientx.com>.
I ran into this problem while restarting brokers running 0.8.1. It was
usually a sign that you had hit KAFKA-1311.

I've since rolled upgrades to the latest on the 0.8.1 branch (0.8.1.1),
and my problems have gone away.


On Thu, Apr 24, 2014 at 6:07 PM, Guozhang Wang <wa...@gmail.com> wrote:


Re: Brokers throwing warning messages after change in retention policy and multiple produce failures

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Sadhan,

Do you see any errors on the server logs?

Guozhang


On Thu, Apr 24, 2014 at 12:57 PM, Sadhan Sood <sa...@gmail.com> wrote:

-- 
-- Guozhang