Posted to users@kafka.apache.org by Bhavesh Mistry <mi...@gmail.com> on 2015/03/25 20:35:57 UTC

Producer Behavior When one or more Brokers' Disk is Full.

Hello Kafka Community,



What is the expected behavior on the producer side when one or more brokers'
disks are full, but the topics have not yet reached their retention limit
(by size or by time)?



Does the producer keep sending data to that particular broker, and/or does
the producer queue fill up so that sends always throw Queue Full? Or does it
depend on configuration? (My producer is set to be non-blocking when the
queue is full, acks are 0 or 1, and retries is set to 3.)
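
For reference, the setup described above could be written down roughly as follows. This is only a sketch: the property names are assumed to be the 0.8.2-era ones (`block.on.buffer.full` for the Java producer, `queue.enqueue.timeout.ms` for the old Scala producer), the broker address is a placeholder, and `producer.type=async` is an assumption about the Scala setup.

```python
# Sketch of the producer settings described above, rendered as .properties text.
# Property names are assumed to be the 0.8.2-era ones; the broker address is a
# placeholder and producer.type=async is an assumption.

JAVA_PRODUCER = {
    "bootstrap.servers": "broker1:9092",  # placeholder address
    "acks": "1",                          # 0 or 1 in the setup described
    "retries": "3",
    "block.on.buffer.full": "false",      # fail fast instead of blocking
}

SCALA_PRODUCER = {
    "metadata.broker.list": "broker1:9092",  # placeholder address
    "producer.type": "async",                # assumption
    "request.required.acks": "1",
    "message.send.max.retries": "3",
    "queue.enqueue.timeout.ms": "0",         # 0 = do not block when the queue is full
}


def to_properties(settings):
    """Render a dict as .properties lines, in insertion order."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(to_properties(JAVA_PRODUCER))
    print()
    print(to_properties(SCALA_PRODUCER))
```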



What is the expected behavior of the old (Scala-based) producer versus the
pure Java producer?


Here is reference to past discussion:
http://grokbase.com/t/kafka/users/147h4958k8/how-to-recover-from-a-disk-full-situation-in-kafka-cluster


Is there a wiki page or cookbook with steps to recover from such a situation?



Thanks,

Bhavesh

Re: Producer Behavior When one or more Brokers' Disk is Full.

Posted by Guozhang Wang <wa...@gmail.com>.

Hmm, I think Svante is correct: writes to a full disk would probably leave
the underlying file system in a bad state, and in that sense the broker
needs to be brought down for maintenance.

Guozhang

On Thu, Mar 26, 2015 at 3:49 PM, svante karlsson <sa...@csi.se> wrote:

> >4. As for recovering broker from disk full, if replication is enabled one
> >can just bring it down (the leader of the partition will then migrate to
> >other brokers), clear the disk space, and bring it up again; if
> replication
> >is not enabled then you can first move the partitions away from this
> broker
> >using the partition-reassignment tool and then do the same.
>
>
> I believe that this is handled in a rather abrupt way at the server. It
> will crash and if you have replication the partition leader will move.
>
> However you must manually solve the disk space issue before restarting the
> failed broker since replication will immediately crash it again.
>
> (The same thing also applies to a broken disk)
>
> I think that partition-reassignment requires a healthy broker but I might
> be wrong on this.
>
> /svante
>



-- 
-- Guozhang

Re: Producer Behavior When one or more Brokers' Disk is Full.

Posted by svante karlsson <sa...@csi.se>.

>4. As for recovering broker from disk full, if replication is enabled one
>can just bring it down (the leader of the partition will then migrate to
>other brokers), clear the disk space, and bring it up again; if replication
>is not enabled then you can first move the partitions away from this broker
>using the partition-reassignment tool and then do the same.


I believe that this is handled in a rather abrupt way at the server. It
will crash and if you have replication the partition leader will move.

However, you must manually solve the disk space issue before restarting the
failed broker, since otherwise replication will immediately crash it again.
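
A pre-restart check along these lines could guard against bringing the broker back too early. This is a minimal sketch, not an official procedure: the 10% threshold and the idea of pointing it at the broker's log directory are assumptions, and the path must be supplied by the operator.

```python
import shutil
import sys


def has_headroom(log_dir, min_free_fraction=0.10):
    """Return True if the volume holding log_dir has at least
    min_free_fraction of its capacity free."""
    usage = shutil.disk_usage(log_dir)
    if usage.total == 0:
        # Some virtual file systems report zero capacity; treat as unsafe.
        return False
    return usage.free / usage.total >= min_free_fraction


if __name__ == "__main__":
    # Pass the broker's log directory as the first argument
    # (defaults to the current directory for a quick try-out).
    log_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    if has_headroom(log_dir):
        print("enough free space, broker can be restarted")
    else:
        print("still too full, free up space before restarting")
```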

(The same thing also applies to a broken disk)

I think that partition-reassignment requires a healthy broker but I might
be wrong on this.
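
Whether the tool works against an unhealthy broker is left open here, but the input it takes is simple: kafka-reassign-partitions.sh reads a JSON file (passed with --reassignment-json-file) listing the desired replicas per partition. A sketch of building such a file that moves replicas off one broker; the topic name, broker ids, and the helper itself are hypothetical:

```python
import json


def move_off_broker(assignment, full_broker, spare_brokers):
    """Build a reassignment plan that replaces full_broker in every replica list.

    assignment maps (topic, partition) -> list of broker ids.
    """
    partitions = []
    for (topic, partition), replicas in sorted(assignment.items()):
        if full_broker not in replicas:
            continue  # this partition does not live on the full broker
        # Pick the first spare broker that is not already a replica.
        candidates = [b for b in spare_brokers if b not in replicas]
        if not candidates:
            raise ValueError(f"no spare broker for {topic}-{partition}")
        new_replicas = [candidates[0] if b == full_broker else b for b in replicas]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical current assignment: broker 1 is the one with the full disk.
    current = {("events", 0): [1], ("events", 1): [2]}
    print(json.dumps(move_off_broker(current, full_broker=1, spare_brokers=[2, 3])))
```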

/svante

Re: Producer Behavior When one or more Brokers' Disk is Full.

Posted by Guozhang Wang <wa...@gmail.com>.

Hi Bhavesh,

1. A full disk on the server is treated the same as any other error: an
error code is returned (in this case I think it is the "Unknown" error,
though, as disk IO exceptions are not specifically captured).

2. Upon receiving the error from the brokers, the producer will retry based
on its configs. However, both the Scala and Java producers will still try to
send to the same broker, since the partition to which the message should be
sent is determined at the time it is buffered into a batch and does not
change afterwards.

3. For the Scala producer in async mode, if the retry number is set high, the
producer could retry many times before dropping the message on the floor,
and hence the buffer fills up, causing BufferFullException on other sends.

4. As for recovering a broker from a full disk: if replication is enabled,
one can just bring it down (the leader of the partition will then migrate to
other brokers), clear the disk space, and bring it up again; if replication
is not enabled, then you can first move the partitions away from this broker
using the partition-reassignment tool and then do the same.
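
Points 2 and 3 can be illustrated with a toy model. This is not how either producer is implemented (there is no batching or threading here); it is a step-based sketch in which one message is enqueued and one send attempt is made per tick, and all names and numbers are made up for illustration.

```python
from collections import deque


def simulate(partitions, leader_of, full_brokers, capacity, retries):
    """Toy model of a bounded producer queue in front of one sender.

    Each tick the application enqueues one message without blocking, then
    the sender makes one attempt on the head of the queue. The partition
    (and so the broker) is fixed at enqueue time and never changes.
    """
    queue = deque()  # entries are (partition, attempts_left)
    delivered, dropped, rejected = [], [], []

    def attempt_head():
        partition, attempts_left = queue[0]
        if leader_of[partition] in full_brokers:
            attempts_left -= 1
            if attempts_left == 0:
                queue.popleft()
                dropped.append(partition)  # retries exhausted
            else:
                queue[0] = (partition, attempts_left)  # retry, same broker
        else:
            queue.popleft()
            delivered.append(partition)

    for partition in partitions:
        if len(queue) >= capacity:
            rejected.append(partition)  # caller would see a queue-full error
        else:
            queue.append((partition, 1 + retries))
        if queue:
            attempt_head()
    while queue:  # drain whatever is left
        attempt_head()
    return delivered, dropped, rejected


if __name__ == "__main__":
    # Partition 0 lives on broker "A" (disk full); partition 1 on healthy "B".
    result = simulate([0, 1, 1, 1, 1, 1], {0: "A", 1: "B"}, {"A"},
                      capacity=2, retries=3)
    print(result)  # ([1, 1, 1], [0], [1, 1])
```

In this run the single message bound for the full broker holds the head of the queue for four attempts, and two sends to the healthy broker are rejected in the meantime.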

Guozhang

-- 
-- Guozhang