Posted to users@kafka.apache.org by James Cheng <jc...@tivo.com> on 2016/02/25 22:46:37 UTC

Questions about unclean leader election and "Halting because log truncation is not allowed"

Hi,

I ran into a scenario where one of my brokers would continually shutdown, with the error message:
[2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offset 151 (kafka.server.ReplicaFetcherThread)

I managed to reproduce it with the following scenario:
1. Start broker1, with unclean.leader.election.enable=false
2. Start broker2, with unclean.leader.election.enable=false

3. Create topic, single partition, with replication-factor 2.
4. Write data to the topic.

5. At this point, both brokers are in the ISR. Broker1 is the partition leader.

6. Ctrl-Z broker2's process. (This simulates a GC pause or a slow network.) Broker2 gets dropped out of the ISR. Broker1 is still the leader, and I can still write data to the partition.

7. Shut down broker1. Hard or controlled, it doesn't matter.

8. rm -rf the log directory of broker1. (This simulates a disk replacement or a full hardware replacement.)

9. Resume broker2. It attempts to connect to broker1, but doesn't succeed because broker1 is down. At this point the partition is offline, and I can't write to it.

10. Resume broker1. Broker1 resumes leadership of the topic. Broker2 attempts to rejoin the ISR and immediately halts with the error message:
[2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offset 151 (kafka.server.ReplicaFetcherThread)

I am able to recover by setting unclean.leader.election.enable=true on my brokers.
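
In case it's useful, here is roughly what the whole sequence looks like from the shell, including the recovery at the end. Broker IDs, ports, log paths, and the data file are illustrative, and kill -STOP / kill -CONT stand in for the Ctrl-Z suspend/resume:

    # server-1.properties and server-2.properties both contain:
    #   unclean.leader.election.enable=false
    bin/kafka-server-start.sh config/server-1.properties &
    bin/kafka-server-start.sh config/server-2.properties &

    # Steps 3-5: create the topic, write data, confirm both brokers are in the ISR
    bin/kafka-topics.sh --zookeeper localhost:2181 --create \
        --topic test --partitions 1 --replication-factor 2
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < data.txt
    bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test

    # Step 6: suspend broker2 (same effect as Ctrl-Z in its terminal)
    kill -STOP <broker2-pid>

    # Steps 7-8: stop broker1 and wipe its log directory
    kill <broker1-pid>
    rm -rf /tmp/kafka-logs-1

    # Steps 9-10: resume broker2, then restart broker1 with an empty log
    kill -CONT <broker2-pid>
    bin/kafka-server-start.sh config/server-1.properties &

    # Recovery: allow unclean leader election in server-*.properties and restart
    #   unclean.leader.election.enable=true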

I'm trying to understand a couple of things:
* Is my scenario a valid supported one, or is it along the lines of "don't ever do that"?
* In step 10, why is broker1 allowed to resume leadership even though it has no data?
* In step 10, why is it necessary to stop the entire broker due to one partition that is in this state? Wouldn't it be possible for the broker to continue serving traffic for all the other topics and just mark this one as unavailable?
* Would it make sense to allow an operator to manually specify which broker they want to become the new leader? This would give me more control over how much data loss I am willing to accept. In this case, I would want broker2 to become the new leader. Or is that already possible and I just don't know how to do it?
* Would it be possible to make unclean.leader.election.enable a per-topic configuration? This would also let me control how much data loss I am willing to accept.

Btw, the comment in the source code for that error message says:
https://github.com/apache/kafka/blob/01aeea7c7bca34f1edce40116b7721335938b13b/core/src/main/scala/kafka/server/ReplicaFetcherThread.scala#L164-L166

      // Prior to truncating the follower's log, ensure that doing so is not disallowed by the configuration for unclean leader election.
      // This situation could only happen if the unclean election configuration for a topic changes while a replica is down. Otherwise,
      // we should never encounter this situation since a non-ISR leader cannot be elected if disallowed by the broker configuration.

But I don't believe that happened. I never changed the configuration. But I did venture into "unclean leader election" territory, so I'm not sure if the comment still applies.

Thanks,
-James




Re: Questions about unclean leader election and "Halting because log truncation is not allowed"

Posted by "Olson,Andrew" <AO...@CERNER.COM>.
unclean.leader.election.enable is actually a valid topic-level configuration; I opened https://issues.apache.org/jira/browse/KAFKA-3298 to get the documentation updated.
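
For example (topic name illustrative), the override can be set on an existing topic with kafka-configs.sh, or supplied at creation time with the --config flag on kafka-topics.sh:

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
        --entity-type topics --entity-name test \
        --add-config unclean.leader.election.enable=true

    bin/kafka-topics.sh --zookeeper localhost:2181 --create \
        --topic test --partitions 1 --replication-factor 2 \
        --config unclean.leader.election.enable=false

A topic-level override like this would let you opt individual topics into unclean election for availability while keeping the broker default at false.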




That code comment doesn't tell the complete story and could probably be updated for clarity, as we've learned a lot since then. The situation is still theoretically possible in certain severe split-brain scenarios such as the one your reproduction steps introduce. Hopefully https://issues.apache.org/jira/browse/KAFKA-2143 helps to prevent the possibility from arising, however.


Re: Questions about unclean leader election and "Halting because log truncation is not allowed"

Posted by James Cheng <jc...@tivo.com>.
Anthony,

I filed https://issues.apache.org/jira/browse/KAFKA-3410 to track this.

-James


Re: Questions about unclean leader election and "Halting because log truncation is not allowed"

Posted by Anthony Sparks <an...@gmail.com>.
Hello James,

We received this exact same error this past Tuesday (we are on 0.8.2). To
answer at least one of your bullet points -- this is a valid scenario. We
had the same questions; I'm starting to think this is a bug -- thank you
for the reproduction steps!

I looked over the release notes to see whether there were any fixes in
newer versions -- this bug fix looked the most related:
https://issues.apache.org/jira/browse/KAFKA-2143

Thank you,

Tony
