Posted to users@kafka.apache.org by D C <dr...@gmail.com> on 2020/06/11 01:22:58 UTC

Re: Last Stable Offset (LSO) stuck for specific topic partition after Broker issues

Hey peeps,

Has anyone else encountered this and gotten to the bottom of it?

I'm facing a similar issue: the LSO is stuck for some partitions in a topic,
and the consumers can't get data out of them (we're using
isolation.level=read_committed).

When this issue started happening we were on Kafka 2.3.1.
I tried:
- restarting the consumers
- deleting the partition's data on the leader and letting it get back in
sync from the new leader
- rolling restart of the brokers
- shutting down the whole cluster and starting it again
- deleting the txnindex files (after backing them up) and restarting
the brokers
- taking down the follower brokers of a partition and resyncing that
partition on them from scratch
- upgrading both the Kafka brokers and the clients to 2.5.0

Now the following questions arise:
- Where is the LSO actually stored? (Even after getting rid of the txnindex
files, the LSO stays the same.)
- Is there any way that the LSO can be reset?
- Is there any way to manually abort and clean up the state of a stuck
transaction? (I suspect that this is the reason why the LSO is stuck.)
- Is there any way to manually trigger a consistency check on the log files
that would fix any existing issues with either the logs or the indexes in
the partition?
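
To make the stuck-transaction suspicion concrete, here is a minimal Python
sketch of how I understand the LSO to be derived. It is a simplified model,
not broker code: it assumes the LSO is the smaller of the high watermark and
the first offset of the earliest transaction that is still open. The function
name, producer ids and offsets are made up for illustration.

```python
from typing import Dict


def last_stable_offset(high_watermark: int,
                       open_txn_first_offsets: Dict[int, int]) -> int:
    """Simplified model (an assumption, not Kafka's actual code).

    open_txn_first_offsets maps a producer id to the first offset of its
    transaction that has no commit/abort marker yet.
    """
    if not open_txn_first_offsets:
        # No open transactions: everything up to the high watermark is stable.
        return high_watermark
    earliest_open = min(open_txn_first_offsets.values())
    # The LSO can never be ahead of the high watermark.
    return min(high_watermark, earliest_open)


# Hypothetical partition state: one transaction opened at offset 250 and
# never completed, while the high watermark has advanced to 1000.
high_watermark = 1000
open_txns = {4001: 250}  # hypothetical producer id -> first offset

lso = last_stable_offset(high_watermark, open_txns)
print(lso, high_watermark - lso)  # prints: 250 750
```

Under this model, a single transaction that never gets a commit or abort
marker pins the LSO at its first offset no matter how far the high watermark
advances, which would match what the read_committed consumers are seeing.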

Cheers,
Dragos