Posted to users@kafka.apache.org by John Malizia <jo...@gmail.com> on 2020/06/27 03:49:23 UTC

Re: Kafka: Messages disappearing from topics, largestTime=0

Hi there, I'm joining the party a little late on this one, but I ran into
this at work and I think I can shed some light on the problem at hand. I
filed a bug report, https://issues.apache.org/jira/browse/KAFKA-10207, and
also submitted a pull request, https://github.com/apache/kafka/pull/8936,
that should resolve the issue.

From my investigation, the issue appears to be related to the JVM version
we were using, and it only happened against a ZFS mount. ext4 and btrfs
both worked under the same configuration. We eventually upgraded our JVM,
and the issue with ZFS disappeared. I hope this helps!

On 2020/04/29 12:09:28, Liam Clarke-Hutchinson <l....@adscale.co.nz> wrote:
> Hmm, how are you doing your rolling deploys?
>
> I'm wondering if the time indexes are being corrupted by unclean
> shutdowns. I've been reading code and the only path I could find that led
> to a largest timestamp of 0 was, as you've discovered, where there was no
> time index.
>
> WRT the corruption - the broker being SIGKILLed (systemctl by default
> sends SIGKILL 90 seconds after SIGTERM, and our broker needed 120s to shut
> down cleanly) has caused index corruption for us in the past - although in
> our case it was recovered from automatically by the broker. Just took 2
> hours.
>
> Also are you moving between versions with these deploys?
>
> On Wed, 29 Apr. 2020, 11:23 pm JP MB, <jo...@gmail.com> wrote:
>
> > The server is in UTC; [2020-04-27 10:36:40,386] was actually my local
> > time. On the server it was 9:36.
> > It doesn't look like a timezone problem, because it cleans up other
> > records properly, at exactly 48 hours.
> >
> > On Wed, 29 Apr 2020 at 11:26, Goran Sliskovic
> > <gs...@yahoo.com.invalid> wrote:
> >
> > > Hi,
> > > When lastModifiedTime on that segment is converted to human-readable
> > > time:
> > > Monday, April 27, 2020 9:14:19 AM UTC
> > >
> > > In what time zone is the server (IOW: [2020-04-27 10:36:40,386] from
> > > the log is in what time zone)?
> > > It looks as if largestTime is a property of the log record, and 0
> > > means the log record is empty.
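
The conversion above can be reproduced with a few lines of Python. This is
only a minimal sketch: it takes the lastModifiedTime value from the
"Deleting segments" log line in the original message and assumes the value
is milliseconds since the Unix epoch. The variable names are illustrative
and do not come from Kafka.

```python
from datetime import datetime, timezone

# lastModifiedTime from the quoted log line, assumed to be
# milliseconds since the Unix epoch.
last_modified_ms = 1587978859000

# Convert milliseconds to seconds and build a timezone-aware UTC datetime.
last_modified = datetime.fromtimestamp(last_modified_ms / 1000, tz=timezone.utc)

print(last_modified.isoformat())  # 2020-04-27T09:14:19+00:00
```

The result matches the 9:14:19 AM UTC reading quoted above.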
> > >>
> > > On Tuesday, April 28, 2020, 04:37:03 PM GMT+2, JP MB <>
> > > jose.brandao1994@gmail.com> wrote:>
> > >>
> > > > Hi,
> > > > We have messages disappearing from topics on Apache Kafka with
> > > > versions 2.3, 2.4.0, 2.4.1 and 2.5.0. We notice this when we do a
> > > > rolling deployment of our clusters. Unfortunately it doesn't happen
> > > > every time, so it's very inconsistent.
> > > >
> > > > Sometimes we lose all messages inside a topic, other times we lose
> > > > all messages inside a partition. When this happens, the following
> > > > log line is a constant:
> > > >
> > > [2020-04-27 10:36:40,386] INFO [Log partition=test-lost-messages-5,>
> > > dir=/var/kafkadata/data01/data] Deleting segments>
> > > List(LogSegment(baseOffset=6, size=728,>
> > > lastModifiedTime=1587978859000, largestTime=0)) (kafka.log.Log)>
> > >>
> > > > There is also a previous log saying this segment hit the retention
> > > > time breach of 48 hours. In this example, the message was produced
> > > > ~12 minutes before the deployment.
> > > >
> > > > Notice that all messages that are wrongly deleted have
> > > > largestTime=0, and the ones that are properly deleted have a valid
> > > > timestamp in there. From what we read in the documentation and code,
> > > > it looks like largestTime is used to calculate whether a given
> > > > segment has reached the time breach or not.
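
To see why largestTime=0 would trip a time-based retention check, here is a
minimal sketch of the arithmetic in Python. It is not Kafka's code: the
helper is_retention_breached is hypothetical and only mirrors the comparison
described above (segment age versus the retention period). It also assumes
the check ran at 09:36:40 UTC, the server-side time given elsewhere in this
thread, and uses the segment's lastModifiedTime as a stand-in for a valid
largestTime.

```python
from datetime import datetime, timezone

# 48-hour retention, as mentioned in the thread, in milliseconds.
RETENTION_MS = 48 * 60 * 60 * 1000


def is_retention_breached(now_ms: int, largest_time_ms: int,
                          retention_ms: int = RETENTION_MS) -> bool:
    """Hypothetical helper: True when the segment's age exceeds retention."""
    return now_ms - largest_time_ms > retention_ms


# Assumed time of the retention check: 2020-04-27 09:36:40 UTC.
now_ms = int(datetime(2020, 4, 27, 9, 36, 40, tzinfo=timezone.utc).timestamp() * 1000)

# Stand-in for a valid largestTime: the segment's lastModifiedTime.
valid_largest_time_ms = 1587978859000

# A recent timestamp is about 22 minutes old, far below 48 hours.
print(is_retention_breached(now_ms, valid_largest_time_ms))  # False

# largestTime=0 is read as 1970-01-01, so the segment looks ~50 years old.
print(is_retention_breached(now_ms, 0))  # True
```

Under these assumptions, any segment reporting largestTime=0 looks older
than every realistic retention period, which is consistent with the
deletions described here.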
> > >>
> > > Since we can observe this in multiple versions of Kafka, we think
this>
> > > might be related to anything external to Kafka. E.g Zookeeper.>
> > >>
> > > Does anyone have any ideas of why this could be happening?>
> > > For the record, we are using Zookeeper 3.6.0.>
> > >>
> >>
>