Posted to dev@kafka.apache.org by Fares Oueslati <ou...@gmail.com> on 2022/05/02 14:15:19 UTC

Re: Possible Bug: kafka-reassign-partitions causing the data retention time to be reset

Hello Lqjacklee,

Is there any news on this please, especially regarding my last message?

Do you think it is possible to modify the segment files manually with touch
-a -m -t 203801181205.09 my_segment_file? I could record the original
modification dates of the files before the move and reapply them afterwards.
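If it comes to that, the record-and-restore approach could look roughly like
the sketch below, demonstrated on a scratch file (assuming GNU coreutils,
i.e. `stat -c %Y` and `touch -d`; all paths and names are illustrative):

```shell
# Demonstrate recording a file's mtime and restoring it after a copy.
tmpdir=$(mktemp -d)
seg="$tmpdir/00000000000000000000.log"
printf 'dummy segment data' > "$seg"
touch -d '2022-01-01 00:00:00' "$seg"      # pretend this is the original mtime
before=$(stat -c %Y "$seg")                # record mtime as epoch seconds
cp "$seg" "$tmpdir/moved.log"              # the "move": resets mtime on the copy
touch -d "@$before" "$tmpdir/moved.log"    # restore the recorded mtime
after=$(stat -c %Y "$tmpdir/moved.log")
[ "$before" = "$after" ] && echo "mtime restored"
rm -rf "$tmpdir"
```

The same loop could be run over every segment file in the old log.dir before
and after the reassignment, though whether doing that by hand is safe is
exactly the open question here.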

Thanks a lot!
Fares

On Fri, Apr 29, 2022 at 3:24 AM Fares Oueslati <ou...@gmail.com>
wrote:

> Thanks for your help!
>
> I'm not sure how that would help me though. I'm not actually trying to
> decommission a Kafka broker.
> I would like to move all the data from one disk (log.dir) to another
> within the same broker while keeping the original modification time of the
> moved segment files.
> After that I would like to delete the disk, not the broker.
>
> Kind Regards,
> Fares
>
> On Thu, Apr 28, 2022 at 7:05 PM lqjacklee <lq...@gmail.com> wrote:
>
>> The resource (https://mike.seid.io/blog/decommissiong-a-kafka-node.html)
>> may help you.
>> I have created (https://issues.apache.org/jira/browse/KAFKA-13860) to
>> reproduce the case.
>>
>> On Thu, Apr 28, 2022 at 10:33 PM Fares Oueslati <ou...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I'm not sure how to report this properly but I didn't get any answer in
>>> the
>>> user mailing list.
>>>
>>> In order to remove a disk from a JBOD setup, I moved all the data from
>>> one disk to another on every Kafka broker using
>>> kafka-reassign-partitions, and then I ran into some weird behaviour.
>>> Basically, disk usage kept increasing even though the bytes-in metric
>>> per broker showed no change.
>>> After investigation, I’ve seen that all segment log files in the new
>>> log.dir had a modification date set to the moment when the move had been
>>> done.
>>> So I guess the process applying the retention policy (log cleaner?) uses
>>> that timestamp to check whether the segment file should be deleted or
>>> not.
>>> So I ended up with a lot more data than we were supposed to store, since
>>> we
>>> are basically doubling the retention time of all the freshly moved data.
>>>
>>> This seems to me to be buggy behaviour of the command. Is it possible to
>>> create a JIRA to track and eventually fix this?
>>> The only option I see to fix it is to record the modification date of
>>> every segment file before moving the data and reapply it manually
>>> afterwards, but touching those files manually doesn't seem very safe
>>> imho.
>>>
>>> Thanks
>>> Fares Oueslati
>>>
>>
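For reference, an intra-broker move like the one described above is driven
by a reassignment JSON whose "log_dirs" field pins each replica to a target
log dir (KIP-113). A hedged sketch, with the topic name, broker id, and
paths all illustrative:

```shell
# Write an example reassignment plan that keeps partition my-topic-0 on
# broker 1 but moves its data to /data/disk2/kafka-logs.
json=$(mktemp)
cat > "$json" <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "my-topic", "partition": 0,
      "replicas": [1],
      "log_dirs": ["/data/disk2/kafka-logs"] }
  ]
}
EOF
echo "wrote reassignment plan to $json"
# Then (not run here):
# kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
#   --reassignment-json-file "$json" --execute
```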

Re: Possible Bug: kafka-reassign-partitions causing the data retention time to be reset

Posted by Luke Chen <sh...@gmail.com>.
Hi Fares,

> So I guess the process applying the retention policy (log cleaner?) uses
that timestamp to check whether the segment file should be deleted or not.
So I ended up with a lot more data than we were supposed to store, since we
are basically doubling the retention time of all the freshly moved data.

No, Kafka uses the largest timestamp in each segment to determine whether it
exceeds the retention period. But if the log is corrupted or for some reason
it can't find the largest timestamp, it'll fall back to the file modified
date. (You can check here
<https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/UnifiedLog.scala#L2124-L2130>
)
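In shell-flavoured pseudocode, the decision described above amounts to
something like this (not Kafka's actual code — see the UnifiedLog.scala link;
timestamps here are plain epoch seconds for simplicity, while Kafka uses
milliseconds):

```shell
# Pick the timestamp used for retention: the segment's largest record
# timestamp when known, otherwise the file's modification time.
segment_timestamp() {
  largest=$1; file=$2
  if [ "$largest" -gt 0 ]; then
    echo "$largest"            # largest record timestamp is known
  else
    stat -c %Y "$file"         # fallback: file modified time
  fi
}

f=$(mktemp)
known=$(segment_timestamp 1651500000 "$f")   # known timestamp wins
fallback=$(segment_timestamp -1 "$f")        # unknown (-1): file mtime
echo "known=$known fallback=$fallback"
rm -f "$f"
```

So a reset mtime should only affect segments where the largest timestamp
can't be determined.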

If you've confirmed that log retention doesn't work as expected after a log
dir move, you can create a JIRA ticket for this issue. That must be a bug.
If possible, please attach the broker logs and the log segment files with
their records (to help identify whether there's log corruption) to the JIRA.

Thank you.
Luke
