Posted to dev@kafka.apache.org by "Umesh Chaudhary (JIRA)" <ji...@apache.org> on 2016/09/29 06:21:20 UTC

[jira] [Commented] (KAFKA-4142) Log files in /data dir date modified keeps being updated?

    [ https://issues.apache.org/jira/browse/KAFKA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15531931#comment-15531931 ] 

Umesh Chaudhary commented on KAFKA-4142:
----------------------------------------

[~cmhillerman], did you compare the size of each file before and after its modification date changed, given that the file name stays the same?

From the doc I see:
the log is cleaned by recopying each log segment but omitting any key that appears in the offset map with a higher offset than what is found in the segment (i.e. messages with a key that appears in the dirty section of the log).
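
To make the quoted rule concrete, here is a minimal sketch of it in Python. It is not Kafka's cleaner code: the record layout (offset, key, value) and the names build_offset_map and clean_segment are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout for illustration: (offset, key, value).
Record = Tuple[int, str, str]


def build_offset_map(dirty: List[Record]) -> Dict[str, int]:
    """Map each key seen in the dirty section to its highest offset."""
    offset_map: Dict[str, int] = {}
    for offset, key, _value in dirty:
        offset_map[key] = max(offset, offset_map.get(key, -1))
    return offset_map


def clean_segment(segment: List[Record], offset_map: Dict[str, int]) -> List[Record]:
    """Recopy a segment, omitting records whose key has a higher offset in the map."""
    return [
        (offset, key, value)
        for offset, key, value in segment
        if offset_map.get(key, -1) <= offset
    ]


# Example: key "a" is rewritten in the dirty section, key "b" is not.
segment = [(0, "a", "v0"), (1, "b", "v1"), (2, "a", "v2")]
dirty = [(3, "a", "v3"), (4, "c", "v4")]
cleaned = clean_segment(segment, build_offset_map(dirty))
print(cleaned)  # [(1, 'b', 'v1')]
```

The recopied segment keeps its name but holds fewer records, so it would be smaller than the original and carry a fresh modification time.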

I believe that when you compare the sizes, you will find that some messages were cleaned based on the retention policy.
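
One way to do that size comparison is to snapshot the segment files twice and diff the results. The following is only a sketch: the function names are made up, and it assumes the .log and .index files sit directly in the directory you pass in (in a real broker they live in per-partition subdirectories under the configured log directory).

```python
import os
from typing import Dict, Tuple

# file name -> (size in bytes, modification time in seconds since the epoch)
Snapshot = Dict[str, Tuple[int, float]]


def snapshot(partition_dir: str) -> Snapshot:
    """Record size and modification time of each .log/.index file in a directory."""
    result: Snapshot = {}
    for entry in os.scandir(partition_dir):
        if entry.is_file() and entry.name.endswith((".log", ".index")):
            info = entry.stat()
            result[entry.name] = (info.st_size, info.st_mtime)
    return result


def changed(before: Snapshot, after: Snapshot) -> Dict[str, Tuple[Tuple[int, float], Tuple[int, float]]]:
    """Return files present in both snapshots whose size or modification time differs."""
    return {
        name: (before[name], after[name])
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    }
```

Take one snapshot before the modification dates jump and one after. A file whose date changed while its size shrank was rewritten with fewer messages; a file whose date changed while its size stayed the same was touched without being cleaned.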

> Log files in /data dir date modified keeps being updated?
> ---------------------------------------------------------
>
>                 Key: KAFKA-4142
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4142
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.0.0
>         Environment: CentOS release 6.8 (Final)
> uname -a
> Linux 2.6.32-642.1.1.el6.x86_64 #1 SMP Tue May 31 21:57:07 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Clint Hillerman
>            Priority: Minor
>
> The modification dates of the Kafka logs (the main ones specified by log.dirs in the config) keep getting updated and set to the exact same time.
> For example:
> Say I had two log and index files (date modified - file name):
> 20160901:10:00:01 - 0001.log
> 20160901:10:00:01 - 0001.index
> 20160902:10:00:01 - 0002.log
> 20160902:10:00:01 - 0002.index
> Later I notice the logs are getting way too old for the retention time. I then go look at the log dir and I see this:
> 20160903:10:00:01 - 0001.log
> 20160903:10:00:01 - 0001.index
> 20160903:10:00:01 - 0002.log
> 20160903:10:00:01 - 0002.index
> 20160903:10:00:01 - 0003.log
> 20160903:10:00:01 - 0003.index
> 20160904:10:00:01 - 0004.log
> 20160904:10:00:01 - 0004.index
> The first two log files had their date modified moved forward for some reason. They were updated from 0901 and 0902 to 0903.
> It seems to happen periodically. The new logs that Kafka writes out have the correct timestamp.
> This causes the logs to not be deleted. Right now I just touch the log files to an older date and they are deleted right away.
> Any help would be appreciated. Also, I'll explain the problem better if this doesn't make sense.
> Thanks,


