Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2008/09/03 23:33:44 UTC

[jira] Updated: (HADOOP-1869) access times of HDFS files

     [ https://issues.apache.org/jira/browse/HADOOP-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1869:
-------------------------------------

    Attachment: accessTime6.patch

Given that Raghu, Owen and Allen commented that it is better to follow the POSIX semantics of allowing a user to set either the access time or the modification time to any arbitrary value, I changed my earlier patch slightly to add the following API:

{quote}
/**
 * Set the modification time and access time of a file.
 * @param p The path
 * @param mtime Set the modification time of this file.
 *              The number of milliseconds since Jan 1, 1970.
 *              A value of -1 means that this call should not set modification time.
 * @param atime Set the access time of this file.
 *              The number of milliseconds since Jan 1, 1970.
 *              A value of -1 means that this call should not set access time.
 */
public void setTimes(Path p, long mtime, long atime
     ) throws IOException;

{quote}

This closely mirrors the POSIX utimes call, but follows the Hadoop naming pattern for method names. It allows setting the access time, the modification time, or both.
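
To illustrate the -1 convention described above, here is a minimal sketch using a hypothetical in-memory class. The names SetTimesSketch, Times, UNCHANGED, create and get are assumptions made for illustration only and are not part of the patch; only the setTimes parameter order and the -1 sentinel come from the proposal, and a String stands in for Path to keep the sketch self-contained.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a file system; not code from the patch. */
public class SetTimesSketch {
    /** Sentinel from the proposal: -1 leaves the corresponding time untouched. */
    public static final long UNCHANGED = -1L;

    /** Modification and access time, in milliseconds since Jan 1, 1970. */
    public static final class Times {
        public long mtime;
        public long atime;

        Times(long mtime, long atime) {
            this.mtime = mtime;
            this.atime = atime;
        }
    }

    private final Map<String, Times> files = new HashMap<>();

    /** Registers a file whose mtime and atime both start at the given instant. */
    public void create(String path, long now) {
        files.put(path, new Times(now, now));
    }

    /** Same parameter order as the proposed API: mtime first, then atime. */
    public void setTimes(String path, long mtime, long atime) throws IOException {
        Times t = files.get(path);
        if (t == null) {
            throw new FileNotFoundException(path);
        }
        if (mtime != UNCHANGED) {
            t.mtime = mtime;
        }
        if (atime != UNCHANGED) {
            t.atime = atime;
        }
    }

    public Times get(String path) {
        return files.get(path);
    }

    public static void main(String[] args) throws IOException {
        SetTimesSketch fs = new SetTimesSketch();
        fs.create("/user/alice/data.txt", 1000L);
        // Update only the access time; the modification time is left alone.
        fs.setTimes("/user/alice/data.txt", UNCHANGED, 5000L);
        Times t = fs.get("/user/alice/data.txt");
        System.out.println(t.mtime + " " + t.atime); // prints "1000 5000"
    }
}
```

Passing -1 for both arguments would leave the file untouched in this sketch; whether the real implementation treats that case as a no-op is not stated here.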


> access times of HDFS files
> --------------------------
>
>                 Key: HADOOP-1869
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1869
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: accessTime1.patch, accessTime4.patch, accessTime5.patch, accessTime6.patch
>
>
> HDFS should support some type of statistics that allows an administrator to determine when a file was last accessed. 
> Since HDFS does not have quotas yet, it is likely that users keep on accumulating files in their home directories without much regard to the amount of space they are occupying. This causes memory-related problems with the namenode.
> Access times are costly to maintain. AFS does not maintain access times. I think DCE-DFS does maintain access times with a coarse granularity.
> One proposal for HDFS would be to implement something like an "access bit". 
> 1. This access-bit is set when a file is accessed. If the access bit is already set, then this call does not result in a transaction.
> 2. A FileSystem.clearAccessBits() indicates that the access bits of all files need to be cleared.
> An administrator can effectively use the above mechanism (maybe a daily cron job) to determine files that are recently used.
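
The access-bit proposal quoted above can be sketched roughly as follows. This is an illustration under assumptions, not HDFS code: the class AccessBitSketch and the methods addFile, markAccessed, isAccessed and transactionCount are hypothetical, and only clearAccessBits is a name taken from the description. Whether clearing the bits is itself logged as a transaction is not specified, so the sketch does not count it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the per-file "access bit" idea; not HDFS code. */
public class AccessBitSketch {
    private final Map<String, Boolean> accessBits = new HashMap<>();
    private int transactions = 0;

    /** Registers a file with its access bit cleared. */
    public void addFile(String path) {
        accessBits.put(path, false);
    }

    /**
     * Records an access. Returns true only when the bit flips from clear to
     * set, which is the only case that would need a transaction.
     */
    public boolean markAccessed(String path) {
        Boolean bit = accessBits.get(path);
        if (bit == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        if (bit) {
            return false; // already set: nothing to log
        }
        accessBits.put(path, true);
        transactions++;
        return true;
    }

    /** Clears the access bit of every file (assumed not to be counted here). */
    public void clearAccessBits() {
        accessBits.replaceAll((path, bit) -> false);
    }

    public boolean isAccessed(String path) {
        return Boolean.TRUE.equals(accessBits.get(path));
    }

    public int transactionCount() {
        return transactions;
    }

    public static void main(String[] args) {
        AccessBitSketch fs = new AccessBitSketch();
        fs.addFile("/a");
        fs.addFile("/b");
        fs.markAccessed("/a");
        fs.markAccessed("/a"); // second access does not add a transaction
        System.out.println(fs.transactionCount()); // prints "1"
        System.out.println(fs.isAccessed("/b"));   // prints "false"
    }
}
```

A periodic job, such as the daily cron job mentioned above, could read the bits to list recently used files and then call clearAccessBits() to start a new observation window.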

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.