Posted to common-issues@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2009/10/28 01:20:59 UTC

[jira] Commented: (HADOOP-6338) Utility to tail the contents of a directory

    [ https://issues.apache.org/jira/browse/HADOOP-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770719#action_12770719 ] 

dhruba borthakur commented on HADOOP-6338:
------------------------------------------

Such a utility provides a simple one-file abstraction for an application that wants to consume the contents of a data set created by a map-reduce application. An application that was consuming data in real time via a "tail -f" command can be easily migrated to work directly on HDFS files.

> Utility to tail the contents of a directory
> -------------------------------------------
>
>                 Key: HADOOP-6338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6338
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> There is an existing utility "bin/hadoop fs -tail -f <filename>" that prints the last few records from the specified file. A map-reduce application uses a directory as a data set and creates multiple files in an HDFS directory. I am proposing that we extend "bin/hadoop fs -tail -f <directory>" to tail the contents of a directory. The files in the directory can be sorted (lexicographically, or by modification time) to arrive at the virtual sequence of the files inside the directory.
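
The ordering idea in the description can be illustrated with a rough sketch. It is not the proposed patch and does not use the Hadoop API: it assumes a local directory as a stand-in for an HDFS directory, and the names virtual_sequence and tail_directory, as well as the 1024-byte default, are hypothetical choices made only for this example.

```python
import os


def virtual_sequence(directory, order="name"):
    """Return the regular files in `directory` in concatenation order.

    `order` is "name" (lexicographic) or "mtime" (modification time).
    Hypothetical helper; a local directory stands in for HDFS here.
    """
    entries = [e for e in os.scandir(directory) if e.is_file()]
    if order == "name":
        entries.sort(key=lambda e: e.name)
    elif order == "mtime":
        # Ties on modification time are broken by name to keep the order stable.
        entries.sort(key=lambda e: (e.stat().st_mtime, e.name))
    else:
        raise ValueError("order must be 'name' or 'mtime'")
    return [e.path for e in entries]


def tail_directory(directory, num_bytes=1024, order="name"):
    """Return the last `num_bytes` bytes of the virtual file.

    The virtual file is the concatenation of the files in the order
    given by virtual_sequence(). The byte count is an assumption of
    this sketch, not something the issue specifies.
    """
    chunks = []
    remaining = num_bytes
    # Walk the sequence backwards, collecting bytes until enough are gathered.
    for path in reversed(virtual_sequence(directory, order)):
        if remaining <= 0:
            break
        size = os.path.getsize(path)
        take = min(size, remaining)
        with open(path, "rb") as f:
            f.seek(size - take)
            chunks.append(f.read(take))
        remaining -= take
    # Chunks were collected last-file-first, so restore the forward order.
    return b"".join(reversed(chunks))
```

Calling tail_directory("/tmp/job-output", 1024) on a directory of part files would return the final kilobyte of the last file, reaching back into earlier files only when the last one is shorter than that. A follow mode ("-f") would additionally need to re-list the directory periodically to pick up newly created files, which this sketch leaves out.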
