Posted to common-user@hadoop.apache.org by Gautam Singaraju <ga...@gmail.com> on 2011/03/06 23:30:53 UTC

File access pattern on HDFS?

Hi,

Is there a mechanism to get the list of files accessed on HDFS at the
NameNode?
Thanks!
---
Gautam

Re: File access pattern on HDFS?

Posted by Dhruba Borthakur <dh...@gmail.com>.
Sure, one option (maybe not a very scalable one) is to register a URL with
the NameNode. The NameNode can then send each batch of transactions as an
HTTP POST to that URL.

Another option would be to write the file-change-log to a well-known HDFS
file itself.

thanks
dhruba
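
To make the second option a bit more concrete, here is a minimal sketch of
how a consumer might summarize such a change log once it can read it. No
such log exists in HDFS as of this thread, so everything here is an
assumption for illustration: the tab-separated "timestamp, operation, path"
record layout, the OPEN/CREATE operation names, and the ChangeLogSummary
class itself.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical consumer of a file-change-log. The record format
// "<timestampMillis>\t<operation>\t<path>" is an assumption, not an HDFS format.
public class ChangeLogSummary {

  // Counts, per path, how many records carry the given operation.
  // Lines that do not have exactly three tab-separated fields are skipped.
  public static Map<String, Integer> countByPath(List<String> lines, String op) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String line : lines) {
      String[] fields = line.split("\t");
      if (fields.length != 3) {
        continue; // malformed record
      }
      if (!fields[1].equals(op)) {
        continue; // some other operation
      }
      counts.merge(fields[2], 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // A made-up batch of records, as a consumer might read them from the log.
    List<String> batch = Arrays.asList(
        "1299450653000\tOPEN\t/data/a.txt",
        "1299450654000\tOPEN\t/data/b.txt",
        "1299450655000\tCREATE\t/data/c.txt",
        "1299450656000\tOPEN\t/data/a.txt");
    System.out.println(countByPath(batch, "OPEN"));
  }
}
```

The same summarizing step would apply to the first option as well: the body
of each POST would simply take the place of the lines read from the file.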


On Mon, Mar 7, 2011 at 1:05 PM, Gautam Singaraju <gautam.singaraju@gmail.com> wrote:

> HDFS-1179 is exactly what I was looking for. Would it be a good idea to
> transmit info over TCP/UDP?
> ---
> Gautam


-- 
Connect to me at http://www.facebook.com/dhruba

Re: File access pattern on HDFS?

Posted by Gautam Singaraju <ga...@gmail.com>.
HDFS-1179 is exactly what I was looking for. Would it be a good idea to
transmit info over TCP/UDP?
---
Gautam


On Mon, Mar 7, 2011 at 11:46 AM, Dhruba Borthakur <dh...@gmail.com> wrote:

> Here is a JIRA that talks about a file-change-log (but no work has been done
> yet)
>
> http://issues.apache.org/jira/browse/HDFS-1179
>
> thanks,
> dhruba

Re: File access pattern on HDFS?

Posted by Dhruba Borthakur <dh...@gmail.com>.
Here is a JIRA that talks about a file-change-log (but no work has been done
yet)

http://issues.apache.org/jira/browse/HDFS-1179

thanks,
dhruba

On Mon, Mar 7, 2011 at 1:24 AM, Harsh J <qw...@gmail.com> wrote:

> There is no such information (history of atime changes, although atime
> is held for every file in the NN) held by the NameNode right now. I
> think HDFS-782 is slightly relevant to maintaining a 'hot-zone' info,
> although at a block level and among datanodes. I couldn't find a jira
> that talks about keeping a list of atime modifications on the
> NameNode.



-- 
Connect to me at http://www.facebook.com/dhruba

Re: File access pattern on HDFS?

Posted by Harsh J <qw...@gmail.com>.
The NameNode does not hold such information right now: it keeps an atime
for every file, but no history of atime changes. I think HDFS-782 is
slightly relevant, since it is about maintaining 'hot-zone' info, although
at the block level and among DataNodes. I couldn't find a JIRA that talks
about keeping a list of atime modifications on the NameNode.
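
Since only the latest atime is kept, one possible workaround is to snapshot
the atimes periodically and compare consecutive snapshots. The sketch below
shows just the comparison step on plain maps of path to atime in
milliseconds; the AtimeDiff class and its method are hypothetical names. In
a real client the maps would presumably be filled from
FileStatus.getPath() and FileStatus.getAccessTime(). Note that HDFS only
updates atime at a configurable granularity (dfs.access.time.precision, one
hour by default), so repeated reads within that window would not show up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two snapshots of per-file access times.
public class AtimeDiff {

  // Returns, in sorted order, the paths whose atime in `current` is newer
  // than in `previous`. Paths missing from `previous` are reported too,
  // on the assumption that a file seen for the first time counts as accessed.
  public static List<String> accessedSince(Map<String, Long> previous,
                                           Map<String, Long> current) {
    List<String> accessed = new ArrayList<>();
    for (Map.Entry<String, Long> entry : current.entrySet()) {
      Long before = previous.get(entry.getKey());
      if (before == null || entry.getValue() > before) {
        accessed.add(entry.getKey());
      }
    }
    Collections.sort(accessed);
    return accessed;
  }

  public static void main(String[] args) {
    // Made-up snapshots; the values are atimes in milliseconds since the epoch.
    Map<String, Long> previous = new HashMap<>();
    previous.put("/data/a.txt", 1299450000000L);
    previous.put("/data/b.txt", 1299450000000L);

    Map<String, Long> current = new HashMap<>();
    current.put("/data/a.txt", 1299453600000L); // read since the last snapshot
    current.put("/data/b.txt", 1299450000000L); // unchanged

    System.out.println(accessedSince(previous, current));
  }
}
```

This only tells you that a file was read at least once between snapshots,
not how often or by whom, which is why a change log on the NameNode side
would be the better fit for the original question.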

On Mon, Mar 7, 2011 at 4:00 AM, Gautam Singaraju
<ga...@gmail.com> wrote:
> Hi,
>
> Is there a mechanism to get the list of files accessed on HDFS at the
> NameNode?
> Thanks!
> ---
> Gautam
>



-- 
Harsh J
www.harshj.com

Re: File access pattern on HDFS?

Posted by icebergs <hk...@gmail.com>.
hadoop fs -ls

or use listStatus; here is an example from "Hadoop: The Definitive Guide":

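import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

// Prints every entry found under the paths given on the command line.
public class ListStatus {
  public static void main(String[] args) throws Exception {
    // The first argument also selects the filesystem (e.g. an hdfs:// URI).
    String uri = args[0];
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(uri), conf);
    Path[] paths = new Path[args.length];
    for (int i = 0; i < paths.length; i++) {
      paths[i] = new Path(args[i]);
    }
    // listStatus accepts several paths and returns their combined listing.
    FileStatus[] status = fs.listStatus(paths);
    Path[] listedPaths = FileUtil.stat2Paths(status);
    for (Path p : listedPaths) {
      System.out.println(p);
    }
  }
}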

2011/3/7 Gautam Singaraju <ga...@gmail.com>

> Hi,
>
> Is there a mechanism to get the list of files accessed on HDFS at the
> NameNode?
> Thanks!
> ---
> Gautam
>