Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2006/11/14 00:25:38 UTC

[jira] Created: (HADOOP-713) dfs list operation is too expensive

dfs list operation is too expensive
-----------------------------------

                 Key: HADOOP-713
                 URL: http://issues.apache.org/jira/browse/HADOOP-713
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
    Affects Versions: 0.8.0
            Reporter: Hairong Kuang


A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.

The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.

To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.
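
As a rough sketch of the proposal, the list RPC could take a boolean alongside the path. The names below (a getListing overload and a computeContentsLen flag) are assumptions for illustration, not the committed interface:

    // Sketch of the proposed flag on the list request; names are
    // hypothetical, not the actual ClientProtocol code.
    public interface ListingSketch {
        // Plain listing: contentsLen for directory entries is left unset.
        DFSFileInfo[] getListing(String src);

        // Listing with subtree sizes: contentsLen is computed
        // server-side only when computeContentsLen is true.
        DFSFileInfo[] getListing(String src, boolean computeContentsLen);
    }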

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

RE: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Currently a subtree size is computed on the name node side as a side effect
of the list command. Did you mean that we should have a separate command?

Hairong

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Thursday, November 16, 2006 10:29 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
expensive

Hairong Kuang wrote:
> "Dfs -du" is implemented using list requests. So I propose that we 
> support two types of list: one computing subtree size and one not.

Put another way: there should be a namenode method that computes a subtree
size: we don't want to compute these client-side.

Doug



Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Doug Cutting <cu...@apache.org>.
Hairong Kuang wrote:
> "Dfs -du" is implemented using list requests. So I propose that we support
> two types of list: one computing subtree size and one not.

Put another way: there should be a namenode method that computes a 
subtree size: we don't want to compute these client-side.

Doug
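
As a minimal sketch of such a namenode method, assuming the getContentLength name that later comments on this issue converge on; the signature is an assumption, not the committed code:

    import java.io.IOException;

    // A dedicated RPC so that du-style queries cost one call, with the
    // recursion done on the namenode rather than on the client.
    public interface SubtreeSizeSketch {
        // Total size in bytes of all files in the subtree rooted at src.
        long getContentLength(String src) throws IOException;
    }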

RE: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
"Dfs -du" is implemented using list requests. So I propose that we support
two types of list: one computing subtree size and one not.

Hairong

-----Original Message-----
From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com] 
Sent: Wednesday, November 15, 2006 1:11 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
expensive

It is not free.  As I understand it, we are recursively walking the
namespace tree with every ls to get this.

This is not a scalable design.  Even POSIX doesn't do this!

This is a performance problem that will only get worse.  I suggest removing
this performance mistake and documenting the existence of dfs -du, which is
a rather familiar solution to most users.

On Nov 15, 2006, at 12:19 PM, Yoram Arnon wrote:

>  I opt for displaying the size in bytes for now, since it's computed 
> anyway, is readily available for free, and improves the UI.
> If/when we fix HADOOP-713 we can replace the computation of size with 
> a better value for #files.
> Let's not prevent an improvement just because it might change in the 
> future.
> Yoram
>
>> -----Original Message-----
>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>> Sent: Tuesday, November 14, 2006 7:10 PM
>> To: hadoop-dev@lucene.apache.org
>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too 
>> expensive
>>
>> So let's display nothing for now and revisit this once we have a 
>> cleaner CRC story.
>>
>>
>> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
>>
>>> Setting the size of a directory to be the # of files is a good idea. 
>>> But the problem is that dfs name node has no idea of checksum
>> files. So the
>>> number
>>> of files includes the checksum files. But what's displayed at the
>>> client side has filtered out the checksum files. So the # of files 
>>> does not match what's really displayed at the client side.
>>>
>>> Hairong
>>>
>>> -----Original Message-----
>>> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
>>> Sent: Monday, November 13, 2006 5:07 PM
>>> To: hadoop-dev@lucene.apache.org
>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too 
>>> expensive
>>>
>>> When listing a directory, for directory entries it may be more 
>>> useful to display the number of files in a directory, rather than 
>>> the number of bytes used by all the files in the directory and its 
>>> subdirectories.
>>> This is a subjective opinion -- comments?
>>>
>>> (Currently, the value displayed for a subdirectory is "0")
>>>
>>> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>>>
>>>> dfs list operation is too expensive
>>>> -----------------------------------
>>>>
>>>>                  Key: HADOOP-713
>>>>                  URL:
>> http://issues.apache.org/jira/browse/HADOOP-713
>>>>              Project: Hadoop
>>>>           Issue Type: Improvement
>>>>           Components: dfs
>>>>     Affects Versions: 0.8.0
>>>>             Reporter: Hairong Kuang
>>>>
>>>>
>>>> A list request to dfs returns an array of DFSFileInfo. A
>> DFSFileInfo
>>>> of a directory contains a field called contentsLen, indicating its 
>>>> size, which gets computed on the namenode side by recursively going
>>>> through its subdirs. At the same time, the whole dfs directory tree 
>>>> is locked.
>>>>
>>>> The list operation is used a lot by DFSClient for listing a 
>>>> directory, getting a file's size and # of replicas, and getting the
>> size of dfs.
>>>> Only the last operation needs the field contentsLen to be computed.
>>>>
>>>> To reduce its cost, we can add a flag to the list request.
>>>> ContentsLen
>>>> is computed if the flag is set. By default, the flag is false.
>>>>
>>>> --
>>>> This message is automatically generated by JIRA.
>>>> -
>>>> If you think it was sent incorrectly contact one of the
>>>> administrators:
>>>> http://issues.apache.org/jira/secure/Administrators.jspa
>>>> -
>>>> For more information on JIRA, see:
>>>> http://www.atlassian.com/software/jira
>>>>
>>>>
>>>
>>>
>>
>>
>



Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
I've got to disagree with that.  It's simply not responsible to add
features that are not going to be easy to support as we near our
medium-term size targets.  Especially not in a central interface.
I'd welcome anyone who wants to contribute an expanded viewer that
lists recursive sizes, guesses dates, and does other cool things.

I'd just not support adding it to the project core or into the  
primary HDFS browse API that is built into all name and data nodes.   
We want these daemons to be simple, fast and stable.

That would commit us to either:

a) Taking a feature away later to ease scaling the system.  Always  
hard to do.

b) Doing a lot of extra engineering to make this fast later, even  
though it is not central to the framework.


On Nov 15, 2006, at 11:29 PM, Arkady Borkovsky wrote:

> Eric,
>
> there is some difference between the foundation components and
> user-facing components like UI.
> While the foundation is expected to be stable and compatible from
> release to release,
> the UI is expected to evolve, continuously becoming more and more
> useful and powerful, and providing as much useful functionality
> at any given time as possible.  The specific issue discussed in
> this thread (what data to show in a directory listing for a
> subdirectory) is pretty minor -- the more the better, commensurate
> with resources, and nothing too misleading, is the answer.
> But as a matter of design principles, user-facing components have a
> different nature than the infrastructure.  There can be no "time
> bomb" in the UI.
>
> -- ab
...

Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Arkady Borkovsky <ar...@yahoo-inc.com>.
Eric,

there is some difference between the foundation components and
user-facing components like UI.
While the foundation is expected to be stable and compatible from release
to release,
the UI is expected to evolve, continuously becoming more and more useful
and powerful, and providing as much useful functionality at any
given time as possible.  The specific issue discussed in this thread
(what data to show in a directory listing for a subdirectory) is pretty
minor -- the more the better, commensurate with resources, and nothing
too misleading, is the answer.
But as a matter of design principles, user-facing components have a
different nature than the infrastructure.  There can be no "time bomb"
in the UI.

-- ab

On Nov 15, 2006, at 10:23 PM, Eric Baldeschwieler wrote:

> Come on.  This is a time bomb.  Let's fix it.  Let's not wire it into 
> our web UI.  That makes tree browsing dangerously expensive and sets 
> us up to have users expect this misfeature to be supported.
>
> The goal is to keep things simple.  Expanding the deployment of 
> unsustainable / unscalable features is a distraction.
>
> Name node lockups are hardly a hypothetical problem for us.
>
> -1
>
>
> On Nov 15, 2006, at 2:06 PM, Yoram Arnon wrote:
>
>> I agree with all that, except that that's how the ls command works 
>> now,
>> performance issues and all, and that will change only when we fix
>> HADOOP-713. Until then, using that field is free - it's being computed
>> anyway.
>>
>> That said, HADOOP-713 is not a current pain point. Users running ls is
>> pretty
>> much a non-issue, since it's a rare operation, and it takes a
>> fraction of a
>> second on the name node with our largish dfs. M-R jobs don't really 
>> pay a
>> penalty for this behaviour, since they normally execute on the last 
>> level of
>> the tree anyway, where the current behaviour is desirable.
>> With all that in mind, the bug may stay in the queue for a while, 
>> until more
>> important issues are addressed.
>> Until then, we may as well get a better UI.
>>
>> Yoram
>>
>>> -----Original Message-----
>>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>>> Sent: Wednesday, November 15, 2006 1:11 PM
>>> To: hadoop-dev@lucene.apache.org
>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
>>> is too expensive
>>>
>>> It is not free.  As I understand it, we are recursively walking the
>>> namespace tree with every ls to get this.
>>>
>>> This is not a scalable design.  Even POSIX doesn't do this!
>>>
>>> This is a performance problem that will only get worse.  I suggest
>>> removing this performance mistake and documenting the existence of
>>> dfs -du, which is a rather familiar solution to most users.
>>>
>>> On Nov 15, 2006, at 12:19 PM, Yoram Arnon wrote:
>>>
>>>>  I opt for displaying the size in bytes for now, since it's
>>>> computed anyway,
>>>> is readily available for free, and improves the UI.
>>>> If/when we fix HADOOP-713 we can replace the computation of size
>>>> with a
>>>> better value for #files.
>>>> Let's not prevent an improvement just because it might change in
>>>> the future.
>>>> Yoram
>>>>
>>>>> -----Original Message-----
>>>>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>>>>> Sent: Tuesday, November 14, 2006 7:10 PM
>>>>> To: hadoop-dev@lucene.apache.org
>>>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
>>>>> is too expensive
>>>>>
>>>>> So let's display nothing for now and revisit this once we have a
>>>>> cleaner CRC story.
>>>>>
>>>>>
>>>>> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
>>>>>
>>>>>> Setting the size of a directory to be the # of files is a good
>>>>>> idea. But the
>>>>>> problem is that dfs name node has no idea of checksum
>>>>> files. So the
>>>>>> number
>>>>>> of files includes the checksum files. But what's displayed at
>>>>>> the client
>>>>>> side has filtered out the checksum files. So the # of files does
>>>>>> not match
>>>>>> what's really displayed at the client side.
>>>>>>
>>>>>> Hairong
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
>>>>>> Sent: Monday, November 13, 2006 5:07 PM
>>>>>> To: hadoop-dev@lucene.apache.org
>>>>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list
>>> operation is too
>>>>>> expensive
>>>>>>
>>>>>> When listing a directory, for directory entries it may be more
>>>>>> useful to
>>>>>> display the number of files in a directory, rather than the number
>>>>>> of bytes
>>>>>> used by all the files in the directory and its subdirectories.
>>>>>> This is a subjective opinion -- comments?
>>>>>>
>>>>>> (Currently, the value displayed for a subdirectory is "0")
>>>>>>
>>>>>> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>>>>>>
>>>>>>> dfs list operation is too expensive
>>>>>>> -----------------------------------
>>>>>>>
>>>>>>>                  Key: HADOOP-713
>>>>>>>                  URL:
>>>>> http://issues.apache.org/jira/browse/HADOOP-713
>>>>>>>              Project: Hadoop
>>>>>>>           Issue Type: Improvement
>>>>>>>           Components: dfs
>>>>>>>     Affects Versions: 0.8.0
>>>>>>>             Reporter: Hairong Kuang
>>>>>>>
>>>>>>>
>>>>>>> A list request to dfs returns an array of DFSFileInfo. A
>>>>> DFSFileInfo
>>>>>>> of a directory contains a field called contentsLen,
>>> indicating its
>>>>>>> size, which gets computed on the namenode side by
>>> recursively going
>>>>>>> through its subdirs. At the same time, the whole dfs directory
>>>>>>> tree is
>>>>>>> locked.
>>>>>>>
>>>>>>> The list operation is used a lot by DFSClient for listing a
>>>>>>> directory,
>>>>>>> getting a file's size and # of replicas, and getting the
>>>>> size of dfs.
>>>>>>> Only the last operation needs the field contentsLen to
>>> be computed.
>>>>>>>
>>>>>>> To reduce its cost, we can add a flag to the list request.
>>>>>>> ContentsLen
>>>>>>> is computed if the flag is set. By default, the flag is false.
>>>>>>>
>>>>>>> --
>>>>>>> This message is automatically generated by JIRA.
>>>>>>> -
>>>>>>> If you think it was sent incorrectly contact one of the
>>>>>>> administrators:
>>>>>>> http://issues.apache.org/jira/secure/Administrators.jspa
>>>>>>> -
>>>>>>> For more information on JIRA, see:
>>>>>>> http://www.atlassian.com/software/jira
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>


Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
Come on.  This is a time bomb.  Let's fix it.  Let's not wire it into  
our web UI.  That makes tree browsing dangerously expensive and sets  
us up to have users expect this misfeature to be supported.

The goal is to keep things simple.  Expanding the deployment of  
unsustainable / unscalable features is a distraction.

Name node lockups are hardly a hypothetical problem for us.

-1


On Nov 15, 2006, at 2:06 PM, Yoram Arnon wrote:

> I agree with all that, except that that's how the ls command works  
> now,
> performance issues and all, and that will change only when we fix
> HADOOP-713. Until then, using that field is free - it's being computed
> anyway.
>
> That said, HADOOP-713 is not a current pain point. Users running ls is
> pretty
> much a non-issue, since it's a rare operation, and it takes a
> fraction of a
> second on the name node with our largish dfs. M-R jobs don't really  
> pay a
> penalty for this behaviour, since they normally execute on the last  
> level of
> the tree anyway, where the current behaviour is desirable.
> With all that in mind, the bug may stay in the queue for a while,  
> until more
> important issues are addressed.
> Until then, we may as well get a better UI.
>
> Yoram
>
>> -----Original Message-----
>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>> Sent: Wednesday, November 15, 2006 1:11 PM
>> To: hadoop-dev@lucene.apache.org
>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
>> is too expensive
>>
>> It is not free.  As I understand it, we are recursively walking the
>> namespace tree with every ls to get this.
>>
>> This is not a scalable design.  Even POSIX doesn't do this!
>>
>> This is a performance problem that will only get worse.  I suggest
>> removing this performance mistake and documenting the existence of
>> dfs -du, which is a rather familiar solution to most users.
>>
>> On Nov 15, 2006, at 12:19 PM, Yoram Arnon wrote:
>>
>>>  I opt for displaying the size in bytes for now, since it's
>>> computed anyway,
>>> is readily available for free, and improves the UI.
>>> If/when we fix HADOOP-713 we can replace the computation of size
>>> with a
>>> better value for #files.
>>> Let's not prevent an improvement just because it might change in
>>> the future.
>>> Yoram
>>>
>>>> -----Original Message-----
>>>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>>>> Sent: Tuesday, November 14, 2006 7:10 PM
>>>> To: hadoop-dev@lucene.apache.org
>>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
>>>> is too expensive
>>>>
>>>> So let's display nothing for now and revisit this once we have a
>>>> cleaner CRC story.
>>>>
>>>>
>>>> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
>>>>
>>>>> Setting the size of a directory to be the # of files is a good
>>>>> idea. But the
>>>>> problem is that dfs name node has no idea of checksum
>>>> files. So the
>>>>> number
>>>>> of files includes the checksum files. But what's displayed at
>>>>> the client
>>>>> side has filtered out the checksum files. So the # of files does
>>>>> not match
>>>>> what's really displayed at the client side.
>>>>>
>>>>> Hairong
>>>>>
>>>>> -----Original Message-----
>>>>> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
>>>>> Sent: Monday, November 13, 2006 5:07 PM
>>>>> To: hadoop-dev@lucene.apache.org
>>>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list
>> operation is too
>>>>> expensive
>>>>>
>>>>> When listing a directory, for directory entries it may be more
>>>>> useful to
>>>>> display the number of files in a directory, rather than the number
>>>>> of bytes
>>>>> used by all the files in the directory and its subdirectories.
>>>>> This is a subjective opinion -- comments?
>>>>>
>>>>> (Currently, the value displayed for a subdirectory is "0")
>>>>>
>>>>> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>>>>>
>>>>>> dfs list operation is too expensive
>>>>>> -----------------------------------
>>>>>>
>>>>>>                  Key: HADOOP-713
>>>>>>                  URL:
>>>> http://issues.apache.org/jira/browse/HADOOP-713
>>>>>>              Project: Hadoop
>>>>>>           Issue Type: Improvement
>>>>>>           Components: dfs
>>>>>>     Affects Versions: 0.8.0
>>>>>>             Reporter: Hairong Kuang
>>>>>>
>>>>>>
>>>>>> A list request to dfs returns an array of DFSFileInfo. A
>>>> DFSFileInfo
>>>>>> of a directory contains a field called contentsLen,
>> indicating its
>>>>>> size, which gets computed on the namenode side by
>> recursively going
>>>>>> through its subdirs. At the same time, the whole dfs directory
>>>>>> tree is
>>>>>> locked.
>>>>>>
>>>>>> The list operation is used a lot by DFSClient for listing a
>>>>>> directory,
>>>>>> getting a file's size and # of replicas, and getting the
>>>> size of dfs.
>>>>>> Only the last operation needs the field contentsLen to
>> be computed.
>>>>>>
>>>>>> To reduce its cost, we can add a flag to the list request.
>>>>>> ContentsLen
>>>>>> is computed if the flag is set. By default, the flag is false.
>>>>>>
>>>>>> --
>>>>>> This message is automatically generated by JIRA.
>>>>>> -
>>>>>> If you think it was sent incorrectly contact one of the
>>>>>> administrators:
>>>>>> http://issues.apache.org/jira/secure/Administrators.jspa
>>>>>> -
>>>>>> For more information on JIRA, see:
>>>>>> http://www.atlassian.com/software/jira
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>


RE: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Yoram Arnon <ya...@yahoo-inc.com>.
I agree with all that, except that that's how the ls command works now,
performance issues and all, and that will change only when we fix
HADOOP-713. Until then, using that field is free - it's being computed
anyway.

That said, HADOOP-713 is not a current pain point. Users running ls is pretty
much a non-issue, since it's a rare operation, and it takes a fraction of a
second on the name node with our largish dfs. M-R jobs don't really pay a
penalty for this behaviour, since they normally execute on the last level of
the tree anyway, where the current behaviour is desirable. 
With all that in mind, the bug may stay in the queue for a while, until more
important issues are addressed.
Until then, we may as well get a better UI.

Yoram

> -----Original Message-----
> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com] 
> Sent: Wednesday, November 15, 2006 1:11 PM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation 
> is too expensive
> 
> It is not free.  As I understand it, we are recursively walking the  
> namespace tree with every ls to get this.
> 
> This is not a scalable design.  Even POSIX doesn't do this!
> 
> This is a performance problem that will only get worse.  I suggest  
> removing this performance mistake and documenting the existence of  
> dfs -du, which is a rather familiar solution to most users.
> 
> On Nov 15, 2006, at 12:19 PM, Yoram Arnon wrote:
> 
> >  I opt for displaying the size in bytes for now, since it's  
> > computed anyway,
> > is readily available for free, and improves the UI.
> > If/when we fix HADOOP-713 we can replace the computation of size  
> > with a
> > better value for #files.
> > Let's not prevent an improvement just because it might change in  
> > the future.
> > Yoram
> >
> >> -----Original Message-----
> >> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
> >> Sent: Tuesday, November 14, 2006 7:10 PM
> >> To: hadoop-dev@lucene.apache.org
> >> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
> >> is too expensive
> >>
> >> So let's display nothing for now and revisit this once we have a
> >> cleaner CRC story.
> >>
> >>
> >> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
> >>
> >>> Setting the size of a directory to be the # of files is a good
> >>> idea. But the
> >>> problem is that dfs name node has no idea of checksum
> >> files. So the
> >>> number
> >>> of files includes the checksum files. But what's displayed at
> >>> the client
> >>> side has filtered out the checksum files. So the # of files does
> >>> not match
> >>> what's really displayed at the client side.
> >>>
> >>> Hairong
> >>>
> >>> -----Original Message-----
> >>> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
> >>> Sent: Monday, November 13, 2006 5:07 PM
> >>> To: hadoop-dev@lucene.apache.org
> >>> Subject: Re: [jira] Created: (HADOOP-713) dfs list 
> operation is too
> >>> expensive
> >>>
> >>> When listing a directory, for directory entries it may be more
> >>> useful to
> >>> display the number of files in a directory, rather than the number
> >>> of bytes
> >>> used by all the files in the directory and its subdirectories.
> >>> This is a subjective opinion -- comments?
> >>>
> >>> (Currently, the value displayed for a subdirectory is "0")
> >>>
> >>> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
> >>>
> >>>> dfs list operation is too expensive
> >>>> -----------------------------------
> >>>>
> >>>>                  Key: HADOOP-713
> >>>>                  URL:
> >> http://issues.apache.org/jira/browse/HADOOP-713
> >>>>              Project: Hadoop
> >>>>           Issue Type: Improvement
> >>>>           Components: dfs
> >>>>     Affects Versions: 0.8.0
> >>>>             Reporter: Hairong Kuang
> >>>>
> >>>>
> >>>> A list request to dfs returns an array of DFSFileInfo. A
> >> DFSFileInfo
> >>>> of a directory contains a field called contentsLen, 
> indicating its
> >>>> size, which gets computed on the namenode side by
> recursively going
> >>>> through its subdirs. At the same time, the whole dfs directory
> >>>> tree is
> >>>> locked.
> >>>>
> >>>> The list operation is used a lot by DFSClient for listing a
> >>>> directory,
> >>>> getting a file's size and # of replicas, and getting the
> >> size of dfs.
> >>>> Only the last operation needs the field contentsLen to 
> be computed.
> >>>>
> >>>> To reduce its cost, we can add a flag to the list request.
> >>>> ContentsLen
> >>>> is computed if the flag is set. By default, the flag is false.
> >>>>
> >>>> --
> >>>> This message is automatically generated by JIRA.
> >>>> -
> >>>> If you think it was sent incorrectly contact one of the
> >>>> administrators:
> >>>> http://issues.apache.org/jira/secure/Administrators.jspa
> >>>> -
> >>>> For more information on JIRA, see:
> >>>> http://www.atlassian.com/software/jira
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> 
> 


Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
It is not free.  As I understand it, we are recursively walking the  
namespace tree with every ls to get this.

This is not a scalable design.  Even POSIX doesn't do this!

This is a performance problem that will only get worse.  I suggest  
removing this performance mistake and documenting the existence of  
dfs -du, which is a rather familiar solution to most users.

On Nov 15, 2006, at 12:19 PM, Yoram Arnon wrote:

>  I opt for displaying the size in bytes for now, since it's  
> computed anyway,
> is readily available for free, and improves the UI.
> If/when we fix HADOOP-713 we can replace the computation of size  
> with a
> better value for #files.
> Let's not prevent an improvement just because it might change in  
> the future.
> Yoram
>
>> -----Original Message-----
>> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com]
>> Sent: Tuesday, November 14, 2006 7:10 PM
>> To: hadoop-dev@lucene.apache.org
>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation
>> is too expensive
>>
>> So let's display nothing for now and revisit this once we have a
>> cleaner CRC story.
>>
>>
>> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
>>
>>> Setting the size of a directory to be the # of files is a good
>>> idea. But the
>>> problem is that dfs name node has no idea of checksum
>> files. So the
>>> number
>>> of files includes the checksum files. But what's displayed at
>>> the client
>>> side has filtered out the checksum files. So the # of files does
>>> not match
>>> what's really displayed at the client side.
>>>
>>> Hairong
>>>
>>> -----Original Message-----
>>> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
>>> Sent: Monday, November 13, 2006 5:07 PM
>>> To: hadoop-dev@lucene.apache.org
>>> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
>>> expensive
>>>
>>> When listing a directory, for directory entries it may be more
>>> useful to
>>> display the number of files in a directory, rather than the number
>>> of bytes
>>> used by all the files in the directory and its subdirectories.
>>> This is a subjective opinion -- comments?
>>>
>>> (Currently, the value displayed for a subdirectory is "0")
>>>
>>> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>>>
>>>> dfs list operation is too expensive
>>>> -----------------------------------
>>>>
>>>>                  Key: HADOOP-713
>>>>                  URL:
>> http://issues.apache.org/jira/browse/HADOOP-713
>>>>              Project: Hadoop
>>>>           Issue Type: Improvement
>>>>           Components: dfs
>>>>     Affects Versions: 0.8.0
>>>>             Reporter: Hairong Kuang
>>>>
>>>>
>>>> A list request to dfs returns an array of DFSFileInfo. A
>> DFSFileInfo
>>>> of a directory contains a field called contentsLen, indicating its
>>>> size, which gets computed on the namenode side by recursively going
>>>> through its subdirs. At the same time, the whole dfs directory
>>>> tree is
>>>> locked.
>>>>
>>>> The list operation is used a lot by DFSClient for listing a
>>>> directory,
>>>> getting a file's size and # of replicas, and getting the
>> size of dfs.
>>>> Only the last operation needs the field contentsLen to be computed.
>>>>
>>>> To reduce its cost, we can add a flag to the list request.
>>>> ContentsLen
>>>> is computed if the flag is set. By default, the flag is false.
>>>>
>>>> --
>>>> This message is automatically generated by JIRA.
>>>> -
>>>> If you think it was sent incorrectly contact one of the
>>>> administrators:
>>>> http://issues.apache.org/jira/secure/Administrators.jspa
>>>> -
>>>> For more information on JIRA, see:
>>>> http://www.atlassian.com/software/jira
>>>>
>>>>
>>>
>>>
>>
>>
>


RE: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Yoram Arnon <ya...@yahoo-inc.com>.
 I opt for displaying the size in bytes for now, since it's computed anyway,
is readily available for free, and improves the UI.
If/when we fix HADOOP-713 we can replace the computation of size with a
better value for #files.
Let's not prevent an improvement just because it might change in the future.
Yoram

> -----Original Message-----
> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com] 
> Sent: Tuesday, November 14, 2006 7:10 PM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation 
> is too expensive
> 
> So let's display nothing for now and revisit this once we have a  
> cleaner CRC story.
> 
> 
> On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:
> 
> > Setting the size of a directory to be the # of files is a good  
> > idea. But the
> > problem is that dfs name node has no idea of checksum 
> files. So the  
> > number
> > of files includes the checksum files. But what's displayed at
> > the client
> > side has filtered out the checksum files. So the # of files does  
> > not match
> > what's really displayed at the client side.
> >
> > Hairong
> >
> > -----Original Message-----
> > From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
> > Sent: Monday, November 13, 2006 5:07 PM
> > To: hadoop-dev@lucene.apache.org
> > Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
> > expensive
> >
> > When listing a directory, for directory entries it may be more  
> > useful to
> > display the number of files in a directory, rather than the number  
> > of bytes
> > used by all the files in the directory and its subdirectories.
> > This is a subjective opinion -- comments?
> >
> > (Currently, the value displayed for a subdirectory is "0")
> >
> > On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
> >
> >> dfs list operation is too expensive
> >> -----------------------------------
> >>
> >>                  Key: HADOOP-713
> >>                  URL: 
> http://issues.apache.org/jira/browse/HADOOP-713
> >>              Project: Hadoop
> >>           Issue Type: Improvement
> >>           Components: dfs
> >>     Affects Versions: 0.8.0
> >>             Reporter: Hairong Kuang
> >>
> >>
> >> A list request to dfs returns an array of DFSFileInfo. A 
> DFSFileInfo
> >> of a directory contains a field called contentsLen, indicating its
> >> size, which gets computed on the namenode side by recursively going
> >> through its subdirs. At the same time, the whole dfs directory  
> >> tree is
> >> locked.
> >>
> >> The list operation is used a lot by DFSClient for listing a  
> >> directory,
> >> getting a file's size and # of replicas, and getting the 
> size of dfs.
> >> Only the last operation needs the field contentsLen to be computed.
> >>
> >> To reduce its cost, we can add a flag to the list request.  
> >> ContentsLen
> >> is computed if the flag is set. By default, the flag is false.
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> -
> >> If you think it was sent incorrectly contact one of the
> >> administrators:
> >> http://issues.apache.org/jira/secure/Administrators.jspa
> >> -
> >> For more information on JIRA, see:
> >> http://www.atlassian.com/software/jira
> >>
> >>
> >
> >
> 
> 


Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
So let's display nothing for now and revisit this once we have a  
cleaner CRC story.


On Nov 14, 2006, at 10:55 AM, Hairong Kuang wrote:

> Setting the size of a directory to be the # of files is a good  
> idea. But the
> problem is that dfs name node has no idea of checksum files. So the  
> number
> of files includes the checksum files. But what's displayed at
> the client
> side has filtered out the checksum files. So the # of files does  
> not match
> what's really displayed at the client side.
>
> Hairong
>
> -----Original Message-----
> From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com]
> Sent: Monday, November 13, 2006 5:07 PM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
> expensive
>
> When listing a directory, for directory entries it may be more  
> useful to
> display the number of files in a directory, rather than the number  
> of bytes
> used by all the files in the directory and its subdirectories.
> This is a subjective opinion -- comments?
>
> (Currently, the value displayed for a subdirectory is "0")
>
> On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
>
>> dfs list operation is too expensive
>> -----------------------------------
>>
>>                  Key: HADOOP-713
>>                  URL: http://issues.apache.org/jira/browse/HADOOP-713
>>              Project: Hadoop
>>           Issue Type: Improvement
>>           Components: dfs
>>     Affects Versions: 0.8.0
>>             Reporter: Hairong Kuang
>>
>>
>> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo
>> of a directory contains a field called contentsLen, indicating its
>> size, which gets computed on the namenode side by recursively going
>> through its subdirs. At the same time, the whole dfs directory  
>> tree is
>> locked.
>>
>> The list operation is used a lot by DFSClient for listing a  
>> directory,
>> getting a file's size and # of replicas, and getting the size of dfs.
>> Only the last operation needs the field contentsLen to be computed.
>>
>> To reduce its cost, we can add a flag to the list request.  
>> ContentsLen
>> is computed if the flag is set. By default, the flag is false.
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> If you think it was sent incorrectly contact one of the
>> administrators:
>> http://issues.apache.org/jira/secure/Administrators.jspa
>> -
>> For more information on JIRA, see:
>> http://www.atlassian.com/software/jira
>>
>>
>
>


RE: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Setting the size of a directory to be the # of files is a good idea. But the
problem is that the dfs name node has no idea of checksum files. So the number
of files includes the checksum files. But what's displayed at the client
side has the checksum files filtered out. So the # of files does not match
what's really displayed at the client side.

Hairong 
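
To make the mismatch concrete, here is a small sketch, assuming the ".name.crc" naming convention that Hadoop's checksum files used at the time:

    // The namenode counts every entry, but the client hides checksum
    // files, so its file count would be smaller than the namenode's.
    static boolean isChecksumFile(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    static int visibleFileCount(String[] entryNames) {
        int visible = 0;
        for (String name : entryNames) {
            if (!isChecksumFile(name)) {
                visible++;      // checksum files are filtered client-side
            }
        }
        return visible;         // <= the namenode's raw entry count
    }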

-----Original Message-----
From: Arkady Borkovsky [mailto:arkady@yahoo-inc.com] 
Sent: Monday, November 13, 2006 5:07 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: [jira] Created: (HADOOP-713) dfs list operation is too
expensive

When listing a directory, for directory entries it may be more useful to
display the number of files in a directory, rather than the number of bytes
used by all the files in the directory and its subdirectories.
This is a subjective opinion -- comments?

(Currently, the value displayed for a subdirectory is "0")

On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:

> dfs list operation is too expensive
> -----------------------------------
>
>                  Key: HADOOP-713
>                  URL: http://issues.apache.org/jira/browse/HADOOP-713
>              Project: Hadoop
>           Issue Type: Improvement
>           Components: dfs
>     Affects Versions: 0.8.0
>             Reporter: Hairong Kuang
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo 
> of a directory contains a field called contentsLen, indicating its 
> size, which gets computed on the namenode side by recursively going
> through its subdirs. At the same time, the whole dfs directory tree is 
> locked.
>
> The list operation is used a lot by DFSClient for listing a directory, 
> getting a file's size and # of replicas, and getting the size of dfs.
> Only the last operation needs the field contentsLen to be computed.
>
> To reduce its cost, we can add a flag to the list request. ContentsLen 
> is computed if the flag is set.
>
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the
> administrators: 
> http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see: 
> http://www.atlassian.com/software/jira
>
>



Re: [jira] Created: (HADOOP-713) dfs list operation is too expensive

Posted by Arkady Borkovsky <ar...@yahoo-inc.com>.
When listing a directory, for directory entries it may be more useful 
to display the number of files in a directory, rather than the number 
of bytes used by all the files in the directory and its subdirectories.
This is a subjective opinion -- comments?

(Currently, the value displayed for a subdirectory is "0")

On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:

> dfs list operation is too expensive
> -----------------------------------
>
>                  Key: HADOOP-713
>                  URL: http://issues.apache.org/jira/browse/HADOOP-713
>              Project: Hadoop
>           Issue Type: Improvement
>           Components: dfs
>     Affects Versions: 0.8.0
>             Reporter: Hairong Kuang
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo 
> of a directory contains a field called contentsLen, indicating its 
> size, which gets computed on the namenode side by recursively going
> through its subdirs. At the same time, the whole dfs directory tree is 
> locked.
>
> The list operation is used a lot by DFSClient for listing a directory, 
> getting a file's size and # of replicas, and getting the size of dfs. 
> Only the last operation needs the field contentsLen to be computed.
>
> To reduce its cost, we can add a flag to the list request. ContentsLen 
> is computed if the flag is set. By default, the flag is false.
>
> -- 
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the 
> administrators: 
> http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see: 
> http://www.atlassian.com/software/jira
>
>


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542296 ] 

Sameer Paranjpye commented on HADOOP-713:
-----------------------------------------

> I think it worked that way at one time in the past, and was found to put too much RPC load on the namenode

True, but at the time I think we were making a getLength RPC for every file encountered. With the new listStatus API we could get the sizes of all the files in a directory and then recursively call listStatus for the subdirectories. This would be a significantly lower RPC load than an invocation per file.

One observation to take into account is that 'ls -r' recurses on the client side, makes a call per directory in a tree, and is a pretty frequent operation (certainly more frequent than du). Since 'ls -r' doesn't appear to overburden the Namenode with RPCs, it feels like that ought to be true for du as well.
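
A sketch of the call pattern described above, with one listStatus RPC per directory and file sizes summed client-side; it follows the FileSystem API shape but is an illustration, not the eventual patch:

    import java.io.IOException;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // du via listStatus: one RPC per directory, the same pattern as
    // 'ls -r', instead of one getLength RPC per file.
    static long du(FileSystem fs, Path dir) throws IOException {
        long total = 0;
        for (FileStatus stat : fs.listStatus(dir)) {
            if (stat.isDir()) {
                total += du(fs, stat.getPath());  // recurse per directory
            } else {
                total += stat.getLen();           // size comes with the listing
            }
        }
        return total;
    }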

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by Christian Kunz <ck...@yahoo-inc.com>.
Assigned it to 0.15.1

-Christian 

-----Original Message-----
From: Nigel Daley [mailto:ndaley@yahoo-inc.com] 
Sent: Monday, November 12, 2007 11:57 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: [jira] Updated: (HADOOP-713) dfs list operation is too
expensive

Blocker for what release?  Please assign to a release.

On Nov 12, 2007, at 5:51 PM, Christian Kunz (JIRA) wrote:

>
>      [ https://issues.apache.org/jira/browse/HADOOP-713? 
> page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Christian Kunz updated HADOOP-713:
> ----------------------------------
>
>     Priority: Blocker  (was: Major)
>
> Changing to blocker based on conversation with Sameer.
>
>> dfs list operation is too expensive
>> -----------------------------------
>>
>>                 Key: HADOOP-713
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>>             Project: Hadoop
>>          Issue Type: Improvement
>>          Components: dfs
>>    Affects Versions: 0.8.0
>>            Reporter: Hairong Kuang
>>            Assignee: Hairong Kuang
>>            Priority: Blocker
>>
>> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo 
>> of a directory contains a field called contentsLen, indicating its 
>> size, which gets computed on the namenode side by recursively going
>> through its subdirs. At the same time, the whole dfs directory tree 
>> is locked.
>> The list operation is used a lot by DFSClient for listing a 
>> directory, getting a file's size and # of replicas, and getting the 
>> size of dfs. Only the last operation needs the field contentsLen to 
>> be computed.
>> To reduce its cost, we can add a flag to the list request.  
>> ContentsLen is computed if the flag is set. By default, the flag is
>> false.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>


Re: [jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by Nigel Daley <nd...@yahoo-inc.com>.
Blocker for what release?  Please assign to a release.

On Nov 12, 2007, at 5:51 PM, Christian Kunz (JIRA) wrote:

>
>      [ https://issues.apache.org/jira/browse/HADOOP-713? 
> page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Christian Kunz updated HADOOP-713:
> ----------------------------------
>
>     Priority: Blocker  (was: Major)
>
> Changing to blocker based on conversation with Sameer.
>
>> dfs list operation is too expensive
>> -----------------------------------
>>
>>                 Key: HADOOP-713
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>>             Project: Hadoop
>>          Issue Type: Improvement
>>          Components: dfs
>>    Affects Versions: 0.8.0
>>            Reporter: Hairong Kuang
>>            Assignee: Hairong Kuang
>>            Priority: Blocker
>>
>> A list request to dfs returns an array of DFSFileInfo. A  
>> DFSFileInfo of a directory contains a field called contentsLen,  
>> indicating its size, which gets computed on the namenode side by
>> recursively going through its subdirs. At the same time, the whole
>> dfs directory tree is locked.
>> The list operation is used a lot by DFSClient for listing a  
>> directory, getting a file's size and # of replicas, and getting  
>> the size of dfs. Only the last operation needs the field  
>> contentsLen to be computed.
>> To reduce its cost, we can add a flag to the list request.  
>> ContentsLen is computed if the flag
>> is set. By default, the flag is false.
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kunz updated HADOOP-713:
----------------------------------

    Priority: Blocker  (was: Major)

Changing to blocker based on conversation with Sameer.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542025 ] 

Christian Kunz commented on HADOOP-713:
---------------------------------------

Running a profiler on the namenode showed that we are spending a lot of our time in getListing and DFSFileInfo.computeContentsLength. This is a high-priority fix for us.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542291 ] 

Doug Cutting commented on HADOOP-713:
-------------------------------------

> The client computes the size of a directory by recursively traversing all nodes in the subtree.

I think it worked that way at one time in the past, and was found to put too much RPC load on the namenode.  When someone wants to know the size of a directory (du -s) it is much more efficient to do the recursion server-side on the namenode.  We should avoid doing it for every directory listing, but we should still do it server-side.
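
For contrast, a sketch of the server-side recursion described above: one RPC, with the walk touching only the namenode's in-memory tree. The INode stand-in below is an assumption for illustration, not the real namenode class:

    // Minimal stand-in for the namenode's in-memory tree node.
    interface INode {
        boolean isDirectory();
        long getFileLength();           // for files
        Iterable<INode> getChildren();  // for directories
    }

    // In-memory walk on the namenode; no per-entry RPCs are needed.
    static long computeContentsLength(INode node) {
        if (!node.isDirectory()) {
            return node.getFileLength();
        }
        long total = 0;
        for (INode child : node.getChildren()) {
            total += computeContentsLength(child);
        }
        return total;
    }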

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543884 ] 

Hudson commented on HADOOP-713:
-------------------------------

Integrated in Hadoop-Nightly #309 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/309/])

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-713:
------------------------------------

    Status: Patch Available  (was: Open)

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-713) dfs list operation is too expensive

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang reassigned HADOOP-713:
------------------------------------

    Assignee: Hairong Kuang  (was: Sameer Paranjpye)

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-713:
------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. 

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which gets computed on the namenode side by recursively going through its subdirs. At the same time, the whole dfs directory tree is locked.
> The list operation is used a lot by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. ContentsLen is computed if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-713:
------------------------------------

    Attachment: optimizeComputeContentLen2.patch

Here is a patch that introduces a new API in the ClientProtocol called getContentSize(path). It returns the size of the entire subtree rooted at path.
This call is used by du to retrieve the size of a directory.

Bumped up the client protocol version.
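
Sketched, the addition looks something like this (the method name is as given in this comment; the signature and the shell-side call are illustrative assumptions, not the committed patch):

    // ClientProtocol addition: one RPC returns the total length of all
    // files in the subtree rooted at 'src', instead of the client
    // issuing one listing RPC per subdirectory.
    long getContentSize(String src) throws IOException;

    // Shell-side 'du' then reduces to a single call, e.g.:
    //   long bytes = namenode.getContentSize("/user/foo");  // path is a placeholder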



> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542635 ] 

dhruba edited comment on HADOOP-713 at 11/14/07 4:21 PM:
-------------------------------------------------------------------

Here is a patch that introduces a new API in the ClientProtocol called getContentLength(path). It returns the size of the entire subtree rooted at path.
This call is used by du to retrieve the size of a directory.

Bumped up the client protocol version.



      was (Author: dhruba):
    Here is a patch that introduces a new API in the ClientProtocol called getContentSize(path). It returns the size of the entire subtree rooted at path.
This call is used by du to retrieve the size of a directory.

Bumped up the client protocol version.


  
> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541997 ] 

dhruba borthakur commented on HADOOP-713:
-----------------------------------------

This is a simple fix and should remove an appreciable amount of load from the namenode. I like the idea of adding a new flag to the getListings call.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kunz updated HADOOP-713:
----------------------------------

    Fix Version/s: 0.15.1

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.15.1
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543656 ] 

Hadoop QA commented on HADOOP-713:
----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12369695/optimizeComputeContentLen3.patch
against trunk revision r595563.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1118/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1118/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1118/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1118/console

This message is automatically generated.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542627 ] 

dhruba borthakur commented on HADOOP-713:
-----------------------------------------

I was assuming that we cannot add a new API in a point release. If you think it is ok, then I can go ahead and do it as part of this JIRA.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-713:
------------------------------------

    Attachment: optimizeComputeContentLen.patch

A DfsPath object for a directory used to carry the summed size of all files inside that directory. This meant that INodeDirectory.computeContentsLength had to recursively traverse all the nodes in the specified subtree to compute the size of the directory, which uses up a lot of CPU on the namenode.

The fix proposed here is that the namenode returns a size of 0 for directories. The client then computes the size of a directory by recursively traversing all nodes in the subtree.
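
In outline, the client-side recursion has the following shape (a sketch against the 0.15-era FileSystem API; treat the exact method names, which were deprecated in later releases, as assumptions):

    // One listPaths RPC per directory; file lengths are summed on the
    // client, so the namenode never recurses over the subtree itself.
    static long sizeOf(FileSystem fs, Path p) throws IOException {
      long total = 0;
      Path[] children = fs.listPaths(p);
      if (children == null) {
        return 0;                       // empty or vanished directory
      }
      for (Path child : children) {
        total += fs.isDirectory(child) ? sizeOf(fs, child)
                                       : fs.getLength(child);
      }
      return total;
    }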



> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543110 ] 

Doug Cutting commented on HADOOP-713:
-------------------------------------

This looks good to me.

Do any unit tests actually exercise this?  They probably call getContentLength() on files, but does any unit test use this on a directory and check that the value is reasonable?  If not, we might add such a test.

One other minor thing: the cast to DfsPath in DistributedFileSystem.java immediately after the changes in the patch can be removed.
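
For the directory case, such a test would boil down to a call of this shape (illustrative; the path is a placeholder and 'fs' is an already-initialized FileSystem):

    // getContentLength on a directory should return the summed length
    // of every file in the subtree, matching what 'du' reports.
    long subtreeBytes = fs.getContentLength(new Path("/user/foo"));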

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542628 ] 

Sameer Paranjpye commented on HADOOP-713:
-----------------------------------------

> If not, we could bump the protocol version, which we normally try to avoid in a point release, but I think it might be better to fail than to give the wrong answer.

If we're bumping the protocol version, we may as well add the new call.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-713:
---------------------------------------

    Assignee: dhruba borthakur  (was: Hairong Kuang)

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542573 ] 

Doug Cutting commented on HADOOP-713:
-------------------------------------

> This would be significantly lower RPC load than an invocation per file.

Yes, but still many more than a single RPC call.  If 'du' requires an RPC per subdirectory, then we should carefully benchmark 'du /' on large filesystems to make sure that it doesn't (a) take too long, or (b) create so much namenode load that other processes are impacted.

> 'ls -r' recurses on the client side, makes a call per directory in a tree and is a pretty frequent operation (certainly more frequent than du).

Hmm.  On directories near the root, where the cost is highest, I use 'du' more frequently than 'ls -r'.



> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542623 ] 

Sameer Paranjpye commented on HADOOP-713:
-----------------------------------------

> A "lsr /" on the same cluster took 13 minutes. The CPU on the namenode increased from its normal of 4% to 10%. We saw garbage collection occuring but the 
> Eden-Heap-Space on the namenode remained within its normal limits.

Note that in this case, 'lsr /' is recursing on both the client and the server side. The client recurses through directories fetching the list of files in each directory with 'listPaths' calls, but each 'listPaths' call causes a recursion on the Namenode all the way to the leaves which is done to compute the size of the directory. 
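
To put a rough model on the double recursion: in a balanced namespace holding N files at directory depth d, each 'listPaths' makes the namenode visit every node below the listed directory, so the full client-side walk performs on the order of N*d node visits under the global namespace lock, versus roughly N visits if the per-directory size computation is skipped. These figures are a simplification for illustration, not measurements.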

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543376 ] 

Hairong Kuang commented on HADOOP-713:
--------------------------------------

+1 The patch looks good.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542616 ] 

dhruba borthakur commented on HADOOP-713:
-----------------------------------------

We experimented on a system that has 6 million files. A "du /" with recursion implemented on the namenode took less than 2 seconds. There was no measurable change in CPU usage or GC on the namenode when this command was run.

A "lsr /" on the same cluster took 13 minutes. The CPU on the namenode increased from its normal of 4% to 10%. We saw garbage collection occurring, but the Eden-Heap-Space on the namenode remained within its normal limits.

Given the above, I propose that we accept this patch into 0.15 and trunk so that the immediate performance bottleneck on the namenode is fixed. I will create a new JIRA describing how du should be implemented at the namenode using one RPC call.



> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542624 ] 

Doug Cutting commented on HADOOP-713:
-------------------------------------

There are back-compatibility issues with this patch.  If servers are upgraded but not clients, then 'du' will not work.  Is this acceptable?  If not, we could bump the protocol version, which we normally try to avoid in a point release, but I think it might be better to fail than to give the wrong answer.

> I propose that we accept this patch into 0.15 and trunk so that the immediate performance bottleneck on the namenode is fixed

How much harder would it be to add the new method to the protocol?  It seems to me the patch wouldn't be much larger...


> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-713) dfs list operation is too expensive

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-713:
------------------------------------

    Attachment: optimizeComputeContentLen3.patch

Created a new test case that triggers the getContentLength API for files and directories.
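
In spirit, such a test case reduces to the following (a condensed sketch, not the committed test; the MiniDFSCluster constructor arguments and the writeFile helper are assumptions):

    // Bring up an in-process one-datanode cluster, write two files under
    // a directory, then verify getContentLength on a file and on the dir.
    MiniDFSCluster cluster = new MiniDFSCluster(new Configuration(), 1, true, null);
    try {
      FileSystem fs = cluster.getFileSystem();
      Path dir = new Path("/testdir");            // hypothetical path
      writeFile(fs, new Path(dir, "a"), 1024);    // helper assumed to write N bytes
      writeFile(fs, new Path(dir, "b"), 2048);
      assertEquals(1024L, fs.getContentLength(new Path(dir, "a")));
      assertEquals(3072L, fs.getContentLength(dir));  // 1024 + 2048
    } finally {
      cluster.shutdown();
    }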

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch, optimizeComputeContentLen2.patch, optimizeComputeContentLen3.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542630 ] 

Doug Cutting commented on HADOOP-713:
-------------------------------------

> I was assuming that we cannot add a new API in a point release.

Well, if we adhere to that rule, how can we fix this in 0.15?  I don't sense folks want to wait for 0.16 for this, nor do I see how to fix it without incrementing the protocol version.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542218 ] 

Hairong Kuang commented on HADOOP-713:
--------------------------------------

Another, cleaner fix is to add a separate client protocol method to get the directory size, so that getListing does not need to compute the directory size.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.15.1
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-713) dfs list operation is too expensive

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542314 ] 

Hairong Kuang commented on HADOOP-713:
--------------------------------------

+1 The code looks good.

> dfs list operation is too expensive
> -----------------------------------
>
>                 Key: HADOOP-713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-713
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: optimizeComputeContentLen.patch
>
>
> A list request to dfs returns an array of DFSFileInfo. A DFSFileInfo of a directory contains a field called contentsLen, indicating its size, which is computed on the namenode side by recursively going through its subdirs. While this computation runs, the whole dfs directory tree is locked.
> The list operation is used heavily by DFSClient for listing a directory, getting a file's size and # of replicas, and getting the size of dfs. Only the last operation needs the field contentsLen to be computed.
> To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the flag is set. By default, the flag is false.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.