Posted to common-dev@hadoop.apache.org by ba...@gmail.com on 2006/03/07 19:40:28 UTC

Question about the "N"DFSClient

I am going through my code and changing over from NDFS to DFS since I
was using Nutch primarily for the DFS. I noticed that the protection
of the DFSClient has changed along with things like DFSFile and
DataNodeInfo. Was the reason solely to make the JavaDoc simpler?

I was using those objects to have a jsp rendered status check, so
having them available would be nice.

-Barry

Re: Question about the "N"DFSClient

Posted by ba...@gmail.com.
Whoops... I didn't realize this was being compiled under 1.4, which
doesn't have the fancy loops.
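For reference, pre-1.5 code has to spell the aggregation out as an indexed loop. A minimal self-contained sketch (a plain long[] stands in here for the DFSFileInfo content lengths; the class and method names are illustrative only):

```java
public class LoopSketch {
    // Java 1.4-compatible aggregation: an indexed for loop instead of
    // the 1.5 enhanced for-each used in the patch below.
    static long totalBytes(long[] contentLengths) {
        long total = 0;
        for (int i = 0; i < contentLengths.length; i++) {
            total += contentLengths[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalBytes(new long[] {100L, 250L, 50L})); // prints 400
    }
}
```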

On 3/7/06, barry.kaplan@gmail.com <ba...@gmail.com> wrote:
> I would like to be able to see the capacity, used, and effective bytes
> of the DFS, as well as a DataNode breakdown listing name, capacity,
> and remaining.
>
> How would you feel about adding the following to DistributedFileSystem:
>
>     public long getCapacity() throws IOException {
>         return dfs.totalRawCapacity();
>     }
>
>     public long getUsed() throws IOException {
>         return dfs.totalRawUsed();
>     }
>
>     public long getEffectiveBytes() throws IOException {
>         long totalEffectiveBytes = 0;
>         DFSFileInfo[] dfsFiles = dfs.listFiles(getPath(new File("/")));
>         for (DFSFileInfo dfsFileInfo : dfsFiles) {
>             totalEffectiveBytes += dfsFileInfo.getContentsLen();
>         }
>         return totalEffectiveBytes;
>     }
>
>     public DatanodeStats[] getDataNodeStats() throws IOException {
>         DatanodeInfo[] dnReport = dfs.datanodeReport();
>         DatanodeStats[] stats = new DatanodeStats[dnReport.length];
>         int i = 0;
>
>         for (DatanodeInfo datanodeInfo : dnReport) {
>             stats[i] = new DatanodeStats();
>             stats[i].name = datanodeInfo.getName();
>             stats[i].capacity = datanodeInfo.getCapacity();
>             stats[i].remaining = datanodeInfo.getRemaining();
>             i++;
>         }
>         return stats;
>     }
>
> On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> > barry.kaplan@gmail.com wrote:
> > > I am going through my code and changing over from NDFS to DFS since I
> > > was using Nutch primarily for the DFS. I noticed that the protection
> > > of the DFSClient has changed along with things like DFSFile and
> > > DataNodeInfo. Was the reason solely to make the JavaDoc simpler?
> >
> > Improving the javadoc is nice, but is a side-effect of trying to
> > separate the public API from the implementation.  A smaller public API
> > makes it easier to evolve the implementation without breaking folks
> > (like you!).
> >
> > > I was using those objects to have a jsp rendered status check, so
> > > having them available would be nice.
> >
> > What methods do you need from DFSClient that are not generically
> > available on FileSystem and/or DistributedFileSystem?
> >
> > Doug
> >
>

Re: Question about the "N"DFSClient

Posted by ba...@gmail.com.
That was my intent, as the two are almost the same. The issue with just
using DataNodeInfo, though, is that write and readFields would
essentially become public via the Writable interface. This raises the
question of why it is called DataNodeInfo rather than DataNodeManager,
as it is not really an "info" object.

I think at the end of the day it probably doesn't matter much, as it
will be impossible for an outside developer to get an instance of
DataNodeInfo under the current design, but it does expose the
methods. I think it would simply be safer, especially for a read-only
stat object like this, to have a new object created.
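To make that concrete, a separate read-only stat object could be as small as the sketch below. The field set follows the proposed patch (name, capacity, remaining); the class name and constructor-based immutable form are hypothetical, one alternative to the patch's public mutable fields:

```java
// A minimal read-only stats holder, as an alternative to exposing
// DataNodeInfo (whose Writable write/readFields methods would otherwise
// become public). Class name and constructor shape are illustrative.
public class DatanodeStatsSketch {
    private final String name;
    private final long capacity;
    private final long remaining;

    public DatanodeStatsSketch(String name, long capacity, long remaining) {
        this.name = name;
        this.capacity = capacity;
        this.remaining = remaining;
    }

    public String getName() { return name; }
    public long getCapacity() { return capacity; }
    public long getRemaining() { return remaining; }

    public static void main(String[] args) {
        DatanodeStatsSketch s = new DatanodeStatsSketch("node1:50010", 1024L, 512L);
        System.out.println(s.getName() + " " + s.getCapacity() + " " + s.getRemaining());
    }
}
```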

-Barry

On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> This looks good to me.  +1
>
> Can you start a bug report and add this as a patch file?
>
> DatanodeStats looks a lot like DatanodeInfo.  Was that your intent?  I
> wouldn't mind if this was a separate class, or we could try to sanitize
> DatanodeInfo, so it does not expose, e.g., the block-related methods.
>
> We'll also need some javadoc comments.  And we should probably re-write
> DFSShell.report() to be implemented in terms of these new methods.
>
> Thanks,
>
> Doug
>
> barry.kaplan@gmail.com wrote:
> > I would like to be able to see the capacity, used, and effective bytes
> > of the DFS, as well as a DataNode breakdown listing name, capacity,
> > and remaining.
> >
> > How would you feel about adding the following to DistributedFileSystem:
> >
> >     public long getCapacity() throws IOException {
> >         return dfs.totalRawCapacity();
> >     }
> >
> >     public long getUsed() throws IOException {
> >         return dfs.totalRawUsed();
> >     }
> >
> >     public long getEffectiveBytes() throws IOException {
> >         long totalEffectiveBytes = 0;
> >         DFSFileInfo[] dfsFiles = dfs.listFiles(getPath(new File("/")));
> >         for (DFSFileInfo dfsFileInfo : dfsFiles) {
> >             totalEffectiveBytes += dfsFileInfo.getContentsLen();
> >         }
> >         return totalEffectiveBytes;
> >     }
> >
> >     public DatanodeStats[] getDataNodeStats() throws IOException {
> >         DatanodeInfo[] dnReport = dfs.datanodeReport();
> >         DatanodeStats[] stats = new DatanodeStats[dnReport.length];
> >         int i = 0;
> >
> >         for (DatanodeInfo datanodeInfo : dnReport) {
> >             stats[i] = new DatanodeStats();
> >             stats[i].name = datanodeInfo.getName();
> >             stats[i].capacity = datanodeInfo.getCapacity();
> >             stats[i].remaining = datanodeInfo.getRemaining();
> >             i++;
> >         }
> >         return stats;
> >     }
> >
> > On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> >
> >>barry.kaplan@gmail.com wrote:
> >>
> >>>I am going through my code and changing over from NDFS to DFS since I
> >>>was using Nutch primarily for the DFS. I noticed that the protection
> >>>of the DFSClient has changed along with things like DFSFile and
> >>>DataNodeInfo. Was the reason solely to make the JavaDoc simpler?
> >>
> >>Improving the javadoc is nice, but is a side-effect of trying to
> >>separate the public API from the implementation.  A smaller public API
> >>makes it easier to evolve the implementation without breaking folks
> >>(like you!).
> >>
> >>
> >>>I was using those objects to have a jsp rendered status check, so
> >>>having them available would be nice.
> >>
> >>What methods do you need from DFSClient that are not generically
> >>available on FileSystem and/or DistributedFileSystem?
> >>
> >>Doug
> >>
>

Re: Question about the "N"DFSClient

Posted by Doug Cutting <cu...@apache.org>.
This looks good to me.  +1

Can you start a bug report and add this as a patch file?

DatanodeStats looks a lot like DatanodeInfo.  Was that your intent?  I 
wouldn't mind if this was a separate class, or we could try to sanitize 
DatanodeInfo, so it does not expose, e.g., the block-related methods.

We'll also need some javadoc comments.  And we should probably re-write 
DFSShell.report() to be implemented in terms of these new methods.

Thanks,

Doug

barry.kaplan@gmail.com wrote:
> I would like to be able to see the capacity, used, and effective bytes
> of the DFS, as well as a DataNode breakdown listing name, capacity,
> and remaining.
> 
> How would you feel about adding the following to DistributedFileSystem:
> 
>     public long getCapacity() throws IOException {
>         return dfs.totalRawCapacity();
>     }
>
>     public long getUsed() throws IOException {
>         return dfs.totalRawUsed();
>     }
>
>     public long getEffectiveBytes() throws IOException {
>         long totalEffectiveBytes = 0;
>         DFSFileInfo[] dfsFiles = dfs.listFiles(getPath(new File("/")));
>         for (DFSFileInfo dfsFileInfo : dfsFiles) {
>             totalEffectiveBytes += dfsFileInfo.getContentsLen();
>         }
>         return totalEffectiveBytes;
>     }
>
>     public DatanodeStats[] getDataNodeStats() throws IOException {
>         DatanodeInfo[] dnReport = dfs.datanodeReport();
>         DatanodeStats[] stats = new DatanodeStats[dnReport.length];
>         int i = 0;
>
>         for (DatanodeInfo datanodeInfo : dnReport) {
>             stats[i] = new DatanodeStats();
>             stats[i].name = datanodeInfo.getName();
>             stats[i].capacity = datanodeInfo.getCapacity();
>             stats[i].remaining = datanodeInfo.getRemaining();
>             i++;
>         }
>         return stats;
>     }
> 
> On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> 
>>barry.kaplan@gmail.com wrote:
>>
>>>I am going through my code and changing over from NDFS to DFS since I
>>>was using Nutch primarily for the DFS. I noticed that the protection
>>>of the DFSClient has changed along with things like DFSFile and
>>>DataNodeInfo. Was the reason solely to make the JavaDoc simpler?
>>
>>Improving the javadoc is nice, but is a side-effect of trying to
>>separate the public API from the implementation.  A smaller public API
>>makes it easier to evolve the implementation without breaking folks
>>(like you!).
>>
>>
>>>I was using those objects to have a jsp rendered status check, so
>>>having them available would be nice.
>>
>>What methods do you need from DFSClient that are not generically
>>available on FileSystem and/or DistributedFileSystem?
>>
>>Doug
>>

Re: Question about the "N"DFSClient

Posted by ba...@gmail.com.
I would like to be able to see the capacity, used, and effective bytes
of the DFS, as well as a DataNode breakdown listing name, capacity,
and remaining.

How would you feel about adding the following to DistributedFileSystem:

    public long getCapacity() throws IOException {
        return dfs.totalRawCapacity();
    }

    public long getUsed() throws IOException {
        return dfs.totalRawUsed();
    }

    public long getEffectiveBytes() throws IOException {
        long totalEffectiveBytes = 0;
        DFSFileInfo[] dfsFiles = dfs.listFiles(getPath(new File("/")));
        for (DFSFileInfo dfsFileInfo : dfsFiles) {
            totalEffectiveBytes += dfsFileInfo.getContentsLen();
        }
        return totalEffectiveBytes;
    }

    public DatanodeStats[] getDataNodeStats() throws IOException {
        DatanodeInfo[] dnReport = dfs.datanodeReport();
        DatanodeStats[] stats = new DatanodeStats[dnReport.length];
        int i = 0;

        for (DatanodeInfo datanodeInfo : dnReport) {
            stats[i] = new DatanodeStats();
            stats[i].name = datanodeInfo.getName();
            stats[i].capacity = datanodeInfo.getCapacity();
            stats[i].remaining = datanodeInfo.getRemaining();
            i++;
        }
        return stats;
    }
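One pitfall worth noting with the array-filling loop: in Java, new DatanodeStats[n] creates an array of null references, so each element has to be constructed before its fields are assigned. A self-contained illustration with a stand-in class (names are illustrative, not the actual Hadoop types):

```java
public class ArrayFillSketch {
    // Stand-in for the proposed DatanodeStats holder (public fields,
    // matching the style of the patch above).
    static class Stats {
        String name;
        long capacity;
    }

    static Stats[] build(String[] names, long[] capacities) {
        Stats[] stats = new Stats[names.length]; // all elements start as null
        for (int i = 0; i < names.length; i++) {
            stats[i] = new Stats(); // required; without it, the next line throws NullPointerException
            stats[i].name = names[i];
            stats[i].capacity = capacities[i];
        }
        return stats;
    }

    public static void main(String[] args) {
        Stats[] s = build(new String[] {"node1"}, new long[] {1024L});
        System.out.println(s[0].name + " " + s[0].capacity); // prints node1 1024
    }
}
```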

On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> barry.kaplan@gmail.com wrote:
> > I am going through my code and changing over from NDFS to DFS since I
> > was using Nutch primarily for the DFS. I noticed that the protection
> > of the DFSClient has changed along with things like DFSFile and
> > DataNodeInfo. Was the reason solely to make the JavaDoc simpler?
>
> Improving the javadoc is nice, but is a side-effect of trying to
> separate the public API from the implementation.  A smaller public API
> makes it easier to evolve the implementation without breaking folks
> (like you!).
>
> > I was using those objects to have a jsp rendered status check, so
> > having them available would be nice.
>
> What methods do you need from DFSClient that are not generically
> available on FileSystem and/or DistributedFileSystem?
>
> Doug
>

Re: Question about the "N"DFSClient

Posted by Doug Cutting <cu...@apache.org>.
barry.kaplan@gmail.com wrote:
> I am going through my code and changing over from NDFS to DFS since I
> was using Nutch primarily for the DFS. I noticed that the protection
> of the DFSClient has changed along with things like DFSFile and
> DataNodeInfo. Was the reason solely to make the JavaDoc simpler?

Improving the javadoc is nice, but is a side-effect of trying to 
separate the public API from the implementation.  A smaller public API 
makes it easier to evolve the implementation without breaking folks 
(like you!).

> I was using those objects to have a jsp rendered status check, so
> having them available would be nice.

What methods do you need from DFSClient that are not generically 
available on FileSystem and/or DistributedFileSystem?

Doug