Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/09/21 10:43:48 UTC

Any other way to copy to HDFS ?

Guys,

As far as I know Hadoop, to copy files to HDFS they first need
to be copied to the NameNode's local filesystem. Is that right?
So does it mean that even if I have a Hadoop cluster of 10 nodes with an
overall capacity of 6 TB, but my NameNode's hard disk capacity is 500 GB,
I cannot copy any file greater than 500 GB to HDFS?

Is there any other way to copy directly to HDFS without copying the file to
the NameNode's local filesystem?
What are other ways to copy files larger than the NameNode's disk
capacity?

Thanks,
Praveenesh.

Re: Fwd: Any other way to copy to HDFS ?

Posted by Harsh J <ha...@cloudera.com>.
Praveenesh,

It should be understood, as a takeaway from this, that HDFS is a set
of servers, much like webservers. You can send it a request, and you
can expect a response. It is also an FS in the sense that it is
designed to do FS-like operations (hold inodes, read/write data), but
primarily it behaves like any other server would when you want to
communicate with it.

When you load files into it, the mechanisms underneath are merely
opening a TCP socket connection to the server(s), writing packets
through, and closing it down when done. The same applies when reading
files out. Of course the details are much more complex than a
simple, single TCP connection, but that's how it works.

Hope this helps you understand your Hadoop better ;-)
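
To make that concrete, here is a minimal sketch (not from this thread; the NameNode address and paths are placeholders) of a client anywhere on the network streaming bytes straight into the cluster, with nothing ever landing on the NameNode's local disk:

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class StreamIntoHdfs {
    public static void main(String[] args) throws Exception {
      // Connect to the cluster over the network from any machine that can reach it.
      FileSystem fs = FileSystem.get(new URI("hdfs://namenode:9000/"), new Configuration());
      FSDataOutputStream out = fs.create(new Path("/user/hadoop/hello.txt"));
      out.write("hello hdfs".getBytes()); // bytes flow over TCP to the DataNodes
      out.close();
      fs.close();
    }
  }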

On Wed, Sep 21, 2011 at 4:29 PM, praveenesh kumar <pr...@gmail.com> wrote:
> Thanks a lot..!!
> I guess I can play around with the permissions of dfs for a while.
>
> On Wed, Sep 21, 2011 at 3:59 PM, Uma Maheswara Rao G 72686 <
> maheswara@huawei.com> wrote:
>
>> Hello Praveenesh,
>>
>> If you really need not care about permissions then you can disable it at NN
>> side by using the property dfs.permissions.enable
>>
>> You can the permission for the path before creating as well.
>>
>> from docs:
>> Changes to the File System API
>> All methods that use a path parameter will throw AccessControlException if
>> permission checking fails.
>>
>> New methods:
>>
>> public FSDataOutputStream create(Path f, FsPermission permission, boolean
>> overwrite, int bufferSize, short replication, long blockSize, Progressable
>> progress) throws IOException;
>> public boolean mkdirs(Path f, FsPermission permission) throws IOException;
>> public void setPermission(Path p, FsPermission permission) throws
>> IOException;
>> public void setOwner(Path p, String username, String groupname) throws
>> IOException;
>> public FileStatus getFileStatus(Path f) throws IOException; will
>> additionally return the user, group and mode associated with the path.
>>
>>
>> http://hadoop.apache.org/common/docs/r0.20.2/hdfs_permissions_guide.html
>>
>>
>> Regards,
>> Uma
>> ----- Original Message -----
>> From: praveenesh kumar <pr...@gmail.com>
>> Date: Wednesday, September 21, 2011 3:41 pm
>> Subject: Fwd: Any other way to copy to HDFS ?
>> To: common-user@hadoop.apache.org
>>
>> > Thanks a lot. I am trying to run the following code on my windows
>> > machinethat is not part of cluster.
>>  > **
>> > *public* *static* *void* main(String args[]) *throws* IOException,
>> > URISyntaxException
>> >
>> > {
>> >
>> > FileSystem fs =*new* DistributedFileSystem();
>> >
>> > fs.initialize(*new* URI("hdfs://162.192.100.53:54310/"),
>> > *new*Configuration());
>> > fs.copyFromLocalFile(*new* Path("C:\\Positive.txt"),*new* Path(
>> > "/user/hadoop/Positive.txt"));
>> >
>> > System.*out*.println("Done");
>> >
>> > }
>> >
>> > But I am getting the following exception :
>> >
>> > Exception in thread "main"
>> > org.apache.hadoop.security.AccessControlException:
>> > org.apache.hadoop.security.AccessControlException: Permission denied:
>> > user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
>> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> > Method) at
>> >
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>> > at
>> >
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>> > at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>> > at
>> >
>> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
>> > at
>> >
>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
>> > at
>> >
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2836)
>> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
>> > at
>> >
>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
>> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
>> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
>> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
>> > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
>> > at
>> > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1189)
>> at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1165)
>> > at
>> > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1137)
>> at com.musigma.hdfs.HdfsBackup.main(HdfsBackup.java:20)
>> > Caused by: org.apache.hadoop.ipc.RemoteException:
>> > org.apache.hadoop.security.AccessControlException: Permission denied:
>> > user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4702)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4672)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1048)
>> > at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
>> > at
>> > org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
>> > at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
>> > at
>> >
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:616)
>> > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
>> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
>> > at java.security.AccessController.doPrivileged(Native Method)
>> > at javax.security.auth.Subject.doAs(Subject.java:416)
>> > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
>> > at org.apache.hadoop.ipc.Client.call(Client.java:740)
>> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>> > at $Proxy0.create(Unknown Source)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at
>> >
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> > at
>> >
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> > at java.lang.reflect.Method.invoke(Method.java:597)
>> > at
>> >
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>> > at
>> >
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>> > at $Proxy0.create(Unknown Source)
>> > at
>> >
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2833)
>> > ... 10 more
>> > As far as I know, the exception is coming because some other user
>> > is trying
>> > to access HDFS than my hadoop user.
>> > Does it mean I have to change permission ?
>> > or is there any other way to do it from java code ?
>> >
>> > Thanks,
>> > Praveenesh
>> > ---------- Forwarded message ----------
>> > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
>> > Date: Wed, Sep 21, 2011 at 3:27 PM
>> > Subject: Re: Any other way to copy to HDFS ?
>> > To: common-user@hadoop.apache.org
>> >
>> >
>> > When you start the NameNode in Linux Machine, it will listen on one
>> > address.You can configure that address in NameNode by using
>> > fs.default.name.
>> > From the clients, you can give this address to connect to your
>> > NameNode.
>> > initialize API will take URI and configuration.
>> >
>> > Assume if your NameNode is running on hdfs://10.18.52.63:9000
>> >
>> > Then you can caonnect to your NameNode like below.
>> >
>> > FileSystem fs =new DistributedFileSystem();
>> > fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new
>> > Configuration());
>> > Please go through the below mentioned docs, you will more
>> > understanding.
>> > >if I want to
>> > > copy data from windows machine to namenode machine ?
>> > In DFS namenode will be responsible for only nameSpace.
>> >
>> > in simple words to understand quickly the flow:
>> >  Clients will ask NameNode to give some DNs to copy the data.
>> > Then NN will
>> > create file entry in NameSpace and also will return the block
>> > entries based
>> > on client request. Then clients directly will connect to the DNs
>> > and copy
>> > the data.
>> > Reading data back also will the sameway.
>> >
>> > I hope you will understand better now :-)
>> >
>> >
>> > Regards,
>> > Uma
>> >
>> > ----- Original Message -----
>> > From: praveenesh kumar <pr...@gmail.com>
>> > Date: Wednesday, September 21, 2011 3:11 pm
>> > Subject: Re: Any other way to copy to HDFS ?
>> > To: common-user@hadoop.apache.org
>> >
>> > > So I want to copy the file from windows machine to linux namenode.
>> > > How can I define NAMENODE_URI in the code you mention, if I want to
>> > > copy data from windows machine to namenode machine ?
>> > >
>> > > Thanks,
>> > > Praveenesh
>> > >
>> > > On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
>> > > maheswara@huawei.com> wrote:
>> > >
>> > > > For more understanding the flows, i would recommend you to go
>> > > through once
>> > > > below docs
>> > > >
>> > > >
>> > >
>> >
>> http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace
>> > >
>> > > > Regards,
>> > > > Uma
>> > > >
>> > > > ----- Original Message -----
>> > > > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
>> > > > Date: Wednesday, September 21, 2011 2:36 pm
>> > > > Subject: Re: Any other way to copy to HDFS ?
>> > > > To: common-user@hadoop.apache.org
>> > > >
>> > > > >
>> > > > > Hi,
>> > > > >
>> > > > > You need not copy the files to NameNode.
>> > > > >
>> > > > > Hadoop provide Client code as well to copy the files.
>> > > > > To copy the files from other node ( non dfs), you need to
>> > put the
>> > > > > hadoop**.jar's into classpath and use the below code snippet.
>> > > > >
>> > > > > FileSystem fs =new DistributedFileSystem();
>> > > > > fs.initialize("NAMENODE_URI", configuration);
>> > > > >
>> > > > > fs.copyFromLocal(srcPath, dstPath);
>> > > > >
>> > > > > using this API, you can copy the files from any machine.
>> > > > >
>> > > > > Regards,
>> > > > > Uma
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > ----- Original Message -----
>> > > > > From: praveenesh kumar <pr...@gmail.com>
>> > > > > Date: Wednesday, September 21, 2011 2:14 pm
>> > > > > Subject: Any other way to copy to HDFS ?
>> > > > > To: common-user@hadoop.apache.org
>> > > > >
>> > > > > > Guys,
>> > > > > >
>> > > > > > As far as I know hadoop, I think, to copy the files to HDFS,
>> > > > > first
>> > > > > > it needs
>> > > > > > to be copied to the NameNode's local filesystem. Is it
>> > right ??
>> > > > > > So does it mean that even if I have a hadoop cluster of 10
>> > nodes> > > with> overall capacity of 6TB, but if my NameNode's
>> > hard disk
>> > > > > capacity
>> > > > > > is 500 GB,
>> > > > > > I can not copy any file to HDFS greater than 500 GB ?
>> > > > > >
>> > > > > > Is there any other way to directly copy to HDFS without
>> > copy the
>> > > > > > file to
>> > > > > > namenode's local filesystem ?
>> > > > > > What can be other ways to copy large files greater than
>> > > > > namenode's
>> > > > > > diskcapacity ?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Praveenesh.
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>



-- 
Harsh J

Re: Fwd: Any other way to copy to HDFS ?

Posted by praveenesh kumar <pr...@gmail.com>.
Thanks a lot..!!
I guess I can play around with the permissions of dfs for a while.

On Wed, Sep 21, 2011 at 3:59 PM, Uma Maheswara Rao G 72686 <
maheswara@huawei.com> wrote:

> Hello Praveenesh,
>
> If you really need not care about permissions then you can disable it at NN
> side by using the property dfs.permissions.enable
>
> You can the permission for the path before creating as well.
>
> from docs:
> Changes to the File System API
> All methods that use a path parameter will throw AccessControlException if
> permission checking fails.
>
> New methods:
>
> public FSDataOutputStream create(Path f, FsPermission permission, boolean
> overwrite, int bufferSize, short replication, long blockSize, Progressable
> progress) throws IOException;
> public boolean mkdirs(Path f, FsPermission permission) throws IOException;
> public void setPermission(Path p, FsPermission permission) throws
> IOException;
> public void setOwner(Path p, String username, String groupname) throws
> IOException;
> public FileStatus getFileStatus(Path f) throws IOException; will
> additionally return the user, group and mode associated with the path.
>
>
> http://hadoop.apache.org/common/docs/r0.20.2/hdfs_permissions_guide.html
>
>
> Regards,
> Uma
> ----- Original Message -----
> From: praveenesh kumar <pr...@gmail.com>
> Date: Wednesday, September 21, 2011 3:41 pm
> Subject: Fwd: Any other way to copy to HDFS ?
> To: common-user@hadoop.apache.org
>
> > Thanks a lot. I am trying to run the following code on my windows
> > machinethat is not part of cluster.
>  > **
> > *public* *static* *void* main(String args[]) *throws* IOException,
> > URISyntaxException
> >
> > {
> >
> > FileSystem fs =*new* DistributedFileSystem();
> >
> > fs.initialize(*new* URI("hdfs://162.192.100.53:54310/"),
> > *new*Configuration());
> > fs.copyFromLocalFile(*new* Path("C:\\Positive.txt"),*new* Path(
> > "/user/hadoop/Positive.txt"));
> >
> > System.*out*.println("Done");
> >
> > }
> >
> > But I am getting the following exception :
> >
> > Exception in thread "main"
> > org.apache.hadoop.security.AccessControlException:
> > org.apache.hadoop.security.AccessControlException: Permission denied:
> > user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> > Method) at
> >
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > at
> >
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > at
> >
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
> > at
> >
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
> > at
> >
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2836)
> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
> > at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
> > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
> > at
> > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1189)
> at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1165)
> > at
> > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1137)
> at com.musigma.hdfs.HdfsBackup.main(HdfsBackup.java:20)
> > Caused by: org.apache.hadoop.ipc.RemoteException:
> > org.apache.hadoop.security.AccessControlException: Permission denied:
> > user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4702)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4672)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1048)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
> > at
> > org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
> > at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:616)
> > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:416)
> > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
> > at org.apache.hadoop.ipc.Client.call(Client.java:740)
> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> > at $Proxy0.create(Unknown Source)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> > at $Proxy0.create(Unknown Source)
> > at
> >
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2833)
> > ... 10 more
> > As far as I know, the exception is coming because some other user
> > is trying
> > to access HDFS than my hadoop user.
> > Does it mean I have to change permission ?
> > or is there any other way to do it from java code ?
> >
> > Thanks,
> > Praveenesh
> > ---------- Forwarded message ----------
> > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> > Date: Wed, Sep 21, 2011 at 3:27 PM
> > Subject: Re: Any other way to copy to HDFS ?
> > To: common-user@hadoop.apache.org
> >
> >
> > When you start the NameNode in Linux Machine, it will listen on one
> > address.You can configure that address in NameNode by using
> > fs.default.name.
> > From the clients, you can give this address to connect to your
> > NameNode.
> > initialize API will take URI and configuration.
> >
> > Assume if your NameNode is running on hdfs://10.18.52.63:9000
> >
> > Then you can caonnect to your NameNode like below.
> >
> > FileSystem fs =new DistributedFileSystem();
> > fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new
> > Configuration());
> > Please go through the below mentioned docs, you will more
> > understanding.
> > >if I want to
> > > copy data from windows machine to namenode machine ?
> > In DFS namenode will be responsible for only nameSpace.
> >
> > in simple words to understand quickly the flow:
> >  Clients will ask NameNode to give some DNs to copy the data.
> > Then NN will
> > create file entry in NameSpace and also will return the block
> > entries based
> > on client request. Then clients directly will connect to the DNs
> > and copy
> > the data.
> > Reading data back also will the sameway.
> >
> > I hope you will understand better now :-)
> >
> >
> > Regards,
> > Uma
> >
> > ----- Original Message -----
> > From: praveenesh kumar <pr...@gmail.com>
> > Date: Wednesday, September 21, 2011 3:11 pm
> > Subject: Re: Any other way to copy to HDFS ?
> > To: common-user@hadoop.apache.org
> >
> > > So I want to copy the file from windows machine to linux namenode.
> > > How can I define NAMENODE_URI in the code you mention, if I want to
> > > copy data from windows machine to namenode machine ?
> > >
> > > Thanks,
> > > Praveenesh
> > >
> > > On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
> > > maheswara@huawei.com> wrote:
> > >
> > > > For more understanding the flows, i would recommend you to go
> > > through once
> > > > below docs
> > > >
> > > >
> > >
> >
> http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace
> > >
> > > > Regards,
> > > > Uma
> > > >
> > > > ----- Original Message -----
> > > > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> > > > Date: Wednesday, September 21, 2011 2:36 pm
> > > > Subject: Re: Any other way to copy to HDFS ?
> > > > To: common-user@hadoop.apache.org
> > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > > You need not copy the files to NameNode.
> > > > >
> > > > > Hadoop provide Client code as well to copy the files.
> > > > > To copy the files from other node ( non dfs), you need to
> > put the
> > > > > hadoop**.jar's into classpath and use the below code snippet.
> > > > >
> > > > > FileSystem fs =new DistributedFileSystem();
> > > > > fs.initialize("NAMENODE_URI", configuration);
> > > > >
> > > > > fs.copyFromLocal(srcPath, dstPath);
> > > > >
> > > > > using this API, you can copy the files from any machine.
> > > > >
> > > > > Regards,
> > > > > Uma
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: praveenesh kumar <pr...@gmail.com>
> > > > > Date: Wednesday, September 21, 2011 2:14 pm
> > > > > Subject: Any other way to copy to HDFS ?
> > > > > To: common-user@hadoop.apache.org
> > > > >
> > > > > > Guys,
> > > > > >
> > > > > > As far as I know hadoop, I think, to copy the files to HDFS,
> > > > > first
> > > > > > it needs
> > > > > > to be copied to the NameNode's local filesystem. Is it
> > right ??
> > > > > > So does it mean that even if I have a hadoop cluster of 10
> > nodes> > > with> overall capacity of 6TB, but if my NameNode's
> > hard disk
> > > > > capacity
> > > > > > is 500 GB,
> > > > > > I can not copy any file to HDFS greater than 500 GB ?
> > > > > >
> > > > > > Is there any other way to directly copy to HDFS without
> > copy the
> > > > > > file to
> > > > > > namenode's local filesystem ?
> > > > > > What can be other ways to copy large files greater than
> > > > > namenode's
> > > > > > diskcapacity ?
> > > > > >
> > > > > > Thanks,
> > > > > > Praveenesh.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Fwd: Any other way to copy to HDFS ?

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Hello Praveenesh,

If you really don't care about permissions, you can disable them on the NN side by setting the property dfs.permissions to false (see the permissions guide linked below).

You can also set the permission for the path before creating it.

from docs:
Changes to the File System API
All methods that use a path parameter will throw AccessControlException if permission checking fails. 

New methods:

public FSDataOutputStream create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException; 
public boolean mkdirs(Path f, FsPermission permission) throws IOException; 
public void setPermission(Path p, FsPermission permission) throws IOException; 
public void setOwner(Path p, String username, String groupname) throws IOException; 
public FileStatus getFileStatus(Path f) throws IOException; will additionally return the user, group and mode associated with the path. 


http://hadoop.apache.org/common/docs/r0.20.2/hdfs_permissions_guide.html
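
For example, a minimal sketch of the permission-aware calls listed above (the path, user, and mode here are assumptions for illustration; run it as the HDFS superuser, e.g. the hadoop user):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.permission.FsPermission;

  public class PermissionExample {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new URI("hdfs://namenode:9000/"), new Configuration());
      Path dir = new Path("/user/drwho");              // hypothetical target directory
      fs.mkdirs(dir, new FsPermission((short) 0755));  // create it with an explicit mode
      fs.setOwner(dir, "drwho", "supergroup");         // hand it to the user who will write there
      System.out.println(fs.getFileStatus(dir).getPermission());
    }
  }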


Regards,
Uma
----- Original Message -----
From: praveenesh kumar <pr...@gmail.com>
Date: Wednesday, September 21, 2011 3:41 pm
Subject: Fwd: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org

> Thanks a lot. I am trying to run the following code on my windows 
> machinethat is not part of cluster.
> **
> *public* *static* *void* main(String args[]) *throws* IOException,
> URISyntaxException
> 
> {
> 
> FileSystem fs =*new* DistributedFileSystem();
> 
> fs.initialize(*new* URI("hdfs://162.192.100.53:54310/"), 
> *new*Configuration());
> fs.copyFromLocalFile(*new* Path("C:\\Positive.txt"),*new* Path(
> "/user/hadoop/Positive.txt"));
> 
> System.*out*.println("Done");
> 
> }
> 
> But I am getting the following exception :
> 
> Exception in thread "main"
> org.apache.hadoop.security.AccessControlException:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
> at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2836)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
> at 
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1189) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1165)
> at 
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1137) at com.musigma.hdfs.HdfsBackup.main(HdfsBackup.java:20)
> Caused by: org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
> at
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
> at
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
> at
> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4702)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4672)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1048)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
> at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
> at org.apache.hadoop.ipc.Client.call(Client.java:740)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> at $Proxy0.create(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at $Proxy0.create(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2833)
> ... 10 more
> As far as I know, the exception is coming because some other user 
> is trying
> to access HDFS than my hadoop user.
> Does it mean I have to change permission ?
> or is there any other way to do it from java code ?
> 
> Thanks,
> Praveenesh
> ---------- Forwarded message ----------
> From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> Date: Wed, Sep 21, 2011 at 3:27 PM
> Subject: Re: Any other way to copy to HDFS ?
> To: common-user@hadoop.apache.org
> 
> 
> When you start the NameNode in Linux Machine, it will listen on one
> address.You can configure that address in NameNode by using 
> fs.default.name.
> From the clients, you can give this address to connect to your 
> NameNode.
> initialize API will take URI and configuration.
> 
> Assume if your NameNode is running on hdfs://10.18.52.63:9000
> 
> Then you can caonnect to your NameNode like below.
> 
> FileSystem fs =new DistributedFileSystem();
> fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new 
> Configuration());
> Please go through the below mentioned docs, you will more 
> understanding.
> >if I want to
> > copy data from windows machine to namenode machine ?
> In DFS namenode will be responsible for only nameSpace.
> 
> in simple words to understand quickly the flow:
>  Clients will ask NameNode to give some DNs to copy the data. 
> Then NN will
> create file entry in NameSpace and also will return the block 
> entries based
> on client request. Then clients directly will connect to the DNs 
> and copy
> the data.
> Reading data back also will the sameway.
> 
> I hope you will understand better now :-)
> 
> 
> Regards,
> Uma
> 
> ----- Original Message -----
> From: praveenesh kumar <pr...@gmail.com>
> Date: Wednesday, September 21, 2011 3:11 pm
> Subject: Re: Any other way to copy to HDFS ?
> To: common-user@hadoop.apache.org
> 
> > So I want to copy the file from windows machine to linux namenode.
> > How can I define NAMENODE_URI in the code you mention, if I want to
> > copy data from windows machine to namenode machine ?
> >
> > Thanks,
> > Praveenesh
> >
> > On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
> > maheswara@huawei.com> wrote:
> >
> > > For more understanding the flows, i would recommend you to go
> > through once
> > > below docs
> > >
> > >
> >
> http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace
> >
> > > Regards,
> > > Uma
> > >
> > > ----- Original Message -----
> > > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> > > Date: Wednesday, September 21, 2011 2:36 pm
> > > Subject: Re: Any other way to copy to HDFS ?
> > > To: common-user@hadoop.apache.org
> > >
> > > >
> > > > Hi,
> > > >
> > > > You need not copy the files to NameNode.
> > > >
> > > > Hadoop provide Client code as well to copy the files.
> > > > To copy the files from other node ( non dfs), you need to 
> put the
> > > > hadoop**.jar's into classpath and use the below code snippet.
> > > >
> > > > FileSystem fs =new DistributedFileSystem();
> > > > fs.initialize("NAMENODE_URI", configuration);
> > > >
> > > > fs.copyFromLocal(srcPath, dstPath);
> > > >
> > > > using this API, you can copy the files from any machine.
> > > >
> > > > Regards,
> > > > Uma
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: praveenesh kumar <pr...@gmail.com>
> > > > Date: Wednesday, September 21, 2011 2:14 pm
> > > > Subject: Any other way to copy to HDFS ?
> > > > To: common-user@hadoop.apache.org
> > > >
> > > > > Guys,
> > > > >
> > > > > As far as I know hadoop, I think, to copy the files to HDFS,
> > > > first
> > > > > it needs
> > > > > to be copied to the NameNode's local filesystem. Is it 
> right ??
> > > > > So does it mean that even if I have a hadoop cluster of 10 
> nodes> > > with> overall capacity of 6TB, but if my NameNode's 
> hard disk
> > > > capacity
> > > > > is 500 GB,
> > > > > I can not copy any file to HDFS greater than 500 GB ?
> > > > >
> > > > > Is there any other way to directly copy to HDFS without 
> copy the
> > > > > file to
> > > > > namenode's local filesystem ?
> > > > > What can be other ways to copy large files greater than
> > > > namenode's
> > > > > diskcapacity ?
> > > > >
> > > > > Thanks,
> > > > > Praveenesh.
> > > > >
> > > >
> > >
> >
> 

Fwd: Any other way to copy to HDFS ?

Posted by praveenesh kumar <pr...@gmail.com>.
Thanks a lot. I am trying to run the following code on my Windows machine
that is not part of the cluster.
public static void main(String args[]) throws IOException, URISyntaxException
{
    FileSystem fs = new DistributedFileSystem();

    fs.initialize(new URI("hdfs://162.192.100.53:54310/"), new Configuration());

    fs.copyFromLocalFile(new Path("C:\\Positive.txt"), new Path("/user/hadoop/Positive.txt"));

    System.out.println("Done");
}

But I am getting the following exception :

Exception in thread "main"
org.apache.hadoop.security.AccessControlException:
org.apache.hadoop.security.AccessControlException: Permission denied:
user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
 at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2836)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
 at
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
 at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1189)
 at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1165)
 at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1137)
 at com.musigma.hdfs.HdfsBackup.main(HdfsBackup.java:20)
Caused by: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.security.AccessControlException: Permission denied:
user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
 at
org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
 at
org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
 at
org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4702)
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4672)
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1048)
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
 at
org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
 at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:416)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
 at org.apache.hadoop.ipc.Client.call(Client.java:740)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 at $Proxy0.create(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy0.create(Unknown Source)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2833)
 ... 10 more
As far as I know, the exception is coming because a user other than my hadoop user is trying
to access HDFS.
Does it mean I have to change the permissions,
or is there any other way to do it from Java code?
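
For what it is worth, one common workaround (a sketch only, assuming you can run code or the hadoop shell as the hadoop superuser, e.g. hadoop fs -chmod) is to loosen the permissions on the target directory before writing from the Windows client:

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.permission.FsPermission;

  // Run as the "hadoop" user, which owns /user/hadoop; afterwards the
  // Windows-side user ("DrWho") is allowed to write into that directory.
  public class OpenUpTargetDir {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new URI("hdfs://162.192.100.53:54310/"), new Configuration());
      fs.setPermission(new Path("/user/hadoop"), new FsPermission((short) 0777));
      fs.close();
    }
  }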

Thanks,
Praveenesh
---------- Forwarded message ----------
From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
Date: Wed, Sep 21, 2011 at 3:27 PM
Subject: Re: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org


When you start the NameNode in Linux Machine, it will listen on one
address.You can configure that address in NameNode by using fs.default.name.

>From the clients, you can give this address to connect to your NameNode.

initialize API will take URI and configuration.

 Assume if your NameNode is running on hdfs://10.18.52.63:9000

Then you can caonnect to your NameNode like below.

FileSystem fs =new DistributedFileSystem();
fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new Configuration());

Please go through the below mentioned docs, you will more understanding.

>if I want to
> copy data from windows machine to namenode machine ?
 In DFS namenode will be responsible for only nameSpace.

 in simple words to understand quickly the flow:
  Clients will ask NameNode to give some DNs to copy the data. Then NN will
create file entry in NameSpace and also will return the block entries based
on client request. Then clients directly will connect to the DNs and copy
the data.
Reading data back also will the sameway.

I hope you will understand better now :-)


Regards,
Uma

----- Original Message -----
From: praveenesh kumar <pr...@gmail.com>
 Date: Wednesday, September 21, 2011 3:11 pm
Subject: Re: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org

> So I want to copy the file from windows machine to linux namenode.
> How can I define NAMENODE_URI in the code you mention, if I want to
> copy data from windows machine to namenode machine ?
>
> Thanks,
> Praveenesh
>
> On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
> maheswara@huawei.com> wrote:
>
> > For more understanding the flows, i would recommend you to go
> through once
> > below docs
> >
> >
>
http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace
>
> > Regards,
> > Uma
> >
> > ----- Original Message -----
> > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> > Date: Wednesday, September 21, 2011 2:36 pm
> > Subject: Re: Any other way to copy to HDFS ?
> > To: common-user@hadoop.apache.org
> >
> > >
> > > Hi,
> > >
> > > You need not copy the files to NameNode.
> > >
> > > Hadoop provide Client code as well to copy the files.
> > > To copy the files from other node ( non dfs), you need to put the
> > > hadoop**.jar's into classpath and use the below code snippet.
> > >
> > > FileSystem fs =new DistributedFileSystem();
> > > fs.initialize("NAMENODE_URI", configuration);
> > >
> > > fs.copyFromLocal(srcPath, dstPath);
> > >
> > > using this API, you can copy the files from any machine.
> > >
> > > Regards,
> > > Uma
> > >
> > >
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: praveenesh kumar <pr...@gmail.com>
> > > Date: Wednesday, September 21, 2011 2:14 pm
> > > Subject: Any other way to copy to HDFS ?
> > > To: common-user@hadoop.apache.org
> > >
> > > > Guys,
> > > >
> > > > As far as I know hadoop, I think, to copy the files to HDFS,
> > > first
> > > > it needs
> > > > to be copied to the NameNode's local filesystem. Is it right ??
> > > > So does it mean that even if I have a hadoop cluster of 10 nodes
> > > with> overall capacity of 6TB, but if my NameNode's hard disk
> > > capacity
> > > > is 500 GB,
> > > > I can not copy any file to HDFS greater than 500 GB ?
> > > >
> > > > Is there any other way to directly copy to HDFS without copy the
> > > > file to
> > > > namenode's local filesystem ?
> > > > What can be other ways to copy large files greater than
> > > namenode's
> > > > diskcapacity ?
> > > >
> > > > Thanks,
> > > > Praveenesh.
> > > >
> > >
> >
>

Re: Any other way to copy to HDFS ?

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
When you start the NameNode on the Linux machine, it will listen on one address. You can configure that address in the NameNode by using fs.default.name.

From the clients, you can use this address to connect to your NameNode.

The initialize API takes a URI and a Configuration.

Assume your NameNode is running on hdfs://10.18.52.63:9000.

Then you can connect to your NameNode like below.

FileSystem fs =new DistributedFileSystem();
fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new Configuration());
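
For reference, a self-contained version of that snippet with the imports it needs (the local and HDFS paths below are placeholders):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hdfs.DistributedFileSystem;

  public class CopyToHdfs {
    public static void main(String[] args) throws Exception {
      FileSystem fs = new DistributedFileSystem();
      fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new Configuration());
      // The file data goes straight from this client to the DataNodes.
      fs.copyFromLocalFile(new Path("/tmp/data.txt"), new Path("/user/hadoop/data.txt"));
      fs.close();
    }
  }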

Please go through the docs mentioned below; you will understand more.

>if I want to
> copy data from windows machine to namenode machine ?
In DFS, the NameNode is responsible only for the namespace.

In simple words, to understand the flow quickly: clients ask the NameNode for some DNs to copy the data to. The NN creates the file entry in the namespace and returns the block entries based on the client's request. Then the clients connect directly to the DNs and copy the data.
Reading data back works the same way.

I hope you will understand better now :-)


Regards,
Uma

----- Original Message -----
From: praveenesh kumar <pr...@gmail.com>
Date: Wednesday, September 21, 2011 3:11 pm
Subject: Re: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org

> So I want to copy the file from windows machine to linux namenode.
> How can I define NAMENODE_URI in the code you mention, if I want to
> copy data from windows machine to namenode machine ?
> 
> Thanks,
> Praveenesh
> 
> On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
> maheswara@huawei.com> wrote:
> 
> > For more understanding the flows, i would recommend you to go 
> through once
> > below docs
> >
> > 
> http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace>
> > Regards,
> > Uma
> >
> > ----- Original Message -----
> > From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> > Date: Wednesday, September 21, 2011 2:36 pm
> > Subject: Re: Any other way to copy to HDFS ?
> > To: common-user@hadoop.apache.org
> >
> > >
> > > Hi,
> > >
> > > You need not copy the files to NameNode.
> > >
> > > Hadoop provide Client code as well to copy the files.
> > > To copy the files from other node ( non dfs), you need to put the
> > > hadoop**.jar's into classpath and use the below code snippet.
> > >
> > > FileSystem fs =new DistributedFileSystem();
> > > fs.initialize("NAMENODE_URI", configuration);
> > >
> > > fs.copyFromLocal(srcPath, dstPath);
> > >
> > > using this API, you can copy the files from any machine.
> > >
> > > Regards,
> > > Uma
> > >
> > >
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: praveenesh kumar <pr...@gmail.com>
> > > Date: Wednesday, September 21, 2011 2:14 pm
> > > Subject: Any other way to copy to HDFS ?
> > > To: common-user@hadoop.apache.org
> > >
> > > > Guys,
> > > >
> > > > As far as I know hadoop, I think, to copy the files to HDFS,
> > > first
> > > > it needs
> > > > to be copied to the NameNode's local filesystem. Is it right ??
> > > > So does it mean that even if I have a hadoop cluster of 10 nodes
> > > with> overall capacity of 6TB, but if my NameNode's hard disk
> > > capacity
> > > > is 500 GB,
> > > > I can not copy any file to HDFS greater than 500 GB ?
> > > >
> > > > Is there any other way to directly copy to HDFS without copy the
> > > > file to
> > > > namenode's local filesystem ?
> > > > What can be other ways to copy large files greater than
> > > namenode's
> > > > diskcapacity ?
> > > >
> > > > Thanks,
> > > > Praveenesh.
> > > >
> > >
> >
> 

Re: Any other way to copy to HDFS ?

Posted by praveenesh kumar <pr...@gmail.com>.
So I want to copy the file from a Windows machine to the Linux NameNode.
How can I define NAMENODE_URI in the code you mention, if I want to
copy data from the Windows machine to the NameNode machine?

Thanks,
Praveenesh

On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 <
maheswara@huawei.com> wrote:

> For more understanding the flows, i would recommend you to go through once
> below docs
>
> http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace
>
> Regards,
> Uma
>
> ----- Original Message -----
> From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
> Date: Wednesday, September 21, 2011 2:36 pm
> Subject: Re: Any other way to copy to HDFS ?
> To: common-user@hadoop.apache.org
>
> >
> > Hi,
> >
> > You need not copy the files to NameNode.
> >
> > Hadoop provide Client code as well to copy the files.
> > To copy the files from other node ( non dfs), you need to put the
> > hadoop**.jar's into classpath and use the below code snippet.
> >
> > FileSystem fs =new DistributedFileSystem();
> > fs.initialize("NAMENODE_URI", configuration);
> >
> > fs.copyFromLocal(srcPath, dstPath);
> >
> > using this API, you can copy the files from any machine.
> >
> > Regards,
> > Uma
> >
> >
> >
> >
> >
> > ----- Original Message -----
> > From: praveenesh kumar <pr...@gmail.com>
> > Date: Wednesday, September 21, 2011 2:14 pm
> > Subject: Any other way to copy to HDFS ?
> > To: common-user@hadoop.apache.org
> >
> > > Guys,
> > >
> > > As far as I know hadoop, I think, to copy the files to HDFS,
> > first
> > > it needs
> > > to be copied to the NameNode's local filesystem. Is it right ??
> > > So does it mean that even if I have a hadoop cluster of 10 nodes
> > with> overall capacity of 6TB, but if my NameNode's hard disk
> > capacity
> > > is 500 GB,
> > > I can not copy any file to HDFS greater than 500 GB ?
> > >
> > > Is there any other way to directly copy to HDFS without copy the
> > > file to
> > > namenode's local filesystem ?
> > > What can be other ways to copy large files greater than
> > namenode's
> > > diskcapacity ?
> > >
> > > Thanks,
> > > Praveenesh.
> > >
> >
>

Re: Any other way to copy to HDFS ?

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
For a better understanding of the flows, I would recommend you go through the docs below once:
http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace

Regards,
Uma

----- Original Message -----
From: Uma Maheswara Rao G 72686 <ma...@huawei.com>
Date: Wednesday, September 21, 2011 2:36 pm
Subject: Re: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org

> 
> Hi,
> 
> You need not copy the files to NameNode.
> 
> Hadoop provide Client code as well to copy the files.
> To copy the files from other node ( non dfs), you need to put the 
> hadoop**.jar's into classpath and use the below code snippet.
> 
> FileSystem fs =new DistributedFileSystem();
> fs.initialize("NAMENODE_URI", configuration);
> 
> fs.copyFromLocal(srcPath, dstPath);
> 
> using this API, you can copy the files from any machine. 
> 
> Regards,
> Uma
> 
> 
> 
> 
> 
> ----- Original Message -----
> From: praveenesh kumar <pr...@gmail.com>
> Date: Wednesday, September 21, 2011 2:14 pm
> Subject: Any other way to copy to HDFS ?
> To: common-user@hadoop.apache.org
> 
> > Guys,
> > 
> > As far as I know hadoop, I think, to copy the files to HDFS, 
> first 
> > it needs
> > to be copied to the NameNode's local filesystem. Is it right ??
> > So does it mean that even if I have a hadoop cluster of 10 nodes 
> with> overall capacity of 6TB, but if my NameNode's hard disk 
> capacity 
> > is 500 GB,
> > I can not copy any file to HDFS greater than 500 GB ?
> > 
> > Is there any other way to directly copy to HDFS without copy the 
> > file to
> > namenode's local filesystem ?
> > What can be other ways to copy large files greater than 
> namenode's 
> > diskcapacity ?
> > 
> > Thanks,
> > Praveenesh.
> > 
> 

Re: Any other way to copy to HDFS ?

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Hi,

You need not copy the files to the NameNode.

Hadoop provides client code as well to copy the files.
To copy the files from another node (non-DFS), you need to put the hadoop*.jar files on the classpath and use the code snippet below.

 FileSystem fs = new DistributedFileSystem();
 fs.initialize(new URI("NAMENODE_URI"), configuration);

 fs.copyFromLocalFile(srcPath, dstPath);

 Using this API, you can copy the files from any machine.

Regards,
Uma
 




----- Original Message -----
From: praveenesh kumar <pr...@gmail.com>
Date: Wednesday, September 21, 2011 2:14 pm
Subject: Any other way to copy to HDFS ?
To: common-user@hadoop.apache.org

> Guys,
> 
> As far as I know hadoop, I think, to copy the files to HDFS, first 
> it needs
> to be copied to the NameNode's local filesystem. Is it right ??
> So does it mean that even if I have a hadoop cluster of 10 nodes with
> overall capacity of 6TB, but if my NameNode's hard disk capacity 
> is 500 GB,
> I can not copy any file to HDFS greater than 500 GB ?
> 
> Is there any other way to directly copy to HDFS without copy the 
> file to
> namenode's local filesystem ?
> What can be other ways to copy large files greater than namenode's 
> diskcapacity ?
> 
> Thanks,
> Praveenesh.
>