Posted to common-user@hadoop.apache.org by Sugandha Naolekar <su...@gmail.com> on 2009/06/11 16:01:36 UTC

Code not working..!

Hello!

I am trying to transfer data from a remote node's filesystem to HDFS. But,
somehow, it's not working!

***********************************************************************
I have a 7-node cluster. Its config file (hadoop-site.xml) is as follows:

<property>
  <name>fs.default.name</name>
  <value>hdfs://nikhilname:50130</value>
</property>

<property>
  <name>dfs.http.address</name>
  <value>nikhilname:50070</value>
</property>

To keep this from getting too lengthy, I am sending you just the important tags. So
here, nikhilname is the namenode. I have specified its IP in /etc/hosts.

************************************************************************



**************************************************************************
Then, here is the 8th machine (the client, or remote node), which has this config file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>fs.default.name</name>
          <value>hdfs://nikhilname:50130</value>
    </property>

    <property>
          <name>dfs.http.address</name>
              <value>nikhilname:50070</value>
    </property>

</configuration>

Here, I have pointed fs.default.name to the namenode.
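
As a quick sanity check (just a sketch on my side; it only assumes this hadoop-site.xml is on the client's classpath), something like the following, run on the 8th machine, should print the HDFS URI from fs.default.name rather than a local file: URI:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckHdfs
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();   // picks up hadoop-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);       // should resolve to hdfs://nikhilname:50130
        System.out.println("Default FS: " + fs.getUri());
        System.out.println("Root exists: " + fs.exists(new Path("/")));
    }
}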

**********************************************************


**********************************************************
Then, here is the code that simply tries to copy a file from the local
filesystem (of the remote node) and place it into HDFS, thereby leading to
replication.

The source path is /home/hadoop/Desktop/test.java (on the remote node).
I want to place it in HDFS (/user/hadoop).

package data.pkg;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class Try
{
    public static void main(String[] args)
    {
        Configuration conf_hdfs=new Configuration();
        Configuration conf_remote=new Configuration();


        try
        {
            FileSystem hdfs_filesystem=FileSystem.get(conf_hdfs);
            FileSystem remote_filesystem=FileSystem.getLocal(conf_remote);

            String in_path_name = remote_filesystem + "/home/hadoop/Desktop/test.java";
            Path in_path = new Path(in_path_name);

            String out_path_name = hdfs_filesystem + "";
            Path out_path = new Path("/user/hadoop");

            FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem,
                    out_path, false, false, conf_hdfs);

            System.out.println("Done...!");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }


}

********************************************************


********************************************************
But the following is the error I am getting after its execution:

java.io.FileNotFoundException: File
org.apache.hadoop.fs.LocalFileSystem@15a8767/home/hadoop/Desktop/test.java
does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
        at data.pkg.Try.main(Try.java:103)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
******************************************************************


Briefly, what I have done as of now:


-> Got instances of both filesystems.
-> Passed the paths appropriately.
-> I have also taken care of proxy issues.
-> The file is present at /home/hadoop/Desktop/test.java on the remote node.

*******Also, can you tell me the difference between LocalFileSystem and
RawLocalFileSystem?

Thanking You,


-- 
Regards!
Sugandha

Re: Code not working..!

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Jun 11, 2009 at 7:01 AM, Sugandha Naolekar
<su...@gmail.com>wrote:

> Hello!
>
> I am trying to transfer data from a remote node's filesystem to HDFS. But,
> somehow, it's not working.!
>

First, thanks for the good context and for pasting all of the relevant bits!


>
> ***********************************************************************
> I have a 7 node cluster, It's config file(hadoop-site.xml) is as follows::
>
> <property>
>  <name>fs.default.name</name>
>  <value>hdfs://nikhilname:50130</value>
> </property>
>
> <property>
>  <name>dfs.http.address</name>
>  <value>nikhilname:50070</value>
> </property>
>
> For not getting too lengthy, I am sending u just the important tags. So
> here, nikhilname is the namenode. I have specified its IP in /etc/hosts.
>
> ************************************************************************
>
>
>
> **************************************************************************
> Then, here is the 8th machine(client or remote) which has this config
> file::
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>    <property>
>        <name>fs.default.name</name>
>          <value>hdfs://nikhilname:50130</value>
>    </property>
>
>    <property>
>          <name>dfs.http.address</name>
>              <value>nikhilname:50070</value>
>    </property>
>
> </configuration>
>
> Here, I have pointed fs.default.name to the namenode
>
> **********************************************************
>
>
> **********************************************************
> Then, here is the code that simply tries to copy a file from
> localfilesystem(remote node) and place it into HDFS, thereby leading to
> replication.
>
> The path is /home/hadoop/Desktop/test.java(of remote node)
> I want to place it in HDFS(/user/hadoop)
>
> package data.pkg;
>
> import java.io.IOException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FileUtil;
> import org.apache.hadoop.fs.Path;
>
> public class Try
> {
>    public static void main(String[] args)
>    {
>        Configuration conf_hdfs=new Configuration();
>        Configuration conf_remote=new Configuration();
>
No need to have two different Configuration objects, but fine.

>
>
>        try
>        {
>            FileSystem hdfs_filesystem=FileSystem.get(conf_hdfs);
>            FileSystem remote_filesystem=FileSystem.getLocal(conf_remote);
>
>            String
> in_path_name=remote_filesystem+"/home/hadoop/Desktop/test.java";
>            Path in_path =new Path(in_path_name);
>
>            String out_path_name=hdfs_filesystem+"";

Your issues are here. You don't need to do this concatenation - simply use
"/home/hadoop/Desktop/test.java" and "/user/hadoop/test.java" to construct
the Path objects. They'll resolve to the right Filesystems by virtue of your
passing them to FileUtil.copy.
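
Something like this (an untested sketch, keeping your variable names; the only change from your program is how the Paths are built) is what I mean:

Configuration conf = new Configuration();
FileSystem hdfs_filesystem = FileSystem.get(conf);         // hdfs://nikhilname:50130 from your config
FileSystem remote_filesystem = FileSystem.getLocal(conf);  // local filesystem on the client

// Plain paths -- no FileSystem object prepended to the string.
Path in_path = new Path("/home/hadoop/Desktop/test.java");
Path out_path = new Path("/user/hadoop/test.java");

// Each Path is interpreted against the FileSystem passed next to it.
FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem, out_path,
        false,   // deleteSource
        false,   // overwrite
        conf);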

>
>            Path out_path=new Path("/user/hadoop");
>
>            FileUtil.copy(remote_filesystem,in_path,hdfs_filesystem,
> out_path, false, false,conf_hdfs);
>
>            System.out.println("Done...!");
>        }
>        catch (IOException e)
>        {
>            e.printStackTrace();
>        }
>    }
>
>
> }
>
> ********************************************************
>
>
> ********************************************************
> But, following are the errors I am getting after it's execution....
>
> java.io.FileNotFoundException: File
> org.apache.hadoop.fs.LocalFileSystem@15a8767/home/hadoop/Desktop/test.java
> does not exist.

Notice here that the filename created is the concatenation of a Java
stringification (File ...LocalFileSystem@<address>) with your path. This
obviously is not found.


>
>        at
>
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
>        at
>
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
>        at data.pkg.Try.main(Try.java:103)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> ******************************************************************
>
>
> Briefly what I have done as of now::
>
>
> -> Got the instances of both the filesystems.
> -> Passed the paths appropriately.
> -> I have also taken care of proxy issues
> -> The file is also placed in /home/hadoop/Desktop/test.java on remote
> node.
>
> *******Also, Can you tel me the difference between LocalFileSystem and
> RawFileSystem
>

Raw is not checksummed, whereas Local is.
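
Concretely (just a sketch, assuming you already have a Configuration named conf): LocalFileSystem is a ChecksumFileSystem, so it writes .crc sidecar files and verifies them on read, while RawLocalFileSystem talks to the underlying OS filesystem directly with no checksums. You can get at the raw layer underneath the local one:

LocalFileSystem local = FileSystem.getLocal(conf);  // checksummed local filesystem
FileSystem raw = local.getRawFileSystem();          // underlying RawLocalFileSystem, no checksums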

-Todd