Posted to mapreduce-user@hadoop.apache.org by Hao Ren <h....@claravista.fr> on 2013/07/11 15:27:09 UTC

copy files from ftp to hdfs in parallel, distcp failed

Hi,

I am running HDFS on Amazon EC2.

Say I have an FTP server that stores some data.

I just want to copy this data directly to HDFS in parallel (which may 
be more efficient).

I think hadoop distcp is what I need.

But

     $ bin/hadoop distcp ftp://username:passwd@hostname/some/path/ hdfs://namenode/some/path

doesn't work.

     13/07/05 16:13:46 INFO tools.DistCp: srcPaths=[ftp://username:passwd@hostname/some/path/]
     13/07/05 16:13:46 INFO tools.DistCp: destPath=hdfs://namenode/some/path
     Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source ftp://username:passwd@hostname/some/path/ does not exist.
     at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:641)
     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

I checked the path by pasting the FTP URL into Chrome: the file 
really exists, and I can even download it.

And then, I tried to list the files under the path with:

     $ bin/hadoop dfs -ls ftp://username:passwd@hostname/some/path/

It ends with:

     ls: Cannot access ftp://username:passwd@hostname/some/path/: No such file or directory.

That seems to be the same problem.

Any workaround here?
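
(One variant that may be worth trying: since DistCp runs through ToolRunner, 
as ToolRunner.run in the stack trace above shows, generic -D options are 
accepted, so the FTP credentials can be passed as configuration instead of 
being embedded in the URI. A sketch, assuming the per-host 
fs.ftp.user.<host> and fs.ftp.password.<host> keys read by FTPFileSystem; 
hostname and paths are the same placeholders as above:

     $ bin/hadoop distcp \
         -D fs.ftp.user.hostname=username \
         -D fs.ftp.password.hostname=passwd \
         ftp://hostname/some/path/ hdfs://namenode/some/path

This also sidesteps URL-encoding problems when the password contains 
special characters.)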

Thank you in advance.

Hao.

-- 
Hao Ren
ClaraVista
www.claravista.fr

Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Shlash <a....@ymail.com>.
Hi,
Can you help me solve this problem please, if you have solved it?
Best regards

Shlash


Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Hao Ren <h....@claravista.fr>.
Hi,

I am just wondering whether I can move data from FTP to HDFS via Hadoop 
distcp.

Can someone give me an example?

In my case, I always encounter the "can not access ftp" error.

I am quite sure that the link, login, and password are correct; in fact, 
I just copied and pasted the FTP address into Firefox, and it works there. 
However, it doesn't work with:

    bin/hadoop fs -ls ftp://<my ftp location>

Any workaround here?

Thank you.

Hao

On 16/07/2013 17:47, Hao Ren wrote:
> Hi,
>
> Actually, I test with my own ftp host at first, however it doesn't work.
>
> Then I changed it into 0.0.0.0.
>
> But I always get the "can not access ftp" msg.
>
> Thank you .
>
> Hao.
>
> On 16/07/2013 17:03, Ram wrote:
>> Hi,
>>     Please replace 0.0.0.0 with your FTP host IP address and try it.
>>
>> Hi,
>>
>>
>>
>> From,
>> Ramesh.
>>
>>
>>
>>
>> On Mon, Jul 15, 2013 at 3:22 PM, Hao Ren <h.ren@claravista.fr 
>> <ma...@claravista.fr>> wrote:
>>
>>     Thank you, Ram
>>
>>     I have configured core-site.xml as following:
>>
>>     <?xml version="1.0"?>
>>     <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>>     <!-- Put site-specific property overrides in this file. -->
>>
>>     <configuration>
>>
>>         <property>
>>             <name>hadoop.tmp.dir</name>
>>     <value>/vol/persistent-hdfs</value>
>>         </property>
>>
>>         <property>
>>             <name>fs.default.name</name>
>>             <value>hdfs://ec2-23-23-33-234.compute-1.amazonaws.com:9010</value>
>>         </property>
>>
>>         <property>
>>             <name>io.file.buffer.size</name>
>>             <value>65536</value>
>>         </property>
>>
>>         <property>
>>             <name>fs.ftp.host</name>
>>             <value>0.0.0.0</value>
>>         </property>
>>
>>         <property>
>>             <name>fs.ftp.host.port</name>
>>             <value>21</value>
>>         </property>
>>
>>     </configuration>
>>
>>     Then I tried  hadoop fs -ls file:/// , and it works.
>>     But hadoop fs -ls ftp://<login>:<password>@<ftp server
>>     ip>/<directory>/ still doesn't work:
>>         ls: Cannot access ftp://<user>:<password>@<ftp server
>>     ip>/<directory>/: No such file or directory.
>>
>>     When omitting <directory>, as in:
>>
>>     hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/
>>
>>     there are no error msgs, but it lists nothing.
>>
>>
>>     I have also checked the rights on my /home/<user> directory:
>>
>>     drwxr-xr-x 11 <user> <user>  4096 Jul 11 16:30 <user>
>>
>>     and all the files under /home/<user> have rights 755.
>>
>>     I can easily copy the link ftp://<user>:<password>@<ftp server
>>     ip>/<directory>/ into Firefox; it lists all the files as expected.
>>
>>     Any workaround here?
>>
>>     Thank you.
>>
>>     On 12/07/2013 14:01, Ram wrote:
>>>     Please configure the following in core-site.xml and try.
>>>        Use hadoop fs -ls file:///  -- to display local file system files
>>>        Use hadoop fs -ls ftp://<your ftp location>   -- to display
>>>     ftp files; if it lists files, go for distcp.
>>>
>>>     reference from
>>>     http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>>>
>>>     fs.ftp.host        0.0.0.0   FTP filesystem connects to this server
>>>     fs.ftp.host.port   21        FTP filesystem connects to fs.ftp.host on this port
>>>
>>
>>
>>     -- 
>>     Hao Ren
>>     ClaraVista
>>     www.claravista.fr
>>
>>
>
>
> -- 
> Hao Ren
> ClaraVista
> www.claravista.fr


-- 
Hao Ren
ClaraVista
www.claravista.fr


Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Hao Ren <h....@claravista.fr>.
Hi,

Actually, I tested with my own FTP host at first, but it didn't work.

Then I changed it to 0.0.0.0.

But I always get the "can not access ftp" msg.

Thank you.

Hao.

On 16/07/2013 17:03, Ram wrote:
> Hi,
>     Please replace 0.0.0.0 with your FTP host IP address and try it.
>
> Hi,
>
>
>
> From,
> Ramesh.
>
>
>
>
> On Mon, Jul 15, 2013 at 3:22 PM, Hao Ren <h.ren@claravista.fr 
> <ma...@claravista.fr>> wrote:
>
>     Thank you, Ram
>
>     I have configured core-site.xml as following:
>
>     <?xml version="1.0"?>
>     <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
>     <!-- Put site-specific property overrides in this file. -->
>
>     <configuration>
>
>         <property>
>             <name>hadoop.tmp.dir</name>
>             <value>/vol/persistent-hdfs</value>
>         </property>
>
>         <property>
>             <name>fs.default.name</name>
>             <value>hdfs://ec2-23-23-33-234.compute-1.amazonaws.com:9010</value>
>         </property>
>
>         <property>
>             <name>io.file.buffer.size</name>
>             <value>65536</value>
>         </property>
>
>         <property>
>             <name>fs.ftp.host</name>
>             <value>0.0.0.0</value>
>         </property>
>
>         <property>
>             <name>fs.ftp.host.port</name>
>             <value>21</value>
>         </property>
>
>     </configuration>
>
>     Then I tried  hadoop fs -ls file:/// , and it works.
>     But hadoop fs -ls ftp://<login>:<password>@<ftp server
>     ip>/<directory>/ still doesn't work:
>         ls: Cannot access ftp://<user>:<password>@<ftp server
>     ip>/<directory>/: No such file or directory.
>
>     When omitting <directory>, as in:
>
>     hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/
>
>     there are no error msgs, but it lists nothing.
>
>
>     I have also checked the rights on my /home/<user> directory:
>
>     drwxr-xr-x 11 <user> <user>  4096 Jul 11 16:30 <user>
>
>     and all the files under /home/<user> have rights 755.
>
>     I can easily copy the link ftp://<user>:<password>@<ftp server
>     ip>/<directory>/ into Firefox; it lists all the files as expected.
>
>     Any workaround here?
>
>     Thank you.
>
>     On 12/07/2013 14:01, Ram wrote:
>>     Please configure the following in core-site.xml and try.
>>        Use hadoop fs -ls file:///  -- to display local file system files
>>        Use hadoop fs -ls ftp://<your ftp location>   -- to display
>>     ftp files; if it lists files, go for distcp.
>>
>>     reference from
>>     http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>>
>>     fs.ftp.host        0.0.0.0   FTP filesystem connects to this server
>>     fs.ftp.host.port   21        FTP filesystem connects to fs.ftp.host on this port
>>
>
>
>     -- 
>     Hao Ren
>     ClaraVista
>     www.claravista.fr
>
>


-- 
Hao Ren
ClaraVista
www.claravista.fr


Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Ram <pr...@gmail.com>.
Hi,
    Please replace 0.0.0.0 with your FTP host IP address and try it.
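
For example, the relevant properties would then look like this (a sketch; 
ftp.example.com is a placeholder for your real FTP server address):

    <property>
        <name>fs.ftp.host</name>
        <value>ftp.example.com</value>
    </property>

    <property>
        <name>fs.ftp.host.port</name>
        <value>21</value>
    </property>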

Hi,



From,
Ramesh.




On Mon, Jul 15, 2013 at 3:22 PM, Hao Ren <h....@claravista.fr> wrote:

>  Thank you, Ram
>
> I have configured core-site.xml as following:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>     <property>
>         <name>hadoop.tmp.dir</name>
>         <value>/vol/persistent-hdfs</value>
>     </property>
>
>     <property>
>         <name>fs.default.name</name>
>         <value>hdfs://ec2-23-23-33-234.compute-1.amazonaws.com:9010</value>
>     </property>
>
>     <property>
>         <name>io.file.buffer.size</name>
>         <value>65536</value>
>     </property>
>
>     <property>
>         <name>fs.ftp.host</name>
>         <value>0.0.0.0</value>
>     </property>
>
>     <property>
>         <name>fs.ftp.host.port</name>
>         <value>21</value>
>     </property>
>
> </configuration>
>
> Then I tried  hadoop fs -ls file:/// , and it works.
> But hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/<directory>/
> still doesn't work:
>     ls: Cannot access ftp://<user>:<password>@<ftp server
> ip>/<directory>/: No such file or directory.
>
> When omitting <directory>, as in:
>
> hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/
>
> there are no error msgs, but it lists nothing.
>
>
> I have also checked the rights on my /home/<user> directory:
>
> drwxr-xr-x 11 <user> <user>  4096 Jul 11 16:30 <user>
>
> and all the files under /home/<user> have rights 755.
>
> I can easily copy the link ftp://<user>:<password>@<ftp server
> ip>/<directory>/ into Firefox; it lists all the files as expected.
>
> Any workaround here?
>
> Thank you.
>
> On 12/07/2013 14:01, Ram wrote:
>
> Please configure the following in core-site.xml and try.
>    Use hadoop fs -ls file:///  -- to display local file system files
>    Use hadoop fs -ls ftp://<your ftp location>   -- to display ftp files;
> if it lists files, go for distcp.
>
>  reference from
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>
>    fs.ftp.host        0.0.0.0   FTP filesystem connects to this server
>    fs.ftp.host.port   21        FTP filesystem connects to fs.ftp.host on this port
>
>
>
> --
> Hao Ren
> ClaraVista
> www.claravista.fr
>
>

Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Hao Ren <h....@claravista.fr>.
Thank you, Ram

I have configured core-site.xml as following:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

     <property>
         <name>hadoop.tmp.dir</name>
         <value>/vol/persistent-hdfs</value>
     </property>

     <property>
         <name>fs.default.name</name>
         <value>hdfs://ec2-23-23-33-234.compute-1.amazonaws.com:9010</value>
     </property>

     <property>
         <name>io.file.buffer.size</name>
         <value>65536</value>
     </property>

     <property>
         <name>fs.ftp.host</name>
         <value>0.0.0.0</value>
     </property>

     <property>
         <name>fs.ftp.host.port</name>
         <value>21</value>
     </property>

</configuration>

Then I tried  hadoop fs -ls file:/// , and it works.
But hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/<directory>/ 
still doesn't work:
     ls: Cannot access ftp://<user>:<password>@<ftp server 
ip>/<directory>/: No such file or directory.

When omitting <directory>, as in:

hadoop fs -ls ftp://<login>:<password>@<ftp server ip>/

there are no error msgs, but it lists nothing.
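
(A quick way to see what is actually failing underneath: raise the client 
log level, which the bin/hadoop script reads from the HADOOP_ROOT_LOGGER 
environment variable. A sketch, with the same placeholder URI as above; 
quoting the URI also keeps the shell from mangling special characters in 
the password:

    $ HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls \
        'ftp://<login>:<password>@<ftp server ip>/<directory>/'

The DEBUG output includes the FTPFileSystem activity, which is far more 
informative than the terse "No such file or directory".)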


I have also checked the rights on my /home/<user> directory:

drwxr-xr-x 11 <user> <user>  4096 Jul 11 16:30 <user>

and all the files under /home/<user> have rights 755.

I can easily copy the link ftp://<user>:<password>@<ftp server 
ip>/<directory>/ into Firefox; it lists all the files as expected.

Any workaround here?

Thank you.

On 12/07/2013 14:01, Ram wrote:
> Please configure the following in core-site.xml and try.
>    Use hadoop fs -ls file:///  -- to display local file system files
>    Use hadoop fs -ls ftp://<your ftp location>   -- to display ftp 
> files; if it lists files, go for distcp.
>
> reference from 
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>
> fs.ftp.host 	0.0.0.0 	FTP filesystem connects to this server
> fs.ftp.host.port 	21 	FTP filesystem connects to fs.ftp.host on this port
>


-- 
Hao Ren
ClaraVista
www.claravista.fr


Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Ram <pr...@gmail.com>.
Hi,
   Please configure the following in core-site.xml and try.
   Use hadoop fs -ls file:///  -- to display local file system files
   Use hadoop fs -ls ftp://<your ftp location>   -- to display ftp files; if
it lists files, go for distcp.

reference from
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml


fs.ftp.host        0.0.0.0   FTP filesystem connects to this server
fs.ftp.host.port   21        FTP filesystem connects to fs.ftp.host on this port

and try to set the filesystem implementation property as well.

reference from hadoop definitive guide hadoop file system.

Filesystem   URI scheme   Java implementation (all under org.apache.hadoop)   Description

FTP          ftp          fs.ftp.FTPFileSystem                                A filesystem backed by an FTP server.
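
In core-site.xml that mapping would look like this (a sketch; fs.ftp.impl 
is the standard fs.<scheme>.impl key that binds the ftp:// scheme to its 
implementation class):

    <property>
        <name>fs.ftp.impl</name>
        <value>org.apache.hadoop.fs.ftp.FTPFileSystem</value>
    </property>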


Hi,



From,
Ramesh.




On Fri, Jul 12, 2013 at 1:04 PM, Hao Ren <h....@claravista.fr> wrote:

> On 11/07/2013 20:47, Balaji Narayanan (பாலாஜி நாராயணன்) wrote:
>
>> multiple copy jobs to hdfs
>>
>
> Thank you for your reply and the link.
>
> I read the link before, but I didn't find any examples about copying files
> from ftp to hdfs.
>
> There are about 20-40 files in my directory. I just want to move or copy
> that directory to hdfs on Amazon EC2.
>
> Actually, I am new to hadoop. I would like to know how to do multiple copy
> jobs to hdfs without distcp.
>
> Thank you again.
>
>
> --
> Hao Ren
> ClaraVista
> www.claravista.fr
>

Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by Hao Ren <h....@claravista.fr>.
On 11/07/2013 20:47, Balaji Narayanan (பாலாஜி நாராயணன்) wrote:
> multiple copy jobs to hdfs

Thank you for your reply and the link.

I read the link before, but I didn't find any examples about copying 
files from ftp to hdfs.

There are about 20-40 files in my directory. I just want to move or copy 
that directory to hdfs on Amazon EC2.

Actually, I am new to hadoop. I would like to know how to do multiple 
copy jobs to hdfs without distcp.

Thank you again.

-- 
Hao Ren
ClaraVista
www.claravista.fr

Re: copy files from ftp to hdfs in parallel, distcp failed

Posted by "Balaji Narayanan (பாலாஜி நாராயணன்)" <li...@balajin.net>.
On 11 July 2013 06:27, Hao Ren <h....@claravista.fr> wrote:

> Hi,
>
> I am running a hdfs on Amazon EC2
>
> Say, I have a ftp server where stores some data.
>

I just want to copy these data directly to hdfs in a parallel way (which
> maybe more efficient).
>
> I think hadoop distcp is what I need.
>

http://hadoop.apache.org/docs/stable/distcp.html

DistCp (distributed copy) is a tool used for large inter/intra-cluster
copying. It uses MapReduce to effect its distribution, error handling and
recovery, and reporting.


I doubt this is going to help. Are these a lot of files? If yes, how about
multiple copy jobs to hdfs?
-balaji
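
(For concreteness, "multiple copy jobs" can be as simple as pulling each 
file from the FTP server and streaming it straight into HDFS, several 
transfers at a time. A sketch, assuming curl is available and files.txt 
lists one remote file name per line; the credentials, host, and paths are 
the placeholders from earlier in the thread:

    # stream each FTP file into HDFS, backgrounding the transfers
    while read f; do
        curl -s "ftp://username:passwd@hostname/some/path/$f" \
          | hadoop fs -put - "hdfs://namenode/some/path/$f" &
    done < files.txt
    wait   # block until all background transfers finish

hadoop fs -put with "-" as the source reads from stdin, so nothing needs 
to be staged on local disk along the way.)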
