Posted to common-user@hadoop.apache.org by Ranjith <ra...@gmail.com> on 2012/05/22 03:18:30 UTC
CopyFromLocal
I have always wondered about this and am not sure what actually happens. When I fire a MapReduce job to copy data over in a distributed fashion, I would expect to see mappers executing the copy. What happens with a copy command from hadoop fs?
Thanks,
Ranjith
Re: CopyFromLocal
Posted by Ranjith <ra...@gmail.com>.
Harsh,
Thanks for the response, bud. Appreciate it!
Thanks,
Ranjith
On May 21, 2012, at 11:09 PM, Harsh J <ha...@cloudera.com> wrote:
> Ranjith,
>
> MapReduce and HDFS are two different things. MapReduce uses HDFS (and
> can use any other FS as well) to do some efficient work, but HDFS does
> not use MapReduce.
>
> A simple HDFS transfer is done directly over the network. Yes, it's just a
> block-by-block copy/write to/from the relevant DataNodes, done over
> network sockets at each end.
Re: CopyFromLocal
Posted by Harsh J <ha...@cloudera.com>.
Ranjith,
MapReduce and HDFS are two different things. MapReduce uses HDFS (and
can use any other FS as well) to do some efficient work, but HDFS does
not use MapReduce.
A simple HDFS transfer is done directly over the network. Yes, it's just a
block-by-block copy/write to/from the relevant DataNodes, done over
network sockets at each end.
On Tue, May 22, 2012 at 8:58 AM, Ranjith <ra...@gmail.com> wrote:
> Thanks, Harsh. So when it connects directly to the data nodes, it does not fire off any mappers. So how does it get the data over? Is it just a block-by-block copy?
>
> Thanks,
> Ranjith
--
Harsh J
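The block-by-block write described in the reply above can be pictured with a small toy model. This is only a sketch, not HDFS code: the in-memory "DataNodes", the tiny block size, the replication factor, and the round-robin placement are all assumptions made for illustration (in a real cluster the NameNode chooses the target DataNodes and the data travels over network sockets).

```python
import io

BLOCK_SIZE = 4  # bytes; tiny on purpose (real HDFS blocks are typically 64 MB or 128 MB)


def copy_from_local(stream, datanodes, replication=2, block_size=BLOCK_SIZE):
    """Toy model: a single client reads the stream and writes it block by block.

    `datanodes` maps a hypothetical node name to a dict of {block_index: bytes}.
    No mappers are involved; one loop in one thread does all the work.
    """
    names = sorted(datanodes)
    placements = []
    index = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # Assumed placement policy: simple round-robin over the node names.
        targets = [names[(index + r) % len(names)] for r in range(replication)]
        for name in targets:
            datanodes[name][index] = block  # stands in for a socket write
        placements.append(targets)
        index += 1
    return placements


datanodes = {"dn1": {}, "dn2": {}, "dn3": {}}
placements = copy_from_local(io.BytesIO(b"hello hdfs!"), datanodes)
for i, targets in enumerate(placements):
    print(f"block {i} -> {targets}")
```

The point of the model is only that the copy is a sequential loop in the client process, which is why no map tasks show up for a plain 'fs -copyFromLocal'.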
Re: CopyFromLocal
Posted by Ranjith <ra...@gmail.com>.
Thanks, Harsh. So when it connects directly to the data nodes, it does not fire off any mappers. So how does it get the data over? Is it just a block-by-block copy?
Thanks,
Ranjith
On May 21, 2012, at 9:22 PM, Harsh J <ha...@cloudera.com> wrote:
> Ranjith,
>
> Are you speaking of DistCp?
> http://hadoop.apache.org/common/docs/current/distcp.html
>
> An 'fs -copyFromLocal' otherwise just runs as a single program that
> connects to your DFS nodes and writes data from a single client
> thread, and is not distributed on its own.
Re: CopyFromLocal
Posted by Harsh J <ha...@cloudera.com>.
Ranjith,
Are you speaking of DistCp?
http://hadoop.apache.org/common/docs/current/distcp.html
An 'fs -copyFromLocal' otherwise just runs as a single program that
connects to your DFS nodes and writes data from a single client
thread, and is not distributed on its own.
On Tue, May 22, 2012 at 6:48 AM, Ranjith <ra...@gmail.com> wrote:
>
> I have always wondered about this and am not sure what actually happens. When I fire a MapReduce job to copy data over in a distributed fashion, I would expect to see mappers executing the copy. What happens with a copy command from hadoop fs?
>
> Thanks,
> Ranjith
--
Harsh J
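To make the contrast in this thread concrete: 'hadoop fs -copyFromLocal <localsrc> <dst>' runs in one client process, while 'hadoop distcp <src> <dst>' launches a MapReduce job whose map tasks each copy part of the file list. The sketch below is a rough analogy only. The worker names, the file names, and the round-robin partitioning are assumptions for illustration and are not how DistCp actually divides its work.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_file(name, worker):
    # Stand-in for a real transfer; it only records which worker handled the file.
    return (name, worker)


def single_client_copy(files):
    """Like 'fs -copyFromLocal': one client copies every file, one after another."""
    return [copy_file(name, "client") for name in files]


def distributed_copy(files, num_mappers=2):
    """Like DistCp in spirit: the file list is split and each 'mapper' copies its share."""
    # Assumed partitioning: every num_mappers-th file goes to the same mapper.
    partitions = [files[i::num_mappers] for i in range(num_mappers)]

    def run_mapper(i):
        return [copy_file(name, f"mapper-{i}") for name in partitions[i]]

    with ThreadPoolExecutor(max_workers=num_mappers) as pool:
        results = list(pool.map(run_mapper, range(num_mappers)))
    return [item for part in results for item in part]


files = ["a.log", "b.log", "c.log", "d.log"]  # hypothetical file names
single = single_client_copy(files)
distributed = distributed_copy(files)
print(single)
print(distributed)
```

In the first case every file is attributed to the single client; in the second the same files are spread across the simulated mappers, which is the behaviour the original question expected to see.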