Posted to common-user@hadoop.apache.org by unmesha sreeveni <un...@gmail.com> on 2013/12/09 08:22:20 UTC

Which is hdfs?

Can anyone tell me the difference between the cases below?

My cluster is a remote system "sree".

1. I have a "chck" file in my /home/sree.
   I ran:
            > hadoop fs -copyFromLocal /home/sree/chck
            > hadoop fs -ls

                       -rw-r--r--   1 *sree*     supergroup         32 2013-12-03 14:27 chck

               *Does the chck file now reside in hdfs?*
2. After running wordcount on my remote system, my output folder looks
like this:

                    drwxr-xr-x   - *hdfs*      supergroup          0 2013-11-19 09:41 wcout


I am confused: which one is *hdfs*?
The area where *chck* resides, or *wcout*?


3. Can I update/append to the "chck" file through an MR job?

4.  -rw-r--r--   1 *hdfs*     supergroup         32 2013-12-03 14:27 myfile
    Can I update/append to the "myfile" file through an MR job?

**I read that updates are not allowed in hdfs**

-- 
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*

RE: Which is hdfs?

Posted by Vinayakumar B <vi...@huawei.com>.
Hi,

Please find my answers inline; I hope they are helpful.

Thanks and Regards,
Vinayakumar B

From: unmesha sreeveni [mailto:unmeshabiju@gmail.com]
Sent: 09 December 2013 12:52
To: User Hadoop
Subject: Which is hdfs?

Can anyone tell me what is the difference between the below details

My cluster is a remote system "sree".

If you have set fs.defaultFS to "hdfs://<namenode address>", then -copyFromLocal will copy to HDFS.
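
To make the point about the "hdfs://" prefix concrete, here is a minimal sketch that only inspects the scheme of a candidate fs.defaultFS value. It uses plain java.net.URI, not the Hadoop API; the class name DefaultFsCheck and the address "namenode.example:8020" are made up for illustration. Hadoop ships with fs.defaultFS set to "file:///", so a value that does not name the hdfs scheme explicitly is worth double-checking.

```java
import java.net.URI;

final class DefaultFsCheck {
    // Returns the URI scheme of a candidate fs.defaultFS value, or null if it has none.
    static String schemeOf(String defaultFs) {
        return URI.create(defaultFs).getScheme();
    }

    // True only when the value explicitly names the hdfs scheme.
    static boolean pointsAtHdfs(String defaultFs) {
        return "hdfs".equalsIgnoreCase(schemeOf(defaultFs));
    }

    public static void main(String[] args) {
        // Hypothetical NameNode address, used only for illustration.
        System.out.println(pointsAtHdfs("hdfs://namenode.example:8020")); // true
        // The stock default value: the local filesystem.
        System.out.println(pointsAtHdfs("file:///"));                     // false
        // A bare host with no scheme does not name HDFS explicitly.
        System.out.println(pointsAtHdfs("remotesystem-ip/"));             // false
    }
}
```

This only checks the string; whether the cluster is actually reachable at that address is a separate question.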

1. I have a "chck" file in my /home/sree
   I did
            > hadoop fs -copyFromLocal /home/sree/chck
            > hadoop fs -ls

                       -rw-r--r--   1 sree     supergroup         32 2013-12-03 14:27 chck

I think you ran the above commands as the user sree, which is why -ls shows sree as the owner of the file.
Since you did not pass a destination, the file is copied to the /user/sree directory by default. With no path, -ls lists that same directory.
To confirm, run 'hadoop fs -ls /user/sree'.
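
The default-destination rule above can be sketched as a small path-resolution function. This is a simplified model, not Hadoop's own code: it assumes the HDFS home directory is /user/<username>, and the class name HdfsPathSketch is hypothetical.

```java
final class HdfsPathSketch {
    // Models the convention that a relative (or missing) HDFS path is resolved
    // against the caller's home directory, assumed here to be /user/<username>.
    static String resolve(String user, String path) {
        String home = "/user/" + user;
        if (path == null || path.isEmpty()) {
            return home;               // no path given: the home directory itself
        }
        if (path.startsWith("/")) {
            return path;               // absolute paths are used as-is
        }
        return home + "/" + path;      // relative paths land under the home directory
    }

    public static void main(String[] args) {
        System.out.println(resolve("sree", "chck"));   // /user/sree/chck
        System.out.println(resolve("sree", ""));       // /user/sree
        System.out.println(resolve("hdfs", "wcout"));  // /user/hdfs/wcout
    }
}
```

Under this model, the same relative name resolves to a different directory depending on which user runs the command.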

                 whether chck file is now resided in hdfs?
2.After executing wordcount in my remote system my output folder looks like this

                   drwxr-xr-x   - hdfs      supergroup          0 2013-11-19 09:41 wcout

The WordCount job was run as the user hdfs, so -ls shows hdfs as the owner of wcout.

I have a confusion -  which is hdfs?
The area where chck resided or wcout ?

In my opinion, both files are in HDFS, but under different users' home directories.
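
To show that "sree" and "hdfs" in the listings are just the owner column, here is a small sketch that splits an -ls line into fields. It assumes the eight-column layout seen in this thread (permissions, replication, owner, group, size, date, time, name); the class name LsLine is hypothetical, and other Hadoop versions may format the listing differently.

```java
final class LsLine {
    final String permissions;
    final String owner;
    final String group;
    final String name;

    private LsLine(String permissions, String owner, String group, String name) {
        this.permissions = permissions;
        this.owner = owner;
        this.group = group;
        this.name = name;
    }

    // Splits one listing line on whitespace. The limit of 8 keeps any spaces
    // inside the file name together in the last field.
    static LsLine parse(String line) {
        String[] f = line.trim().split("\\s+", 8);
        if (f.length < 8) {
            throw new IllegalArgumentException("unexpected ls line: " + line);
        }
        // f[1] is the replication factor, f[4] the size, f[5] and f[6] the timestamp.
        return new LsLine(f[0], f[2], f[3], f[7]);
    }

    public static void main(String[] args) {
        LsLine file = parse("-rw-r--r--   1 sree     supergroup         32 2013-12-03 14:27 chck");
        LsLine dir = parse("drwxr-xr-x   - hdfs      supergroup          0 2013-11-19 09:41 wcout");
        System.out.println(file.name + " is owned by " + file.owner); // chck is owned by sree
        System.out.println(dir.name + " is owned by " + dir.owner);   // wcout is owned by hdfs
    }
}
```

The third column names a user account; it says nothing about which filesystem holds the entry.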

3. Am i able to update/append "chck" file through MR job?
HDFS supports only append, and from only one client at a time.

4.  -rw-r--r--   1 hdfs     supergroup         32 2013-12-03 14:27 myfile
    Am i able to update/append "myfile" file through MR job?


You can effectively update the file by creating the same file again (overwriting it). Updating an existing file in place is not possible.
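
As a local-filesystem analogy only, the two options above (append to the end, or re-create the whole file) look like this with java.nio.file. This is not the HDFS client API, and the class name AppendVsRecreate is hypothetical; it is just a sketch of the difference between adding bytes at the end and replacing the file with a new copy.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class AppendVsRecreate {
    // Adds text to the end of an existing file; earlier bytes are left untouched.
    static void append(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
    }

    // "Updates" a file by writing a complete new copy over the old one.
    static void recreate(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("myfile", ".txt");
        recreate(file, "first\n");
        append(file, "second\n");        // file now holds both lines
        System.out.print(read(file));
        recreate(file, "replaced\n");    // old contents are gone
        System.out.print(read(file));
        Files.delete(file);
    }
}
```

An MR job that needs to "change" an existing file would, by this analogy, write a new output and replace the old file rather than edit it in place.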

**I read that updation is not allowed in hdfs**

--
Thanks & Regards

Unmesha Sreeveni U.B
Junior Developer



Re: Which is hdfs?

Posted by unmesha sreeveni <un...@gmail.com>.
1. So if we run an MR job from Eclipse by setting
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "remotesystem-ip/");
    conf.set("hadoop.job.ugi", "hdfs");
the results are reflected in my remote cluster.
2. Export the jar file from my home, "scp" it to the remote cluster, and run the jar
on the remote cluster.

Do 1 and 2 yield the same result? (I am not getting the same results :( )
Does Eclipse act as the original remote cluster when we set the following?
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "remotesystem-ip/");
    conf.set("hadoop.job.ugi", "hdfs");



On Tue, Dec 10, 2013 at 12:25 PM, unmesha sreeveni <un...@gmail.com> wrote:

> Thank you vinayakumar and Jagat :)
>
>
> On Mon, Dec 9, 2013 at 3:06 PM, Jagat Singh <ja...@gmail.com> wrote:
>
>> Both are inside hdfs
>>
>> One file is under your name other is under system user hdfs name.
>>
>> Hadoop fs  command says i am accessing hdfs.
>>
>> Read about hdfs append in 2.x version.
>> On 09/12/2013 6:22 PM, "unmesha sreeveni" <un...@gmail.com> wrote:
>>
>>> Can anyone tell me what is the difference between the below details
>>>
>>> My cluster is a remote system "sree".
>>>
>>> 1. I have a "chck" file in my /home/sree
>>>    I did
>>>             > hadoop fs -copFromLocal /home/sree/chck
>>>             > hadoop fs -ls
>>>
>>>                        -rw-r--r--   1 *sree*     supergroup         32
>>> 2013-12-03 14:27 chck
>>>
>>>                *  whether chck file is now resided in hdfs?*
>>> 2.After executing wordcount in my remote system my output folder looks
>>> like this
>>>
>>>                    drwxr-xr-x   - *hdfs*      supergroup          0
>>> 2013-11-19 09:41 wcout
>>>
>>>
>>> I have a confusion -  which is *hdfs*?
>>> The area where *chck* resided or *wcout* ?
>>>
>>>
>>> 3. Am i able to update/append "chck" file through MR job?
>>>
>>> 4.  -rw-r--r--   1 *hdfs*     supergroup         32 2013-12-03 14:27
>>> myfile
>>>     Am i able to update/append "myfile" file through MR job?
>>>
>>> **I read that updation is not allowed in hdfs**
>>>
>>> --
>>> *Thanks & Regards*
>>>
>>> Unmesha Sreeveni U.B
>>>
>>> *Junior Developer*
>>>
>>>
>>>
>
>
> --
> *Thanks & Regards*
>
> Unmesha Sreeveni U.B
>
> *Junior Developer*
>
>
>


-- 
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*

Re: Which is hdfs?

Posted by unmesha sreeveni <un...@gmail.com>.
Thank you vinayakumar and Jagat :)


On Mon, Dec 9, 2013 at 3:06 PM, Jagat Singh <ja...@gmail.com> wrote:

> Both are inside hdfs
>
> One file is under your name other is under system user hdfs name.
>
> Hadoop fs  command says i am accessing hdfs.
>
> Read about hdfs append in 2.x version.
> On 09/12/2013 6:22 PM, "unmesha sreeveni" <un...@gmail.com> wrote:
>
>> Can anyone tell me what is the difference between the below details
>>
>> My cluster is a remote system "sree".
>>
>> 1. I have a "chck" file in my /home/sree
>>    I did
>>             > hadoop fs -copFromLocal /home/sree/chck
>>             > hadoop fs -ls
>>
>>                        -rw-r--r--   1 *sree*     supergroup         32
>> 2013-12-03 14:27 chck
>>
>>                *  whether chck file is now resided in hdfs?*
>> 2.After executing wordcount in my remote system my output folder looks
>> like this
>>
>>                    drwxr-xr-x   - *hdfs*      supergroup          0
>> 2013-11-19 09:41 wcout
>>
>>
>> I have a confusion -  which is *hdfs*?
>> The area where *chck* resided or *wcout* ?
>>
>>
>> 3. Am i able to update/append "chck" file through MR job?
>>
>> 4.  -rw-r--r--   1 *hdfs*     supergroup         32 2013-12-03 14:27
>> myfile
>>     Am i able to update/append "myfile" file through MR job?
>>
>> **I read that updation is not allowed in hdfs**
>>
>> --
>> *Thanks & Regards*
>>
>> Unmesha Sreeveni U.B
>>
>> *Junior Developer*
>>
>>
>>


-- 
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*

Re: Which is hdfs?

Posted by Jagat Singh <ja...@gmail.com>.
Both are inside HDFS.

One file is under your user name; the other is under the system user hdfs.

Using the "hadoop fs" command means you are accessing HDFS.

Read about HDFS append in the 2.x versions.
On 09/12/2013 6:22 PM, "unmesha sreeveni" <un...@gmail.com> wrote:

> Can anyone tell me what is the difference between the below details
>
> My cluster is a remote system "sree".
>
> 1. I have a "chck" file in my /home/sree
>    I did
>             > hadoop fs -copFromLocal /home/sree/chck
>             > hadoop fs -ls
>
>                        -rw-r--r--   1 *sree*     supergroup         32
> 2013-12-03 14:27 chck
>
>                *  whether chck file is now resided in hdfs?*
> 2.After executing wordcount in my remote system my output folder looks
> like this
>
>                    drwxr-xr-x   - *hdfs*      supergroup          0
> 2013-11-19 09:41 wcout
>
>
> I have a confusion -  which is *hdfs*?
> The area where *chck* resided or *wcout* ?
>
>
> 3. Am i able to update/append "chck" file through MR job?
>
> 4.  -rw-r--r--   1 *hdfs*     supergroup         32 2013-12-03 14:27
> myfile
>     Am i able to update/append "myfile" file through MR job?
>
> **I read that updation is not allowed in hdfs**
>
> --
> *Thanks & Regards*
>
> Unmesha Sreeveni U.B
>
> *Junior Developer*
>
>
>
