Posted to user@hadoop.apache.org by Mohit Anchlia <mo...@gmail.com> on 2013/03/12 21:46:10 UTC

Replication factor

Is it possible to set replication factor to a different value than the
default at the directory level?

Re: Replication factor

Posted by Harsh J <ha...@cloudera.com>.
Hi,

On Wed, Mar 13, 2013 at 2:56 AM, Mohit Anchlia <mo...@gmail.com> wrote:
> Does it mean if I set replication factor on directory /abc and I run a -put
> command and add a file to the directory it will use the new replication
> factor set on the directory /abc?

No, the client sets the desired replication factor requirement when it
invokes the create functions at the NameNode. If not present
explicitly in the invocation call, a default from the client is sent
(which is why setting dfs.replication at a specific client works).
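
As a rough illustration of the resolution order described above, here is a
minimal Java sketch. It is a stdlib-only model, not Hadoop code: the class
ClientReplication and the method resolve are hypothetical names, and a plain
Map stands in for the client's configuration. The only facts taken from HDFS
are the key name dfs.replication and its shipped default of 3. In a real
client the explicit value would typically come from something like
FileSystem.create(Path, short) or "hadoop fs -D dfs.replication=2 -put ...".

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical model of how a client picks the replication factor it sends
// with a create call. The parent directory is never consulted.
final class ClientReplication {
    static final String KEY = "dfs.replication";
    static final short FALLBACK = 3; // shipped default for dfs.replication

    // explicit:   value passed in the create call, if any
    // clientConf: stand-in for the client's own configuration
    static short resolve(OptionalInt explicit, Map<String, String> clientConf) {
        if (explicit.isPresent()) {
            return (short) explicit.getAsInt();
        }
        String configured = clientConf.get(KEY);
        return configured != null ? Short.parseShort(configured.trim()) : FALLBACK;
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(KEY, "2");
        System.out.println(resolve(OptionalInt.empty(), conf));     // client default
        System.out.println(resolve(OptionalInt.of(5), conf));       // explicit value wins
        System.out.println(resolve(OptionalInt.empty(), Map.of())); // nothing configured
    }
}
```

Whichever value comes out of this step is what ends up recorded for the new
file; nothing in it depends on where the file is being written.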

>>> On Wed, Mar 13, 2013 at 2:16 AM, Mohit Anchlia <mo...@gmail.com>
>>> wrote:
>>>>
>>>> Is it possible to set replication factor to a different value than the
>>>> default at the directory level?

No, the metadata signifying the value of replication is held at the
file inode alone; the directories are not aware of, and do not maintain,
the replication counts of the files under them.

-- 
Harsh J

Re: Replication factor

Posted by Bertrand Dechoux <de...@gmail.com>.
The best way would be to test it. The provided links indeed do not seem to
help.

I would say the default replication factor is the one found in the
configuration and it can be overridden at runtime. I don't remember
anything about using the parent directory in order to find the "default
replication factor". The use case is understandable though. If the file
tree is clean, one could expect many use cases where, under one directory,
each file has the same replication factor but that replication factor
may change from one directory to another.

(A ticket could be opened on the JIRA if none exists on that subject;
that way a discussion could start about the interest and difficulty of a
patch.)

I am afraid you will have to specify the replication factor yourself and/or
build yourself a mapping.
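
One possible shape for such a mapping, as a minimal Java sketch: keep a table
from directory to replication factor on the client side and look up the
longest matching directory before writing a file. Everything here is an
assumption for illustration (the class ReplicationMapping, its methods, and
the example paths); it uses only the Java standard library and does not talk
to HDFS. The value it returns would still have to be passed explicitly when
the file is created.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side table: directory -> desired replication factor.
final class ReplicationMapping {
    private final Map<String, Short> byDirectory = new LinkedHashMap<>();
    private final short fallback;

    ReplicationMapping(short fallback) {
        this.fallback = fallback;
    }

    ReplicationMapping put(String directory, int replication) {
        byDirectory.put(normalize(directory), (short) replication);
        return this;
    }

    // Returns the factor of the deepest registered directory containing
    // filePath, or the fallback when no registered directory contains it.
    short replicationFor(String filePath) {
        String best = null;
        for (String dir : byDirectory.keySet()) {
            boolean under = dir.equals("/") || filePath.startsWith(dir + "/");
            if (under && (best == null || dir.length() > best.length())) {
                best = dir;
            }
        }
        return best == null ? fallback : byDirectory.get(best);
    }

    // Drop a trailing slash so "/abc/" and "/abc" are the same key.
    private static String normalize(String dir) {
        return dir.length() > 1 && dir.endsWith("/")
                ? dir.substring(0, dir.length() - 1)
                : dir;
    }

    public static void main(String[] args) {
        ReplicationMapping mapping = new ReplicationMapping((short) 3)
                .put("/abc", 2)
                .put("/abc/archive/", 1);
        System.out.println(mapping.replicationFor("/abc/file.txt"));
        System.out.println(mapping.replicationFor("/abc/archive/old.txt"));
        System.out.println(mapping.replicationFor("/abcd/file.txt"));
    }
}
```

Matching on "dir + /" rather than on the bare prefix keeps /abcd from being
treated as a child of /abc.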

Regards

Bertrand

On Tue, Mar 12, 2013 at 10:26 PM, Mohit Anchlia <mo...@gmail.com> wrote:

> Does it mean if I set replication factor on directory /abc and I run a
> -put command and add a file to the directory it will use the new
> replication factor set on the directory /abc?
>
> On Tue, Mar 12, 2013 at 2:04 PM, Chris Embree <ce...@gmail.com> wrote:
>
>> Aww..  You could've used lmgtfy.com :)
>>
>>
>> On Tue, Mar 12, 2013 at 4:57 PM, varun kumar <va...@gmail.com> wrote:
>>
>>> http://hadoopblogfromvarun.wordpress.com/
>>>
>>>
>>> On Wed, Mar 13, 2013 at 2:16 AM, Mohit Anchlia <mo...@gmail.com> wrote:
>>>
>>>> Is it possible to set replication factor to a different value than the
>>>> default at the directory level?
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Varun Kumar.P
>>>
>>
>>
>


-- 
Bertrand Dechoux

Re: Replication factor

Posted by Mohit Anchlia <mo...@gmail.com>.
Does it mean if I set replication factor on directory /abc and I run a -put
command and add a file to the directory it will use the new replication
factor set on the directory /abc?

On Tue, Mar 12, 2013 at 2:04 PM, Chris Embree <ce...@gmail.com> wrote:

> Aww..  You could've used lmgtfy.com :)
>
>
> On Tue, Mar 12, 2013 at 4:57 PM, varun kumar <va...@gmail.com> wrote:
>
>> http://hadoopblogfromvarun.wordpress.com/
>>
>>
>> On Wed, Mar 13, 2013 at 2:16 AM, Mohit Anchlia <mo...@gmail.com> wrote:
>>
>>> Is it possible to set replication factor to a different value than the
>>> default at the directory level?
>>
>>
>>
>>
>> --
>> Regards,
>> Varun Kumar.P
>>
>
>

Re: Replication factor

Posted by Chris Embree <ce...@gmail.com>.
Aww..  You could've used lmgtfy.com :)

On Tue, Mar 12, 2013 at 4:57 PM, varun kumar <va...@gmail.com> wrote:

> http://hadoopblogfromvarun.wordpress.com/
>
>
> On Wed, Mar 13, 2013 at 2:16 AM, Mohit Anchlia <mo...@gmail.com> wrote:
>
>> Is it possible to set replication factor to a different value than the
>> default at the directory level?
>
>
>
>
> --
> Regards,
> Varun Kumar.P
>

Re: Replication factor

Posted by varun kumar <va...@gmail.com>.
http://hadoopblogfromvarun.wordpress.com/

On Wed, Mar 13, 2013 at 2:16 AM, Mohit Anchlia <mo...@gmail.com> wrote:

> Is it possible to set replication factor to a different value than the
> default at the directory level?




-- 
Regards,
Varun Kumar.P
