Posted to user@hadoop.apache.org by Jun Li <jl...@gmail.com> on 2013/09/10 20:08:19 UTC

can the parameters dfs.block.size and dfs.replication be different from one file to the other

Hi,

I am trying to evaluate MapReduce with different settings. I wonder
whether the following two HDFS parameters:

*dfs.block.size
*dfs.replication

can be set at the time I load a file into HDFS (that is, are they
client-side settings)? Or are these system-wide settings that cannot be
changed from the HDFS client invocation?

I am using Hadoop 1.1.2 (the recent stable release), rather than the new
Hadoop 2.x. From reading the Cloudera documentation, I wonder: even if such
parameters can be set per HDFS client, is that supported only from a
certain Hadoop version onward?

Thank you!

Jun

Re: can the parameters dfs.block.size and dfs.replication be different from one file to the other

Posted by Shahab Yunus <sh...@gmail.com>.
Yes, my example was for Hadoop 2.0.0-cdh4.2.0

Try these properties instead:
dfs.block.size
dfs.replication
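
As a minimal sketch of how these two properties might be passed on upload, the script below builds a `-put` command with both set through the generic `-D` option. The property names are the ones given above; `local_name` and `remote_location` are the placeholders used elsewhere in this thread, the replication factor of 2 is only an illustrative value, and the helper functions `block_bytes` and `put_cmd` are hypothetical names introduced here. It has not been verified against Hadoop 1.1.2, so the script prints the command for review rather than running it.

```shell
#!/bin/sh
# Convert a block size given in MB to the byte count the property expects.
block_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Build the upload command. local_name and remote_location are placeholders
# from the thread; arguments are block size in MB and replication factor.
put_cmd() {
  echo "hadoop fs -D dfs.block.size=$(block_bytes "$1") -D dfs.replication=$2 -put local_name remote_location"
}

# Print the command for review (128 MB blocks, 2 replicas as example values).
put_cmd 128 2
```

In Hadoop 2.x the block size property is named dfs.blocksize, with dfs.block.size kept as a deprecated alias. If your release's `-stat` supports the `%o` and `%r` format specifiers, something like `hadoop fs -stat "%o %r" remote_location` may help confirm what the file actually received.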

More details here:
http://cloudfront.blogspot.in/2012/07/how-to-configure-hadoop.html#.UjCV5WRrNtI
http://hadoop.apache.org/docs/stable/cluster_setup.html#Site+Configuration

Regards
Shahab


On Tue, Sep 10, 2013 at 4:13 PM, Jun Li <jl...@gmail.com> wrote:

> Hello Shahab,
>
> Thanks for the reply. Typically, to invoke the HDFS client, I use
> "bin/hadoop dfs ...". But the command that you used, "hadoop fs ...", makes
> me wonder whether this is a Hadoop 2.* client command. Could you clarify
> for me whether "-D fs.local.block.size" is supported in Hadoop 1.1 or not?
>
> Thank you!
>
> Jun
>
>
>
> On Tue, Sep 10, 2013 at 11:38 AM, Shahab Yunus <sh...@gmail.com> wrote:
>
>> "can be set at the time I load the file to the HDFS (that is, it is the
>> client side setting)? "
>> I don't think you can do this while reading. These are done at the time
>> of writing.
>>
>> You can do it like this (the example is for CLI as evident):
>>
>> hadoop fs -D fs.local.block.size=134217728 -put local_name remote_location
>>
>> Same is applicable with replication property.
>>
>> So given that, I think you have to modify the FileOutputFormat (and
>> other 'writing' classes) to allow these to be configurable at the time
>> files are being generated by M/R
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Sep 10, 2013 at 2:08 PM, Jun Li <jl...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to evaluate the MapReduce with different setting. I wonder
>>> whether the following two HDFS parameters:
>>>
>>> *dfs.block.size
>>> *dfs.replication
>>>
>>> can be set at the time I load the file to the HDFS (that is, it is the
>>> client side setting)?  or these are the system parameter settings that can
>>> not be changed from the HDFS client invocation.
>>>
>>>
>>> I am using Hadoop 1.1.2 (the recent stable release), rather than the new
>>> Hadoop 2.x. By reading the  Cloudera document, I wonder even if such
>>> parameters can be set per HDFS client, will it be supported only after
>>> certain Hadoop version?
>>>
>>> Thank you!
>>>
>>> Jun
>>>
>>>
>>
>

Re: can the parameters dfs.block.size and dfs.replication be different from one file to the other

Posted by Jun Li <jl...@gmail.com>.
Hello Shahab,

Thanks for the reply. Typically, to invoke the HDFS client, I use
"bin/hadoop dfs ...". But the command that you used, "hadoop fs ...", makes
me wonder whether this is a Hadoop 2.* client command. Could you clarify
for me whether "-D fs.local.block.size" is supported in Hadoop 1.1 or not?

Thank you!

Jun



On Tue, Sep 10, 2013 at 11:38 AM, Shahab Yunus <sh...@gmail.com> wrote:

> "can be set at the time I load the file to the HDFS (that is, it is the
> client side setting)? "
> I don't think you can do this while reading. These are done at the time of
> writing.
>
> You can do it like this (the example is for CLI as evident):
>
> hadoop fs -D fs.local.block.size=134217728 -put local_name remote_location
>
> Same is applicable with replication property.
>
> So given that, I think you have to modify the FileOutputFormat (and
> other 'writing' classes) to allow these to be configurable at the time
> files are being generated by M/R
>
> Regards,
> Shahab
>
>
> On Tue, Sep 10, 2013 at 2:08 PM, Jun Li <jl...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to evaluate the MapReduce with different setting. I wonder
>> whether the following two HDFS parameters:
>>
>> *dfs.block.size
>> *dfs.replication
>>
>> can be set at the time I load the file to the HDFS (that is, it is the
>> client side setting)?  or these are the system parameter settings that can
>> not be changed from the HDFS client invocation.
>>
>>
>> I am using Hadoop 1.1.2 (the recent stable release), rather than the new
>> Hadoop 2.x. By reading the  Cloudera document, I wonder even if such
>> parameters can be set per HDFS client, will it be supported only after
>> certain Hadoop version?
>>
>> Thank you!
>>
>> Jun
>>
>>
>

Re: can the parameters dfs.block.size and dfs.replication be different from one file to the other

Posted by Shahab Yunus <sh...@gmail.com>.
"can be set at the time I load the file to the HDFS (that is, it is the
client side setting)? "
I don't think you can do this while reading. These are set at the time of
writing.

You can do it like this (a CLI example):

hadoop fs -D fs.local.block.size=134217728 -put local_name remote_location

The same applies to the replication property.

So given that, I think you have to modify FileOutputFormat (and
other 'writing' classes) to allow these to be configurable at the time
files are being generated by M/R.
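
One possible alternative to changing the output classes, sketched here under assumptions: if the job's driver is launched through ToolRunner, generic `-D` options are parsed into the job configuration, and files the job writes through that configuration would generally be expected to pick up the values. The jar name, driver class, and directory names below are hypothetical, the property names are the HDFS ones from the original question, and none of this has been verified on Hadoop 1.1.2, so the script only prints the command.

```shell
#!/bin/sh
# Hypothetical job jar and driver class; replace with your own.
JAR=my-job.jar
DRIVER=MyDriver

# Build a job launch command whose generic -D options set the block size
# (in bytes) and the replication factor in the job configuration.
# input_dir and output_dir are placeholder paths.
job_cmd() {
  echo "hadoop jar $JAR $DRIVER -D dfs.block.size=$1 -D dfs.replication=$2 input_dir output_dir"
}

# Print the command for review (64 MB blocks, 2 replicas as example values).
job_cmd 67108864 2
```

The generic options have to come before the job's own arguments, and a driver that does not go through ToolRunner would treat them as ordinary arguments.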

Regards,
Shahab


On Tue, Sep 10, 2013 at 2:08 PM, Jun Li <jl...@gmail.com> wrote:

> Hi,
>
> I am trying to evaluate the MapReduce with different setting. I wonder
> whether the following two HDFS parameters:
>
> *dfs.block.size
> *dfs.replication
>
> can be set at the time I load the file to the HDFS (that is, it is the
> client side setting)?  or these are the system parameter settings that can
> not be changed from the HDFS client invocation.
>
>
> I am using Hadoop 1.1.2 (the recent stable release), rather than the new
> Hadoop 2.x. By reading the  Cloudera document, I wonder even if such
> parameters can be set per HDFS client, will it be supported only after
> certain Hadoop version?
>
> Thank you!
>
> Jun
>
>
