Posted to common-user@hadoop.apache.org by Jun Young Kim <ju...@gmail.com> on 2011/03/09 12:57:56 UTC
what's the differences between file.blocksize and dfs.blocksize in
a job.xml?
Hi,
I am trying to understand the meaning of file.blocksize and dfs.blocksize.
In hdfs-site.xml, I set:
<property>
<name>dfs.block.size</name>
<value>536870912</value>
<final>true</final>
</property>
In job.xml, I found:
file.blocksize = 67108864
dfs.blocksize = 536870912
DFS browser page:

Name            Type  Size      Replication  Block Size  Modification Time  Permission  Owner  Group
20110309160005  dir                                      2011-03-09 16:51   rwxr-xr-x   test   supergroup
all0307.ep      file  21.53 GB  2            64 MB       2011-03-09 15:58   rw-r--r--   test   supergroup
all0307.svc     file  21.53 GB  2            64 MB       2011-03-09 15:13   rw-r--r--   test   supergroup

Links from the Name column:
20110309160005: <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2F20110309160005&namenodeInfoPort=50070&delegation=null>
all0307.ep: <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2Fall0307.ep&namenodeInfoPort=50070&delegation=null>
all0307.svc: <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2Fall0307.svc&namenodeInfoPort=50070&delegation=null>
The total input size of the job is about 44 GB (all0307.ep + all0307.svc).
In the map phase, the number of splits is 690 (which means each map
task took a single 64 MB block).
I thought the split count would be about 88, because the block size
is 512 MB and the input files total 44 GB.
How can I get the result I want?
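The two split counts above can be checked with a little arithmetic. This is only a sketch under stated assumptions: each file is split independently, one split corresponds to one block, and each file is about 22047 MB (21.53 GB as displayed in the browser, times 1024, rounded up). The variable and function names are made up for illustration.

```shell
#!/bin/sh
# Approximate size of each input file in MB (assumption: 21.53 GB * 1024, rounded up).
FILE_MB=22047
NUM_FILES=2

# Ceiling division: number of blocks (and, by assumption, splits) for one file.
splits_per_file() {
  block_mb=$1
  echo $(( (FILE_MB + block_mb - 1) / block_mb ))
}

SPLITS_64=$(( NUM_FILES * $(splits_per_file 64) ))
SPLITS_512=$(( NUM_FILES * $(splits_per_file 512) ))

echo "64 MB blocks:  $SPLITS_64 splits"
echo "512 MB blocks: $SPLITS_512 splits"
```

With these numbers the script reports 690 splits for 64 MB blocks and 88 for 512 MB blocks, so the observed 690 is consistent with files that are stored in 64 MB blocks.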
thanks.
--
Junyoung Kim (juneng603@gmail.com)
Re: what's the differences between file.blocksize and dfs.blocksize in a job.xml?
Posted by JunYoung Kim <ju...@gmail.com>.
Hi Harsh,
Is there a way to put my file on HDFS with a different block size?
Usually, I copy my files to HDFS like this:
$> hadoop fs -copyFromLocal localFile hdfsFile
Do I need to add another option to my command to re-create the file?
Thanks.
On 2011-03-13, at 5:42 PM, Harsh J wrote:
> Hello,
>
> On Wed, Mar 9, 2011 at 5:27 PM, Jun Young Kim <ju...@gmail.com> wrote:
>> hi,
>> I thought the splits counts should be about 88 because a single block size
>> is 512MB and input file's size are 44GB).
>
> From your browser copy-paste (you could also use `fs -ls`, much more
> readable in mails :), it appears that your file has been created with
> a 64 MiB block size, not 512 MiB. Try re-creating the file with the
> new block size, and you should get what you want.
>
> --
> Harsh J
> www.harshj.com
Re: what's the differences between file.blocksize and dfs.blocksize
in a job.xml?
Posted by Harsh J <qw...@gmail.com>.
Hello,
On Wed, Mar 9, 2011 at 5:27 PM, Jun Young Kim <ju...@gmail.com> wrote:
> hi,
> I thought the splits counts should be about 88 because a single block size
> is 512MB and input file's size are 44GB).
From your browser copy-paste (you could also use `fs -ls`, much more
readable in mails :), it appears that your file has been created with
a 64 MiB block size, not 512 MiB. Try re-creating the file with the
new block size, and you should get what you want.
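
One possible way to re-create the file, as a hedged sketch rather than a confirmed recipe from this thread: the block size is recorded per file when it is written, so the copy itself has to run with the larger value. The sketch assumes that the `hadoop fs` shell accepts a generic `-D property=value` option and that your release honors the `dfs.block.size` name used in the hdfs-site.xml from the original post (newer releases call it `dfs.blocksize`). `localFile` and `hdfsFile` are the placeholders from the command quoted in this thread.

```shell
#!/bin/sh
# 512 MiB expressed in bytes, the same value as in the hdfs-site.xml from the original post.
BLOCK_SIZE=$((512 * 1024 * 1024))

# Dry run: print the copy command instead of executing it.
# Remove the leading "echo" to actually run it against your cluster.
echo hadoop fs -D dfs.block.size="$BLOCK_SIZE" -copyFromLocal localFile hdfsFile
```

After copying, the Block Size column in the DFS browser should show the new value for the re-created file if the setting was picked up.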
--
Harsh J
www.harshj.com