Posted to common-dev@hadoop.apache.org by Mike Andrews <mr...@xoba.com> on 2009/03/14 21:58:56 UTC

large block size problem

Hi there,

I tried "-put" then "-cat" on a 1.6 GB file and it worked fine, but when
I tried the same on a 16.4 GB file ("bigfile.dat") I got the errors below.
The failure happened both times I tried it, each time with a fresh install
of single-node 0.19.1. I also set the block size to 32 GB, with larger
buffer and checksum sizes, in the config (see below as well) -- any
thoughts on what I may be doing wrong, or is it a bug?
---

<configuration>
<property>
  <name>dfs.block.size</name>
  <value>34359738368</value>
  <description>The default block size for new files.</description>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>65536</value>
  <description>The size of buffer for use in sequence files.
  The size of this buffer should probably be a multiple of hardware
  page size (4096 on Intel x86), and it determines how much data is
  buffered during read and write operations.</description>
</property>
<property>
  <name>io.bytes.per.checksum</name>
  <value>4096</value>
  <description>The number of bytes per checksum.  Must not be larger than
  io.file.buffer.size.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>


[mra@sacws hadoop-0.19.1]$ bin/hadoop fs -put /tmp/bigfile.dat /
[mra@sacws hadoop-0.19.1]$ bin/hadoop fs -cat /bigfile.dat | md5sum
09/03/14 15:52:34 WARN hdfs.DFSClient: Exception while reading from
blk_-4992364814640383286_1013 of /bigfile.dat from 127.0.0.1:50010:
java.io.IOException: BlockReader: error in packet header(chunkOffset :
415956992, dataLen : 41284, seqno : 0 (last: -1))
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1186)
	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:190)
	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1060)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1615)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1665)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:53)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
	at org.apache.hadoop.fs.FsShell.printToStdout(FsShell.java:120)
	at org.apache.hadoop.fs.FsShell.access$100(FsShell.java:49)
	at org.apache.hadoop.fs.FsShell$1.process(FsShell.java:351)
	at org.apache.hadoop.fs.FsShell$DelayedExceptionThrowing.globAndProcess(FsShell.java:1872)
	at org.apache.hadoop.fs.FsShell.cat(FsShell.java:345)
	at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1519)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:1735)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:1854)

09/03/14 15:52:34 INFO hdfs.DFSClient: Could not obtain block
blk_-4992364814640383286_1013 from any node:  java.io.IOException: No
live nodes contain current block
09/03/14 15:52:37 WARN hdfs.DFSClient: Exception while reading from
blk_-4992364814640383286_1013 of /bigfile.dat from 127.0.0.1:50010:
java.io.IOException: BlockReader: error in packet header(chunkOffset :
415956992, dataLen : 41284, seqno : 0 (last: -1))
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1186)
	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:190)
	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1060)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1615)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1665)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:53)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
	at org.apache.hadoop.fs.FsShell.printToStdout(FsShell.java:120)
	at org.apache.hadoop.fs.FsShell.access$100(FsShell.java:49)
	at org.apache.hadoop.fs.FsShell$1.process(FsShell.java:351)
	at org.apache.hadoop.fs.FsShell$DelayedExceptionThrowing.globAndProcess(FsShell.java:1872)
	at org.apache.hadoop.fs.FsShell.cat(FsShell.java:345)
	at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1519)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:1735)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:1854)

09/03/14 15:52:37 INFO hdfs.DFSClient: Could not obtain block
blk_-4992364814640383286_1013 from any node:  java.io.IOException: No
live nodes contain current block
09/03/14 15:52:40 WARN hdfs.DFSClient: DFS Read: java.io.IOException:
BlockReader: error in packet header(chunkOffset : 415956992, dataLen :
41284, seqno : 0 (last: -1))
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1186)
	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:190)
	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
	at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1060)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1615)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1665)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:53)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
	at org.apache.hadoop.fs.FsShell.printToStdout(FsShell.java:120)
	at org.apache.hadoop.fs.FsShell.access$100(FsShell.java:49)
	at org.apache.hadoop.fs.FsShell$1.process(FsShell.java:351)
	at org.apache.hadoop.fs.FsShell$DelayedExceptionThrowing.globAndProcess(FsShell.java:1872)
	at org.apache.hadoop.fs.FsShell.cat(FsShell.java:345)
	at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1519)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:1735)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:1854)

cat: BlockReader: error in packet header(chunkOffset : 415956992,
dataLen : 41284, seqno : 0 (last: -1))
ef8033a70b6691c2b99ad1c74583161a  -
[mra@sacws hadoop-0.19.1]$


-- 
permanent contact information at http://mikerandrews.com

Re: large block size problem

Posted by Owen O'Malley <ow...@gmail.com>.
Since it is set per file, you'd need to check at file create time too.
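
Purely as a sketch of the client-side flavour of that check (the class,
the helper and the 2GB threshold below are illustrative, not an actual
patch -- a real fix would sit in the create path itself):

// Sketch only: guard an explicit per-file block size before handing it to
// FileSystem.create(). Class and method names are made up for illustration.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GuardedCreate {
  public static FSDataOutputStream create(FileSystem fs, Path path,
      long blockSize) throws IOException {
    // The 0.19 read path tracks offsets within a block as int, so anything
    // at or above 2^31 bytes per block is currently unsafe.
    if (blockSize >= (1L << 31)) {
      throw new IOException("block size " + blockSize
          + " exceeds the current 2GB-per-block limit");
    }
    return fs.create(path, true, 64 * 1024, (short) 1, blockSize);
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();   // picks up fs.default.name
    FileSystem fs = FileSystem.get(conf);
    // A 1GB per-file block size passes the guard; 32GB would throw above.
    create(fs, new Path("/bigfile.dat"), 1L << 30).close();
  }
}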

-- Owen

On Mar 16, 2009, at 4:29, Steve Loughran <st...@apache.org> wrote:

> Steve Loughran wrote:
>> Owen O'Malley wrote:
>>> I seem to remember someone saying that blocks over 2^31 don't  
>>> work. I don't know if there is a jira already.
>> Looking at the stack trace, int is being used everywhere, which  
>> implies an upper limit of (2^31)-1, for blocks. Easy to fix, though  
>> it may change APIs, and then there is the testing.
>
>
> thinking about this a bit more, a quick early patch would be to  
> print a warning whenever you try to bring up a namenode with a block  
> size >= 2GB ; have the system continue so that people can test and  
> fix the code, but at least it stops end users being surprised.
>
> I spoke with someone from the local university on their High Energy  
> Physics problems last week -their single event files are about 2GB,  
> so that's the only sensible block size to use when scheduling work.  
> He'll be at ApacheCon next week, to make his use cases known.
>
> -steve

Re: large block size problem

Posted by Dhruba Borthakur <dh...@gmail.com>.
I went ahead and created JIRA HADOOP-5552, along with a unit test that
demonstrates this bug and a first version of a patch. I suspect the patch
needs some more work. If somebody wants to extend it to make the unit test
pass, that would be awesome.
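
For anyone who wants to poke at it, the shape of the repro is roughly
this -- not the actual unit test attached to the JIRA, just a sketch:
MiniDFSCluster comes from the hadoop test jar, and the sizes are only
just big enough to push a read past the 2^31 offset inside one block.

// Rough sketch of a reproduction, not the HADOOP-5552 unit test itself.
// One file whose single block is larger than 2^31 bytes is written and
// then read back, which is where the int chunk offsets go wrong.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class LargeBlockRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
    try {
      FileSystem fs = cluster.getFileSystem();
      long blockSize = (1L << 31) + (1L << 20);        // just over 2GB
      Path p = new Path("/bigfile.dat");
      byte[] buf = new byte[1 << 20];                  // 1MB buffer
      FSDataOutputStream out =
          fs.create(p, true, 64 * 1024, (short) 1, blockSize);
      for (long written = 0; written < blockSize; written += buf.length) {
        out.write(buf);                                // fill the single block
      }
      out.close();
      FSDataInputStream in = fs.open(p);
      long read = 0;
      for (int n; (n = in.read(buf)) > 0; ) {
        read += n;                                     // fails past offset 2^31
      }
      in.close();
      System.out.println("read " + read + " bytes");
    } finally {
      cluster.shutdown();
    }
  }
}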

thanks,
dhruba

http://issues.apache.org/jira/browse/HADOOP-5552

On Mon, Mar 16, 2009 at 10:55 AM, Brian Bockelman <bb...@cse.unl.edu>wrote:

>
> On Mar 16, 2009, at 11:03 AM, Owen O'Malley wrote:
>
>> On Mar 16, 2009, at 4:29 AM, Steve Loughran wrote:
>>
>>> I spoke with someone from the local university on their High Energy
>>> Physics problems last week -their single event files are about 2GB, so
>>> that's the only sensible block size to use when scheduling work. He'll be at
>>> ApacheCon next week, to make his use cases known.
>>>
>>
>> I don't follow. Not all files need to be 1 block long. If your files are
>> 2GB, 1GB blocks should be fine and I've personally tested those when I've
>> wanted to have longer maps. (The block size of a dataset is the natural size
>> of the input for each map.)
>>
>>
> Hm ... I work on the same project and I'm not sure I agree with this
> statement.
>
> The problem is that the files contain independent event data from a
> particle detector (about 1 - 2MB / event).  However, the file organization
> is such that it's not possible to split the file at this point (not to
> mention that it takes quite some overhead to startup the process)
>
> Turning the block size way up would mean that any jobs could keep data
> access completely node-local.  OTOH, this probably defeats one of the best
> advantages for using HDFS: block-decomposition mostly solves the "hot spot"
> issue.  Ever seen what happens to a file system when a user submits 1000
> jobs to analyze a single 2GB file?  Without block-decomposition to spread
> the reads over 20 or so servers, with only one block per file, the read
> happens to 1-3 servers.  Big difference.
>
> Brian
>

Re: large block size problem

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On Mar 16, 2009, at 11:03 AM, Owen O'Malley wrote:

> On Mar 16, 2009, at 4:29 AM, Steve Loughran wrote:
>
>> I spoke with someone from the local university on their High Energy  
>> Physics problems last week -their single event files are about 2GB,  
>> so that's the only sensible block size to use when scheduling work.  
>> He'll be at ApacheCon next week, to make his use cases known.
>
> I don't follow. Not all files need to be 1 block long. If your files  
> are 2GB, 1GB blocks should be fine and I've personally tested those  
> when I've wanted to have longer maps. (The block size of a dataset  
> is the natural size of the input for each map.)
>

Hm ... I work on the same project and I'm not sure I agree with this  
statement.

The problem is that the files contain independent event data from a
particle detector (about 1-2 MB per event).  However, the file
organization is such that it's not possible to split the file at this
point (not to mention that it takes quite some overhead to start up the
process).

Turning the block size way up would mean that any job could keep data
access completely node-local.  OTOH, this probably defeats one of the
best advantages of using HDFS: block-decomposition mostly solves the
"hot spot" issue.  Ever seen what happens to a file system when a user
submits 1000 jobs to analyze a single 2GB file?  Without
block-decomposition to spread the reads over 20 or so servers, with only
one block per file the reads hit just 1-3 servers.  Big difference.

Brian 

Re: large block size problem

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Mar 16, 2009 at 9:36 AM, Steve Loughran <st...@apache.org> wrote:

> Owen O'Malley wrote:
>
>> On Mar 16, 2009, at 4:29 AM, Steve Loughran wrote:
>>
>>> I spoke with someone from the local university on their High Energy
>>> Physics problems last week -their single event files are about 2GB, so
>>> that's the only sensible block size to use when scheduling work. He'll be at
>>> ApacheCon next week, to make his use cases known.
>>>
>>
>> I don't follow. Not all files need to be 1 block long. If your files are
>> 2GB, 1GB blocks should be fine and I've personally tested those when I've
>> wanted to have longer maps. (The block size of a dataset is the natural size
>> of the input for each map.)
>>
>
> within a single 2GB event, data access is very random; you'd need all 2GB
> on a single machine and efficient random-access within it. The natural size
> for each map -and hence block- really is 2GB.
>

To me, this suggests that the map function should copy the file to known
local storage in order to get good random access performance.  In fact, it
suggests that the single event should be in RAM.

That makes the block size almost irrelevant, especially if a somewhat
smaller block size allows better average locality.
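
Roughly what I have in mind, strictly as a sketch (the scratch directory
and the event-file path below are made up):

// Sketch of a map-side setup step: pull the whole event file out of HDFS
// onto local disk once, then do all the random-access reads against the
// local copy. Paths are illustrative.
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalizeEventFile {
  public static RandomAccessFile localize(Configuration conf, Path hdfsFile,
      File scratchDir) throws IOException {
    FileSystem fs = hdfsFile.getFileSystem(conf);
    File local = new File(scratchDir, hdfsFile.getName());
    // One sequential copy out of HDFS; every later seek is a local seek.
    fs.copyToLocalFile(hdfsFile, new Path(local.getAbsolutePath()));
    return new RandomAccessFile(local, "r");
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    RandomAccessFile events =
        localize(conf, new Path("/events/run42.dat"), new File("/tmp"));
    events.seek(123456L);              // arbitrary random access, now local
    System.out.println(events.read());
    events.close();
  }
}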

Re: large block size problem

Posted by Steve Loughran <st...@apache.org>.
Owen O'Malley wrote:
> On Mar 16, 2009, at 4:29 AM, Steve Loughran wrote:
> 
>> I spoke with someone from the local university on their High Energy 
>> Physics problems last week -their single event files are about 2GB, so 
>> that's the only sensible block size to use when scheduling work. He'll 
>> be at ApacheCon next week, to make his use cases known.
> 
> I don't follow. Not all files need to be 1 block long. If your files are 
> 2GB, 1GB blocks should be fine and I've personally tested those when 
> I've wanted to have longer maps. (The block size of a dataset is the 
> natural size of the input for each map.)

within a single 2GB event, data access is very random; you'd need all 
2GB on a single machine and efficient random-access within it. The 
natural size for each map -and hence block- really is 2GB.

Re: large block size problem

Posted by Owen O'Malley <om...@apache.org>.
On Mar 16, 2009, at 4:29 AM, Steve Loughran wrote:

> I spoke with someone from the local university on their High Energy  
> Physics problems last week -their single event files are about 2GB,  
> so that's the only sensible block size to use when scheduling work.  
> He'll be at ApacheCon next week, to make his use cases known.

I don't follow. Not all files need to be 1 block long. If your files  
are 2GB, 1GB blocks should be fine and I've personally tested those  
when I've wanted to have longer maps. (The block size of a dataset is  
the natural size of the input for each map.)

-- Owen

Re: large block size problem

Posted by Steve Loughran <st...@apache.org>.
Steve Loughran wrote:
> Owen O'Malley wrote:
>> I seem to remember someone saying that blocks over 2^31 don't work. I 
>> don't know if there is a jira already.
> 
> Looking at the stack trace, int is being used everywhere, which implies 
> an upper limit of (2^31)-1, for blocks. Easy to fix, though it may 
> change APIs, and then there is the testing.
> 


Thinking about this a bit more, a quick early patch would be to print a
warning whenever you try to bring up a namenode with a block size >= 2GB;
have the system continue so that people can test and fix the code, but at
least it stops end users from being surprised.
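
Something along these lines, as a sketch against the configuration rather
than the real namenode startup path (the class is made up; dfs.block.size
and the 64MB default are the existing ones):

// Sketch: warn at startup if the configured default block size is 2GB or
// more, but keep going so people can still experiment with large blocks.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import org.apache.hadoop.conf.Configuration;

public class BlockSizeSanityCheck {
  private static final Log LOG = LogFactory.getLog(BlockSizeSanityCheck.class);

  public static void check(Configuration conf) {
    long blockSize = conf.getLong("dfs.block.size", 64L * 1024 * 1024);
    if (blockSize >= (1L << 31)) {
      LOG.warn("dfs.block.size is " + blockSize + " bytes; block sizes of "
          + "2GB or more are not handled correctly yet and reads may fail");
    }
  }

  public static void main(String[] args) {
    check(new Configuration());
  }
}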

I spoke with someone from the local university on their High Energy 
Physics problems last week -their single event files are about 2GB, so 
that's the only sensible block size to use when scheduling work. He'll 
be at ApacheCon next week, to make his use cases known.

-steve

Re: large block size problem

Posted by Steve Loughran <st...@apache.org>.
Owen O'Malley wrote:
> I seem to remember someone saying that blocks over 2^31 don't work. I 
> don't know if there is a jira already.

Looking at the stack trace, int is being used everywhere, which implies
an upper limit of (2^31)-1 bytes for a block. Easy to fix, though it may
change APIs, and then there is the testing.
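
The overflow is easy to see in isolation -- this is not the DFSClient
code, just the arithmetic it trips over (the 32GB value is the one from
the config at the top of the thread):

public class IntBlockLimit {
  public static void main(String[] args) {
    long blockSize = 34359738368L;            // 32GB, i.e. 2^35
    System.out.println(Integer.MAX_VALUE);    // 2147483647, the int ceiling
    System.out.println((int) blockSize);      // 0 -- the value is truncated
    long offset = (1L << 31) + 415956992L;    // an in-block offset past 2GB
    System.out.println((int) offset);         // -1731526656, i.e. negative
  }
}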

> 
> -- Owen
> 
> On Mar 14, 2009, at 20:28, Raghu Angadi <ra...@yahoo-inc.com> wrote:
> 
>>
>> I haven't looked much into this but most likely this is a bug. I am 
>> pretty sure large block size is not handled correctly.
>>
>> A fix might be pretty straight fwd. I suggest you to file a jira and 
>> preferably give any justification for large block sizes. I don't think 
>> there is any reason to limit the block size.
>>
>> Raghu.
>>
>> Mike Andrews wrote:
>>> [original message with configuration and stack traces snipped]
>>


-- 
Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/

Re: large block size problem

Posted by Owen O'Malley <ow...@gmail.com>.
I seem to remember someone saying that blocks over 2^31 don't work. I  
don't know if there is a jira already.

-- Owen

On Mar 14, 2009, at 20:28, Raghu Angadi <ra...@yahoo-inc.com> wrote:

>
> I haven't looked much into this but most likely this is a bug. I am  
> pretty sure large block size is not handled correctly.
>
> A fix might be pretty straight fwd. I suggest you to file a jira and  
> preferably give any justification for large block sizes. I don't  
> think there is any reason to limit the block size.
>
> Raghu.
>
> Mike Andrews wrote:
>> [original message with configuration and stack traces snipped]
>

Re: large block size problem

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
I haven't looked much into this but most likely this is a bug. I am 
pretty sure large block size is not handled correctly.

A fix might be pretty straightforward. I suggest you file a jira and,
preferably, give your justification for large block sizes. I don't think
there is any reason to limit the block size.

Raghu.

Mike Andrews wrote:
> [original message with configuration and stack traces snipped]
>