Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/08/09 06:06:50 UTC
[jira] [Updated] (HBASE-8615) HLog Compression may fail due to Hadoop fs input stream returning partial bytes
[ https://issues.apache.org/jira/browse/HBASE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-8615:
--------------------------
Summary: HLog Compression may fail due to Hadoop fs input stream returning partial bytes (was: HLog Compression fails in mysterious ways (working title))
> HLog Compression may fail due to Hadoop fs input stream returning partial bytes
> -------------------------------------------------------------------------------
>
> Key: HBASE-8615
> URL: https://issues.apache.org/jira/browse/HBASE-8615
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.98.0, 0.96.0
>
> Attachments: 172.21.3.117%2C60020%2C1375222888304.1375222894855.zip, 8615-v2.txt, 8615-v3.txt, 8615-v4.txt, 8615-v5.txt, HBASE-8615-test.patch, org.apache.hadoop.hbase.replication.TestReplicationQueueFailoverCompressed-output.txt
>
>
> In a recent test run, I noticed the following in test output:
> {code}
> 2013-05-24 22:01:02,424 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] fs.HFileSystem$ReorderWALBlocks(327): /user/hortonzy/hbase/.logs/kiyo.gq1.ygridcore.net,42690,1369432806911/kiyo.gq1.ygridcore.net%2C42690%2C1369432806911.1369432840428 is an HLog file, so reordering blocks, last hostname will be:kiyo.gq1.ygridcore.net
> 2013-05-24 22:01:02,429 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] wal.ProtobufLogReader(118): After reading the trailer: walEditsStopOffset: 132235, fileLength: 132243, trailerPresent: true
> 2013-05-24 22:01:02,438 ERROR [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] wal.ProtobufLogReader(236): Error while reading 691 WAL KVs; started reading at 53272 and read up to 65538
> 2013-05-24 22:01:02,438 WARN [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] regionserver.ReplicationSource(324): 2 Got:
> java.io.IOException: Error while reading 691 WAL KVs; started reading at 53272 and read up to 65538
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:237)
> at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:96)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:404)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:320)
> Caused by: java.lang.IndexOutOfBoundsException: index (30062) must be less than size (1)
> at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
> at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
> at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary$BidirectionalLRUMap.get(LRUDictionary.java:124)
> at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary$BidirectionalLRUMap.access$000(LRUDictionary.java:71)
> at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary.getEntry(LRUDictionary.java:42)
> at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.readIntoArray(WALCellCodec.java:210)
> at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.parseCell(WALCellCodec.java:184)
> at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:46)
> at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFromCells(WALEdit.java:213)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:217)
> ... 4 more
> 2013-05-24 22:01:02,439 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] regionserver.ReplicationSource(583): Nothing to replicate, sleeping 100 times 10
> {code}
> Will attach test output.
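
The failure mode named in the summary, an input stream handing back fewer bytes than were requested, can be illustrated with a minimal sketch. This is not the patch attached to this issue: the class PartialReadSketch, the TrickleStream helper, and the readFully method below are hypothetical names introduced only for illustration, and the one-byte-per-call behavior is an assumed worst case rather than what any particular Hadoop stream does. The sketch only shows why a decoder that calls read() once and assumes the buffer is full can end up parsing garbage, such as an out-of-range dictionary index.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class PartialReadSketch {

    /** Hypothetical stream that returns at most one byte per bulk read call. */
    static class TrickleStream extends ByteArrayInputStream {
        TrickleStream(byte[] data) {
            super(data);
        }

        @Override
        public synchronized int read(byte[] b, int off, int len) {
            // Legal per the InputStream contract: return fewer bytes than requested.
            return super.read(b, off, Math.min(len, 1));
        }
    }

    /** Keeps calling read() until len bytes have arrived or the stream ends. */
    static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                throw new EOFException("stream ended after " + total + " of " + len + " bytes");
            }
            total += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50};

        // A single read() call fills only part of the buffer.
        byte[] naive = new byte[data.length];
        int got = new TrickleStream(data).read(naive, 0, naive.length);
        System.out.println("single read returned " + got + " of " + data.length + " bytes");

        // Looping until the requested length is reached recovers all bytes.
        byte[] full = new byte[data.length];
        readFully(new TrickleStream(data), full, 0, full.length);
        System.out.println("readFully matches source: " + Arrays.equals(data, full));
    }
}
```

In real code the loop would normally not be hand-written: java.io.DataInputStream.readFully and org.apache.hadoop.io.IOUtils.readFully provide the same guarantee.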