Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2014/03/14 20:16:46 UTC

[jira] [Updated] (HBASE-10751) TestHRegion testWritesWhileScanning occasional fail since HBASE-10514 went in

     [ https://issues.apache.org/jira/browse/HBASE-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-10751:
--------------------------

    Attachment: 10751.txt

So, the stack trace above is a bit of a red herring.  It happens because we interrupt the test's background thread on our way out.  The interrupt causes a DroppedSnapshotException (DSE) to be thrown, which we ignore because it happens when we are 'done'.  Because we do not 'exit' on this DSE, the memory accounting is all off and we end up in a strange state: unable to flush successfully, yet the memory accounting says there is stuff to flush (because we did not react to the original DSE).
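
For anyone unfamiliar with the mechanism, here is a minimal sketch of how an interrupt turns into the ClosedByInterruptException seen in the quoted trace: when a thread's interrupt status is set and it attempts a blocking operation on an NIO channel, the JDK closes the channel and throws.  The class name InterruptDemo, the helper readWhileInterrupted, and the temp file are made up for illustration; this is plain JDK code, not the HBase read path.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptDemo {

    /**
     * Reads from a file channel while the calling thread's interrupt flag is set.
     * Returns true if the read failed with ClosedByInterruptException and the
     * channel was left closed, which is the behaviour the JDK documents for
     * interruptible channels.
     */
    static boolean readWhileInterrupted() throws IOException {
        Path tmp = Files.createTempFile("interrupt-demo", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            // Simulate the test interrupting its background thread.
            Thread.currentThread().interrupt();
            try {
                ch.read(ByteBuffer.allocate(4));
                return false; // not expected: the read went through
            } catch (ClosedByInterruptException e) {
                return !ch.isOpen();
            } finally {
                // Clear the flag so the cleanup below is not affected.
                Thread.interrupted();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("channel closed by interrupt: " + readWhileInterrupted());
    }
}
```

In the failing run the same thing happens underneath the HFile trailer read, so the "corrupt" file report is a side effect of the interrupt rather than real corruption.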

Let me apply this small patch so that we just ignore the second DSE that happens on the way out (the reason this test failed).
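
To illustrate the idea only (this is not the contents of 10751.txt), here is a rough sketch of a flush loop that treats a dropped-snapshot failure as benign once the test has signalled that it is finished, and reports it otherwise.  Everything here is hypothetical: FlushLoopSketch, the Flusher interface, runFlushes, and DroppedSnapshotStandIn, which stands in for HBase's DroppedSnapshotException so the example compiles without HBase on the classpath.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

public class FlushLoopSketch {

    /** Stand-in for HBase's DroppedSnapshotException (assumption, for illustration). */
    public static class DroppedSnapshotStandIn extends IOException {
        public DroppedSnapshotStandIn(String msg) {
            super(msg);
        }
    }

    /** Hypothetical flush action; a real test would flush the region here. */
    public interface Flusher {
        void flush() throws IOException;
    }

    /**
     * Calls flusher.flush() until 'done' is set or maxIterations is reached.
     * Returns null on a clean exit, or the exception that should fail the test.
     */
    public static Throwable runFlushes(Flusher flusher, AtomicBoolean done, int maxIterations) {
        for (int i = 0; i < maxIterations && !done.get(); i++) {
            try {
                flusher.flush();
            } catch (DroppedSnapshotStandIn e) {
                if (done.get()) {
                    // We were interrupted on the way out; the failure is expected.
                    return null;
                }
                return e; // a dropped snapshot mid-run is a real failure
            } catch (IOException e) {
                return e;
            }
        }
        return null;
    }
}
```

The point of checking the flag inside the catch block is that a failure is only excused when shutdown is already under way; the same exception during the run still fails the test.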

> TestHRegion testWritesWhileScanning occasional fail since HBASE-10514 went in
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-10751
>                 URL: https://issues.apache.org/jira/browse/HBASE-10751
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>         Attachments: 10751.txt
>
>
> I saw this here https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/213/testReport/junit/org.apache.hadoop.hbase.regionserver/TestHRegion/testWritesWhileScanning/
> The HBASE-10514 patch looks to have exposed a problem in our HStore commit logic.  We are supposed to crash out if we fail to write, but we keep going here.  I am having trouble figuring out why.  Let me write a little test:
> {code}
> 2014-03-14 01:58:48,647 DEBUG [Thread-3] regionserver.HRegionFileSystem(339): Committing store file /home/jenkins/jenkins-slave/workspace/HBase-0.98-on-Hadoop-1.1/0.98-hadoop1.1/hbase-server/target/test-data/f7999012-e166-4619-ab3c-5014e0f65007/data/default/testWritesWhileScanning/306ea000673d780f06daf2469e7f9bab/.tmp/a0e6579af25f463ebb7eebe3c043b8a0 as /home/jenkins/jenkins-slave/workspace/HBase-0.98-on-Hadoop-1.1/0.98-hadoop1.1/hbase-server/target/test-data/f7999012-e166-4619-ab3c-5014e0f65007/data/default/testWritesWhileScanning/306ea000673d780f06daf2469e7f9bab/family7/a0e6579af25f463ebb7eebe3c043b8a0
> 2014-03-14 01:58:48,647 INFO  [Thread-2] regionserver.HRegion(5779): writing data to region testWritesWhileScanning,,1394762315120.306ea000673d780f06daf2469e7f9bab. with WAL disabled. Data may be lost in the event of a crash.
> 2014-03-14 01:58:48,648 ERROR [Thread-3] regionserver.HStore$StoreFlusherImpl(1964): Failed to commit store file /home/jenkins/jenkins-slave/workspace/HBase-0.98-on-Hadoop-1.1/0.98-hadoop1.1/hbase-server/target/test-data/f7999012-e166-4619-ab3c-5014e0f65007/data/default/testWritesWhileScanning/306ea000673d780f06daf2469e7f9bab/.tmp/a0e6579af25f463ebb7eebe3c043b8a0
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file file:/home/jenkins/jenkins-slave/workspace/HBase-0.98-on-Hadoop-1.1/0.98-hadoop1.1/hbase-server/target/test-data/f7999012-e166-4619-ab3c-5014e0f65007/data/default/testWritesWhileScanning/306ea000673d780f06daf2469e7f9bab/family7/a0e6579af25f463ebb7eebe3c043b8a0
> 	at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:552)
> 	at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:580)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1019)
> 	at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:211)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:350)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:445)
> 	at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:551)
> 	at org.apache.hadoop.hbase.regionserver.HStore.commitFile(HStore.java:842)
> 	at org.apache.hadoop.hbase.regionserver.HStore.access$200(HStore.java:118)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:1961)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1706)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1583)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1498)
> 	at org.apache.hadoop.hbase.regionserver.TestHRegion$FlushThread.run(TestHRegion.java:3034)
> Caused by: java.nio.channels.ClosedByInterruptException
> 	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
> 	at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:282)
> 	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.seek(RawLocalFileSystem.java:111)
> 	at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:78)
> 	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:206)
> 	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
> 	at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
> 	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
> 	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
> 	at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:384)
> 	at org.apache.hadoop.fs.FSInputChecker.seek(FSInputChecker.java:365)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.seek(ChecksumFileSystem.java:271)
> 	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
> 	at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:389)
> 	at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:537)
> 	... 13 more
> 2014-03-14 01:58:48,657 DEBUG [pool-1-thread-1] regionserver.HRegion(1037): Closing testWritesWhileScanning,,1394762315120.306ea000673d780f06daf2469e7f9bab.: disabling compactions & flushes
> 2014-03-14 01:58:48,657 INFO  [pool-1-thread-1] regionserver.HRegion(1045): Running close preflush of testWritesWhileScanning,,1394762315120.306ea000673d780f06daf2469e7f9bab.
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)