Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/10/12 07:46:50 UTC
[jira] Commented: (HADOOP-2040) [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test has finished
[ https://issues.apache.org/jira/browse/HADOOP-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534225 ]
stack commented on HADOOP-2040:
-------------------------------
Looks like it hung again in the same build -- #931 -- but this time in a test that hasn't been prone to hanging, TestListTables. Again I can't get a thread dump, but the log is interesting on the way out:
{code}
[junit] Shutting down the Mini HDFS Cluster
[junit] Shutting down DataNode 1
[junit] 2007-10-12 05:23:16,082 WARN [DataNode: [/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4]] org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:596): java.io.IOException: java.lang.InterruptedException
[junit] Shutting down DataNode 0
[junit] at org.apache.hadoop.fs.ShellCommand.runCommand(ShellCommand.java:59)
[junit] at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
[junit] at org.apache.hadoop.fs.DU.getUsed(DU.java:52)
[junit] at org.apache.hadoop.dfs.FSDataset$FSVolume.getDfsUsed(FSDataset.java:299)
[junit] at org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:396)
[junit] at org.apache.hadoop.dfs.FSDataset.getDfsUsed(FSDataset.java:495)
[junit] at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:520)
[junit] at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1494)
[junit] at java.lang.Thread.run(Thread.java:595)
[junit] 2007-10-12 05:23:16,349 WARN [DataNode: [/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2]] org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:596): java.io.InterruptedIOException
[junit] at java.net.SocketOutputStream.socketWrite0(Native Method)
[junit] at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
[junit] at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
[junit] at org.apache.hadoop.ipc.Client$Connection$2.write(Client.java:192)
[junit] at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
[junit] at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
[junit] at java.io.DataOutputStream.flush(DataOutputStream.java:106)
[junit] at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:327)
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:474)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
[junit] at org.apache.hadoop.dfs.$Proxy1.sendHeartbeat(Unknown Source)
[junit] at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:520)
[junit] at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1494)
[junit] at java.lang.Thread.run(Thread.java:595)
[junit] 2007-10-12 05:23:16,351 WARN [org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor@157c2bd] org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186): PendingReplicationMonitor thread received exception. java.lang.InterruptedException: sleep interrupted
[junit] 2007-10-12 05:23:16,610 INFO [main] org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:424): Shutting down FileSystem
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 36.108 sec
{code}
It reports the tests succeeded, but just beforehand it's reporting an interrupted flush. I wonder if the interrupt broke the flush. It would be interesting to know (for HADOOP-1924).
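Since an external thread dump could not be obtained, one possible workaround is to have the JVM report its own threads after the mini cluster has shut down. The following is only a sketch under an unconfirmed assumption, namely that the hang is caused by a non-daemon thread that survives shutdown. The class LingeringThreads and its method are hypothetical names, not part of Hadoop or HBase; the only library call relied on is the standard Thread.getAllStackTraces().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical diagnostic helper (not part of Hadoop or HBase).
 * Collects stack traces of live non-daemon threads, which are the
 * threads that can keep a JVM from exiting after tests have finished.
 */
class LingeringThreads {

  /** Returns one formatted dump per live non-daemon thread, excluding the caller. */
  static List<String> dumpNonDaemonThreads() {
    List<String> dumps = new ArrayList<String>();
    for (Map.Entry<Thread, StackTraceElement[]> entry
        : Thread.getAllStackTraces().entrySet()) {
      Thread t = entry.getKey();
      // Daemon threads do not block JVM exit; the calling thread is not of interest.
      if (t.isDaemon() || t == Thread.currentThread()) {
        continue;
      }
      StringBuilder sb = new StringBuilder();
      sb.append('"').append(t.getName()).append('"')
        .append(" state=").append(t.getState()).append('\n');
      for (StackTraceElement frame : entry.getValue()) {
        sb.append("    at ").append(frame).append('\n');
      }
      dumps.add(sb.toString());
    }
    return dumps;
  }

  public static void main(String[] args) {
    // Print whatever non-daemon threads are still alive at this point.
    for (String dump : dumpNonDaemonThreads()) {
      System.err.print(dump);
    }
  }
}
```

Calling dumpNonDaemonThreads() at the end of a test's teardown, after the cluster shutdown, would show whether anything such as a DataNode or IPC thread is still alive and where it is blocked. If the list comes back empty, the lingering-thread assumption is probably wrong and the hang lies elsewhere.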
> [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test has finished
> ------------------------------------------------------------------------------------------
>
> Key: HADOOP-2040
> URL: https://issues.apache.org/jira/browse/HADOOP-2040
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Reporter: stack
> Priority: Minor
>
> Weird. Last night TestBloomFilter was hung after junit had printed test had completed without error. Just now, I noticed a hung TestHStore -- again after junit had printed out test had succeeded (Nigel Daley has reported he's seen at least two hangs in TestHStoreFile, perhaps in same location).
> Last night and just now I was unable to get a thread dump.
> Here is the log from around this evening's hang:
> {code}
> ...
> [junit] 2007-10-12 04:19:28,477 INFO [main] org.apache.hadoop.hbase.TestHStoreFile.testOutOfRangeMidkeyHalfMapFile(TestHStoreFile.java:366): Last bottom when key > top: zz/zz/1192162768317
> [junit] 2007-10-12 04:19:28,493 WARN [IPC Server handler 0 on 36620] org.apache.hadoop.dfs.FSDirectory.unprotectedDelete(FSDirectory.java:400): DIR* FSDirectory.unprotectedDelete: failed to remove /testOutOfRangeMidkeyHalfMapFile because it does not exist
> [junit] Shutting down the Mini HDFS Cluster
> [junit] Shutting down DataNode 1
> [junit] Shutting down DataNode 0
> [junit] 2007-10-12 04:19:29,316 WARN [org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor@ed9f47] org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186): PendingReplicationMonitor thread received exception. java.lang.InterruptedException: sleep interrupted
> [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec
> [junit] Running org.apache.hadoop.hbase.TestHTable
> [junit] Starting DataNode 0 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
> [junit] Starting DataNode 1 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
> [junit] 2007-10-12 05:21:48,332 INFO [main] org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:862): Root region dir: /hbase/hregion_-ROOT-,,0
> ...
> {code}
> Notice the hour of elapsed (hung) time in the above.