Posted to issues@hbase.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2010/09/30 23:34:37 UTC
[jira] Created: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
TestReadWriteConsistencyControl occasionally hangs
--------------------------------------------------
Key: HBASE-3059
URL: https://issues.apache.org/jira/browse/HBASE-3059
Project: HBase
Issue Type: Bug
Reporter: Hairong Kuang
Assignee: Hairong Kuang
The test hung when I ran mvn test today. The jstack shows that a Writer thread hung at
"Thread-1" prio=10 tid=0x00002aaad81d2800 nid=0x6ce9 in Object.wait() [0x0000000040f37000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.completeMemstoreInsert(ReadWriteConsistencyControl.java:130)
        - locked <0x00002aaac9fa0f50> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl$Writer.run(TestReadWriteConsistencyControl.java:56)
        at java.lang.Thread.run(Thread.java:619)
It seems to be caused by a race condition in ReadWriteConsistencyControl#completeMemstoreInsert. Accesses/updates of the value of memstoreRead should be done while holding the readWaiters lock.
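A minimal sketch of the rule stated above, assuming a simplified stand-in class rather than the HBase code or the attached patch. GuardedReadPoint, advanceTo, awaitReadPoint and allWaitersFinish are hypothetical names chosen for illustration. The read point is read and written only while holding readWaiters, and the waiter re-checks its condition in a loop:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the read point; not the HBase class itself.
class GuardedReadPoint {
  private final Object readWaiters = new Object();
  private long memstoreRead = 0; // guarded by readWaiters

  // Publish a new read point and wake every waiter, all under the lock.
  void advanceTo(long writeNumber) {
    synchronized (readWaiters) {
      if (writeNumber > memstoreRead) {
        memstoreRead = writeNumber;
      }
      readWaiters.notifyAll();
    }
  }

  // The check and the wait happen under the same lock, so no update
  // can slip in between them.
  void awaitReadPoint(long writeNumber) throws InterruptedException {
    synchronized (readWaiters) {
      while (memstoreRead < writeNumber) {
        readWaiters.wait();
      }
    }
  }

  // Starts one waiter per write number, advances the read point, and
  // reports whether every waiter returned within five seconds.
  static boolean allWaitersFinish(int waiters) {
    GuardedReadPoint readPoint = new GuardedReadPoint();
    CountDownLatch done = new CountDownLatch(waiters);
    for (int i = 1; i <= waiters; i++) {
      final long writeNumber = i;
      Thread t = new Thread(() -> {
        try {
          readPoint.awaitReadPoint(writeNumber);
          done.countDown();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      t.setDaemon(true); // do not keep the JVM alive if a waiter is stuck
      t.start();
    }
    for (long n = 1; n <= waiters; n++) {
      readPoint.advanceTo(n);
    }
    try {
      return done.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("all waiters finished: " + allWaitersFinish(8));
  }
}
```

Because a waiter holds readWaiters from its check until wait() releases the monitor, an update made by another thread happens either before the check or after the waiter is already waiting, so the notification is not lost.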
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916693#action_12916693 ]
ryan rawson commented on HBASE-3059:
------------------------------------
A few questions:
- What JVM are you using? You must be using a 64-bit JVM to run HBase. 32-bit JVMs don't offer atomic updates to longs, which is required for this code.
- How and why does this fix the hang? The variable is volatile, so adding a synchronized block should not improve the characteristics according to the JMM.
- How much does this slow down the code? readWaiters is a fairly sensitive lock and holding it for less time would be better.
[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916721#action_12916721 ]
Hairong Kuang commented on HBASE-3059:
--------------------------------------
I ran with a 64-bit JVM.
But this bug cannot be fixed by atomic updates. The problem is that after a writer W1 has checked memstoreRead < e.getWriterNumber() and found it to be true, another writer W2 can set memstoreRead = nextReadValue and then call notifyAll(). W1 then calls readWaiters.wait(0). Since it missed the signal from W2, W1 waits there forever.
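The interleaving described above can be replayed deterministically on a single thread. This is an illustrative sketch based only on the description in this comment, not on the HBase source. MissedSignalReplay and its method names are hypothetical, and a finite timeout stands in for wait(0) so that the replay terminates:

```java
// Hypothetical replay of the W1/W2 interleaving; not the HBase class itself.
class MissedSignalReplay {
  private final Object readWaiters = new Object();
  private volatile long memstoreRead = 0;

  // Returns true if W1 goes on to wait although the read point already
  // satisfies it, which is the missed-signal situation.
  boolean replay(long w1WriteNumber, long timeoutMillis) throws InterruptedException {
    if (timeoutMillis <= 0) {
      throw new IllegalArgumentException("timeout must be positive; 0 would wait forever");
    }

    // Step 1 (W1): check the condition without holding readWaiters.
    boolean w1DecidedToWait = memstoreRead < w1WriteNumber;

    // Step 2 (W2): advance the read point and notify. Nobody is waiting
    // yet, so the notification wakes no one and is not remembered.
    synchronized (readWaiters) {
      memstoreRead = w1WriteNumber;
      readWaiters.notifyAll();
    }

    // Step 3 (W1): act on the stale decision from step 1.
    boolean conditionAlreadyMet = memstoreRead >= w1WriteNumber;
    if (w1DecidedToWait) {
      synchronized (readWaiters) {
        // With wait(0) and no further writers, this would never return.
        readWaiters.wait(timeoutMillis);
      }
    }
    return w1DecidedToWait && conditionAlreadyMet;
  }

  // Convenience wrapper that hides the checked exception.
  static boolean signalIsMissed() {
    try {
      return new MissedSignalReplay().replay(1, 50);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("W1 waited on an already-satisfied condition: " + signalIsMissed());
  }
}
```

The field is volatile in this sketch, which shows that visibility is not the issue: W1 would see the new value if it looked again, but it has already decided to wait and no later notifyAll() arrives.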
[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HBASE-3059:
---------------------------------
Description:
The test hung when I ran mvn test today. The jstack shows that a Writer thread hung at
"Thread-1" prio=10 tid=0x00002aaad81d2800 nid=0x6ce9 in Object.wait() [0x0000000040f37000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.completeMemstoreInsert(ReadWriteConsistencyControl.java:130)
        -- locked <0x00002aaac9fa0f50> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl$Writer.run(TestReadWriteConsistencyControl.java:56)
        at java.lang.Thread.run(Thread.java:619)
It seems to be caused by a race condition in ReadWriteConsistencyControl#completeMemstoreInsert. Accesses/updates of the value of memstoreRead should be done while holding the readWaiters lock.
was:
The test hung when I ran mvn test today. The jstack shows that a Writer thread hung at
"Thread-1" prio=10 tid=0x00002aaad81d2800 nid=0x6ce9 in Object.wait() [0x0000000040f37000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.completeMemstoreInsert(ReadWriteConsistencyControl.java:130)
        - locked <0x00002aaac9fa0f50> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl$Writer.run(TestReadWriteConsistencyControl.java:56)
        at java.lang.Thread.run(Thread.java:619)
It seems to be caused by a race condition in ReadWriteConsistencyControl#completeMemstoreInsert. Accesses/updates of the value of memstoreRead should be done while holding the readWaiters lock.
[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3059:
-------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ryan rawson updated HBASE-3059:
-------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
committed, thanks for that.
Strange that we did not notice it sooner...
[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916691#action_12916691 ]
ryan rawson commented on HBASE-3059:
------------------------------------
what jvm version were you using?
[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917776#action_12917776 ]
ryan rawson commented on HBASE-3059:
------------------------------------
thanks, the patch looks good, i'll commit it
[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HBASE-3059:
---------------------------------
Attachment: hbase_trunk_consistency.patch
[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917769#action_12917769 ]
Kannan Muthukkaruppan commented on HBASE-3059:
----------------------------------------------
+1 on the patch.
[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs
Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ryan rawson updated HBASE-3059:
-------------------------------
Fix Version/s: 0.90.0