Posted to issues@hbase.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2010/09/30 23:34:37 UTC

[jira] Created: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

TestReadWriteConsistencyControl occasionally hangs
--------------------------------------------------

                 Key: HBASE-3059
                 URL: https://issues.apache.org/jira/browse/HBASE-3059
             Project: HBase
          Issue Type: Bug
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang


The test hung when I ran mvn test today. The jstack shows that a Writer thread hung at

"Thread-1" prio=10 tid=0x00002aaad81d2800 nid=0x6ce9 in Object.wait() [0x0000000040f37000]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.completeMemstoreInsert(ReadWriteConsistencyControl.java:130)
  - locked <0x00002aaac9fa0f50> (a java.lang.Object)
  at org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl$Writer.run(TestReadWriteConsistencyControl.java:56)
  at java.lang.Thread.run(Thread.java:619)

It seems to be caused by a race condition in ReadWriteConsistencyControl#completeMemstoreInsert. Reads and updates of memstoreRead should be done while holding the readWaiters lock.
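
A minimal sketch of that suggestion, under stated assumptions; it is not the attached patch. The class GuardedReadPoint and the methods beginInsert, completeInsert, readPoint and runWriters are hypothetical, and the write queue is simplified to a set of completed write numbers. Only the names memstoreRead and readWaiters come from this issue. Every read and update of memstoreRead happens inside synchronized (readWaiters), and the wait sits in a loop that re-checks the condition.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for ReadWriteConsistencyControl. */
final class GuardedReadPoint {
    private final Object readWaiters = new Object();
    private long memstoreRead = 0;                       // guarded by readWaiters
    private long nextWrite = 0;                          // guarded by readWaiters
    private final Set<Long> completed = new HashSet<>(); // guarded by readWaiters

    /** Hands out the next write number. */
    long beginInsert() {
        synchronized (readWaiters) {
            return ++nextWrite;
        }
    }

    /** Marks a write complete, then blocks until the read point covers it. */
    void completeInsert(long writeNumber) {
        boolean interrupted = false;
        synchronized (readWaiters) {
            completed.add(writeNumber);
            // Advance the read point over every contiguous completed write.
            boolean advanced = false;
            while (completed.remove(memstoreRead + 1)) {
                memstoreRead++;
                advanced = true;
            }
            if (advanced) {
                readWaiters.notifyAll();
            }
            // The check and the wait share one lock, so no other writer can
            // advance memstoreRead and notify in between them.
            while (memstoreRead < writeNumber) {
                try {
                    readWaiters.wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag later
                }
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    long readPoint() {
        synchronized (readWaiters) {
            return memstoreRead;
        }
    }

    /** Runs concurrent writers; true if none hung and the read point is final. */
    static boolean runWriters(int writers, long joinTimeoutMs) {
        GuardedReadPoint rp = new GuardedReadPoint();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < writers; i++) {
            Thread t = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    rp.completeInsert(rp.beginInsert());
                }
            });
            t.setDaemon(true);
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(joinTimeoutMs);
                if (t.isAlive()) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return rp.readPoint() == writers * 1000L;
    }

    public static void main(String[] args) {
        System.out.println("all writers finished: " + runWriters(4, 10000));
    }
}
```

The cost of this shape is that readWaiters is held for the whole check-and-advance step, so it trades some contention on that lock for the guarantee that a notification cannot slip past a waiter.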

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916693#action_12916693 ] 

ryan rawson commented on HBASE-3059:
------------------------------------

A few questions:

- What JVM are you using?  You must use a 64-bit JVM to run HBase.  32-bit JVMs don't offer atomic updates to longs, which this code requires.
- How and why does this fix the hang?  The variable is volatile, so adding a synchronized block should not improve the characteristics according to the JMM.
- How much does this slow down the code?  readWaiters is a fairly sensitive lock, and holding it for less time would be better.




[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916721#action_12916721 ] 

Hairong Kuang commented on HBASE-3059:
--------------------------------------

I ran with a 64-bit JVM.

But this bug cannot be fixed by an atomic update. The problem is that after a writer W1 has evaluated memstoreRead < e.getWriterNumber() as true, another writer W2 can set memstoreRead = nextReadValue and then call notifyAll(). W1 then calls readWaiters.wait(0). Because it missed the signal from W2, W1 waits there forever.
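
The interleaving can be replayed deterministically. The sketch below is hypothetical and is not HBase code: LostWakeupReplay, replay and readPoint are made-up names, and W1's wait uses a finite timeout so the replay terminates, whereas the real wait(0) has no timeout. W2's update and notifyAll() are forced into the gap between W1's unlocked check and W1's wait.

```java
/** Hypothetical replay of the W1/W2 interleaving; not HBase code. */
final class LostWakeupReplay {
    private static final Object readWaiters = new Object();
    private static volatile long memstoreRead = 0;

    /**
     * Plays W1 on the calling thread and returns how long W1 sat in wait(),
     * in milliseconds. Pass a timeout greater than 0; a timeout of 0 means
     * "wait forever", which is exactly the reported hang.
     */
    static long replay(long timeoutMs) {
        memstoreRead = 0;
        final long writeNumber = 1;
        long waitedNanos = 0;
        try {
            if (memstoreRead < writeNumber) {          // W1: check, no lock held
                Thread w2 = new Thread(() -> {
                    memstoreRead = writeNumber;        // W2: advance the read point
                    synchronized (readWaiters) {
                        readWaiters.notifyAll();       // W2: nobody is waiting yet
                    }
                });
                w2.start();
                w2.join();                             // W2 is done before W1 waits
                synchronized (readWaiters) {
                    long start = System.nanoTime();
                    // W1: the condition already holds, but the signal is gone.
                    // Barring a spurious wakeup, this runs to the timeout.
                    readWaiters.wait(timeoutMs);
                    waitedNanos = System.nanoTime() - start;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return waitedNanos / 1_000_000;
    }

    static long readPoint() {
        return memstoreRead;
    }

    public static void main(String[] args) {
        long waited = replay(200);
        System.out.println("read point = " + readPoint() + ", W1 waited ~" + waited + " ms");
    }
}
```

Making memstoreRead volatile, or updating it atomically, changes nothing here: W1 sees the new value if it looks again, but it never looks again, because the check and the wait are not covered by the same lock.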



[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HBASE-3059:
---------------------------------

    Description: 
The test hung when I ran mvn test today. The jstack shows that a Writer thread hung at

"Thread-1" prio=10 tid=0x00002aaad81d2800 nid=0x6ce9 in Object.wait() [0x0000000040f37000]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.completeMemstoreInsert(ReadWriteConsistencyControl.java:130)
  -- locked <0x00002aaac9fa0f50> (a java.lang.Object)
  at org.apache.hadoop.hbase.regionserver.TestReadWriteConsistencyControl$Writer.run(TestReadWriteConsistencyControl.java:56)
  at java.lang.Thread.run(Thread.java:619)

It seems to be caused by a race condition in ReadWriteConsistencyControl#completeMemstoreInsert. Reads and updates of memstoreRead should be done while holding the readWaiters lock.



[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3059:
-------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3059:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed, thanks for that.

Strange that we did not notice it sooner...



[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916691#action_12916691 ] 

ryan rawson commented on HBASE-3059:
------------------------------------

What JVM version were you using?



[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917776#action_12917776 ] 

ryan rawson commented on HBASE-3059:
------------------------------------

Thanks, the patch looks good. I'll commit it.



[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HBASE-3059:
---------------------------------

    Attachment: hbase_trunk_consistency.patch



[jira] Commented: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917769#action_12917769 ] 

Kannan Muthukkaruppan commented on HBASE-3059:
----------------------------------------------

+1 on the patch.



[jira] Updated: (HBASE-3059) TestReadWriteConsistencyControl occasionally hangs

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3059:
-------------------------------

    Fix Version/s: 0.90.0
