Posted to issues@hbase.apache.org by "Adam Warrington (JIRA)" <ji...@apache.org> on 2011/07/07 20:25:16 UTC

[jira] [Created] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Deadlock if WrongRegionException is thrown from getLock in HRegion.delete
-------------------------------------------------------------------------

                 Key: HBASE-4077
                 URL: https://issues.apache.org/jira/browse/HBASE-4077
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.90.3
            Reporter: Adam Warrington
            Priority: Critical


In the HRegion.delete function, if getLock throws a WrongRegionException, no lock id is ever returned, yet the finally block still tries to release the row lock using that lock id (which is null). This causes an NPE in the finally clause, so closeRegionOperation() never executes and a read lock stays held forever.

ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: 
java.lang.NullPointerException 
at org.apache.hadoop.hbase.util.Bytes.compareTo(Bytes.java:840) 
at org.apache.hadoop.hbase.util.Bytes$ByteArrayComparator.compare(Bytes.java:108) 
at org.apache.hadoop.hbase.util.Bytes$ByteArrayComparator.compare(Bytes.java:100) 
at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351) 
at java.util.TreeMap.getEntry(TreeMap.java:322) 
at java.util.TreeMap.remove(TreeMap.java:580) 
at java.util.TreeSet.remove(TreeSet.java:259) 
at org.apache.hadoop.hbase.regionserver.HRegion.releaseRowLock(HRegion.java:2145) 
at org.apache.hadoop.hbase.regionserver.HRegion.delete(HRegion.java:1174) 
at org.apache.hadoop.hbase.regionserver.HRegionServer.delete(HRegionServer.java:1914) 
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) 
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)

When the region later attempts to close, the write lock can never be acquired, and the region remains in transition forever.
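
The failure mode described above can be reproduced in isolation. What follows is a minimal sketch, not the actual HRegion code: the class LeakyRegionSketch, its fields, and the simulated exception are hypothetical stand-ins, and a plain ReentrantReadWriteLock is assumed to play the role of the region lock taken by startRegionOperation() and released by closeRegionOperation().

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; NOT the real HRegion.
class LeakyRegionSketch {
    // Plays the role of the region-level lock (assumption for illustration).
    final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    // A TreeSet with natural ordering throws NullPointerException on remove(null).
    private final TreeSet<Integer> heldRowLocks = new TreeSet<>();

    Integer getLock(String row) {
        // Simulates WrongRegionException: the row does not belong to this region.
        throw new IllegalStateException("simulated WrongRegionException for " + row);
    }

    void releaseRowLock(Integer lid) {
        heldRowLocks.remove(lid); // throws NullPointerException when lid is null
    }

    void delete(String row) {
        Integer lid = null;
        regionLock.readLock().lock();          // like startRegionOperation()
        try {
            lid = getLock(row);                // throws, so lid stays null
            // ... the delete itself would happen here ...
        } finally {
            releaseRowLock(lid);               // NPE replaces the original exception
            regionLock.readLock().unlock();    // like closeRegionOperation(); never reached
        }
    }

    public static void main(String[] args) {
        LeakyRegionSketch region = new LeakyRegionSketch();
        try {
            region.delete("row-1");
        } catch (RuntimeException e) {
            System.out.println("delete failed with: " + e.getClass().getSimpleName());
        }
        System.out.println("read locks still held: " + region.regionLock.getReadLockCount());
        System.out.println("write lock obtainable: " + region.regionLock.writeLock().tryLock());
    }
}
```

In this sketch the read lock count stays at one after the failed call, so a later attempt to take the write lock, as a region close would, cannot succeed.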

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Adam Warrington (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Warrington updated HBASE-4077:
-----------------------------------

    Attachment: HBASE-4077-0.patch


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061599#comment-13061599 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

Looking at the code again, delete() was the only method that didn't follow the pattern.
Under that pattern, if WrongRegionException is thrown from getLock(), the inner finally block is skipped, and the NPE along with it.

Thanks for the patch Adam.
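
The nested try/finally pattern referred to above can be sketched as follows. This is an illustration under assumptions, not the contents of the attached patch: NestedFinallyRegionSketch and its members are hypothetical names, and a ReentrantReadWriteLock stands in for the region lock. The point is only that getLock() is called before the inner try is entered, so the inner finally runs only when a row lock was actually obtained, while the outer finally always releases the region lock.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; NOT the real HRegion.
class NestedFinallyRegionSketch {
    final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final TreeSet<Integer> heldRowLocks = new TreeSet<>();
    private final boolean rowInRegion;
    private int nextLockId = 0;

    NestedFinallyRegionSketch(boolean rowInRegion) {
        this.rowInRegion = rowInRegion;
    }

    Integer getLock(String row) {
        if (!rowInRegion) {
            // Simulates WrongRegionException.
            throw new IllegalStateException("simulated WrongRegionException for " + row);
        }
        Integer lid = nextLockId++;
        heldRowLocks.add(lid);
        return lid;
    }

    void releaseRowLock(Integer lid) {
        heldRowLocks.remove(lid);
    }

    int heldRowLockCount() {
        return heldRowLocks.size();
    }

    void delete(String row) {
        regionLock.readLock().lock();           // like startRegionOperation()
        try {
            Integer lid = getLock(row);         // if this throws, the inner try is never entered
            try {
                // ... the delete itself would happen here ...
            } finally {
                releaseRowLock(lid);            // only reached with a real lock id
            }
        } finally {
            regionLock.readLock().unlock();     // like closeRegionOperation(); always runs
        }
    }

    public static void main(String[] args) {
        NestedFinallyRegionSketch region = new NestedFinallyRegionSketch(false);
        try {
            region.delete("row-1");
        } catch (IllegalStateException e) {
            System.out.println("original exception surfaced: " + e.getMessage());
        }
        System.out.println("read locks still held: " + region.regionLock.getReadLockCount());
    }
}
```

With this shape the caller sees the original exception rather than an NPE, and no read lock is left behind.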


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061603#comment-13061603 ] 

stack commented on HBASE-4077:
------------------------------

@Ted You going to apply? (+1 from me).


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Adam Warrington (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061601#comment-13061601 ] 

Adam Warrington commented on HBASE-4077:
----------------------------------------

No problem Ted. My pleasure!


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Adam Warrington (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061576#comment-13061576 ] 

Adam Warrington commented on HBASE-4077:
----------------------------------------

I followed the same pattern used in the other update functions in HRegion, which include:

checkAndMutate
put
increment
incrementColumnValue

I originally did a check against null, but after looking through the code, decided consistency with the other functions was something I liked better. I'm in no way married to this, so I am game to change it to a null check if that is what folks want. Let me know.
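
For comparison, the null-check alternative mentioned here might look roughly like the sketch below. It is an assumption-laden illustration rather than code from either version of the patch: NullCheckRegionSketch and its members are hypothetical, and a ReentrantReadWriteLock stands in for the region lock.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; NOT the real HRegion.
class NullCheckRegionSketch {
    final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final TreeSet<Integer> heldRowLocks = new TreeSet<>();

    Integer getLock(String row) {
        // Simulates WrongRegionException.
        throw new IllegalStateException("simulated WrongRegionException for " + row);
    }

    void releaseRowLock(Integer lid) {
        heldRowLocks.remove(lid);
    }

    void delete(String row) {
        Integer lid = null;
        regionLock.readLock().lock();           // like startRegionOperation()
        try {
            lid = getLock(row);
            // ... the delete itself would happen here ...
        } finally {
            if (lid != null) {                  // guard: skip release when no lock was taken
                releaseRowLock(lid);
            }
            regionLock.readLock().unlock();     // like closeRegionOperation()
        }
    }

    public static void main(String[] args) {
        NullCheckRegionSketch region = new NullCheckRegionSketch();
        try {
            region.delete("row-1");
        } catch (IllegalStateException e) {
            System.out.println("original exception surfaced: " + e.getMessage());
        }
        System.out.println("read locks still held: " + region.regionLock.getReadLockCount());
    }
}
```

Both shapes avoid the NPE in this scenario. In the single-finally form, though, the release and the close still share one block, so any other exception thrown by the release step would again skip the close; the nested form keeps them separate and matches the other update functions listed above.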


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061669#comment-13061669 ] 

Hudson commented on HBASE-4077:
-------------------------------

Integrated in HBase-TRUNK #2011 (See [https://builds.apache.org/job/HBase-TRUNK/2011/])
    HBASE-4077 formatting spaces
HBASE-4077  Deadlock if WrongRegionException is thrown from getLock in
               HRegion.delete (Adam Warrington via Ted Yu)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/CHANGES.txt


> Deadlock if WrongRegionException is thrown from getLock in HRegion.delete
> -------------------------------------------------------------------------
>
>                 Key: HBASE-4077
>                 URL: https://issues.apache.org/jira/browse/HBASE-4077
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.3
>            Reporter: Adam Warrington
>            Assignee: Adam Warrington
>            Priority: Critical
>             Fix For: 0.90.4
>
>         Attachments: HBASE-4077-0.patch

[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061495#comment-13061495 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

Why not check against lid being null to prevent the NPE?


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061587#comment-13061587 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

My thinking was to make WrongRegionException prominent in the region server log.


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061612#comment-13061612 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

Integrated to branch and TRUNK.
I shortened indentation to 2 spaces.

Thanks for the review, Stack.


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061775#comment-13061775 ] 

stack commented on HBASE-4077:
------------------------------

Yes. It's wacky. The testing didn't get off the ground. Chalk it up to Jenkins randomness.


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Adam Warrington (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061590#comment-13061590 ] 

Adam Warrington commented on HBASE-4077:
----------------------------------------

WrongRegionException actually is logged at ERROR level, and still will be with the patch I committed. I only posted the NPE in the description of this ticket, but the WrongRegionException is in the logs as well.


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061746#comment-13061746 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

This is an interesting test failure:
https://builds.apache.org/view/G-L/view/HBase/job/hbase-0.90/lastCompletedBuild/testReport/org.apache.hadoop.hbase/TestZooKeeper/testClientSessionExpired/

I refreshed the 0.90 branch on a Linux box, and TestZooKeeper passes when run standalone.

> Deadlock if WrongRegionException is thrown from getLock in HRegion.delete
> -------------------------------------------------------------------------
>
>                 Key: HBASE-4077
>                 URL: https://issues.apache.org/jira/browse/HBASE-4077
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.3
>            Reporter: Adam Warrington
>            Assignee: Adam Warrington
>            Priority: Critical
>             Fix For: 0.90.4
>
>         Attachments: HBASE-4077-0.patch

[jira] [Resolved] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved HBASE-4077.
---------------------------

       Resolution: Fixed
    Fix Version/s: 0.90.4
     Hadoop Flags: [Reviewed]


[jira] [Commented] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061755#comment-13061755 ] 

Ted Yu commented on HBASE-4077:
-------------------------------

https://builds.apache.org/view/G-L/view/HBase/job/hbase-0.90/226/console was successful.


[jira] [Assigned] (HBASE-4077) Deadlock if WrongRegionException is thrown from getLock in HRegion.delete

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reassigned HBASE-4077:
-----------------------------

    Assignee: Adam Warrington
