Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2014/12/01 05:33:13 UTC

[jira] [Commented] (HBASE-10844) Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state

    [ https://issues.apache.org/jira/browse/HBASE-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229404#comment-14229404 ] 

Andrew Purtell commented on HBASE-10844:
----------------------------------------

So with this patch we'd remove the assert and replace it with a warning that the memstore data structures have been only partially updated?
{code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1109,7 +1109,13 @@ public class HRegion implements HeapSize { // , Writable{
 
         // close each store in parallel
         for (final Store store : stores.values()) {
-          assert abort? true: store.getFlushableSize() == 0;
+          if (store.getFlushableSize() != 0) {
+            LOG.warn("store.getFlushableSize for " + store + " is not zero! It's " 
+                + store.getFlushableSize() + ". Maybe a coprocessor "
+                + "operation failed and "
+                + "left the memstore datastructures in a partially updated state. "
+                + "Current memstoreSize " + this.getMemstoreSize().get());
+          }
           completionService
               .submit(new Callable<Pair<byte[], Collection<StoreFile>>>() {
                 @Override
{code}
Shouldn't we be aborting in that case anyway? Or should we replace the assert with an abort()?
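
To make the alternative concrete, here is a minimal sketch of a close-time check that requests an abort instead of only logging. It is not HBase code: CloseCheckSketch, SizedStore, Outcome and checkStores are hypothetical names used only for illustration. Like the original assert, it skips the check when an abort is already in progress.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the HRegion implementation.
final class CloseCheckSketch {
    /** Hypothetical stand-in for a store; only the flushable size matters here. */
    public interface SizedStore {
        long getFlushableSize();
    }

    /** Hypothetical result of the close-time check. */
    public enum Outcome { CLEAN, ABORT }

    /**
     * Returns ABORT if any store still reports a non-zero flushable size
     * while we are not already aborting; otherwise returns CLEAN.
     */
    public static Outcome checkStores(Map<String, SizedStore> stores, boolean alreadyAborting) {
        if (alreadyAborting) {
            // Same exemption as the original assert: nothing to verify during an abort.
            return Outcome.CLEAN;
        }
        for (Map.Entry<String, SizedStore> entry : stores.entrySet()) {
            long size = entry.getValue().getFlushableSize();
            if (size != 0) {
                System.err.println("Flushable size for store " + entry.getKey()
                    + " is " + size + ", expected 0; requesting abort");
                return Outcome.ABORT;
            }
        }
        return Outcome.CLEAN;
    }

    public static void main(String[] args) {
        Map<String, SizedStore> stores = new LinkedHashMap<>();
        stores.put("cf1", () -> 0L);
        stores.put("cf2", () -> 128L); // simulates a partially updated memstore
        System.out.println(checkStores(stores, false));
    }
}
```

The caller would decide what ABORT means in practice; the point is only that the inconsistent state is surfaced as a failure rather than a log line.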

> Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10844
>                 URL: https://issues.apache.org/jira/browse/HBASE-10844
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>         Attachments: 10844-1-0.98.txt, 10844-1.txt
>
>
> Observed this while testing with Phoenix. The Phoenix test MutableIndexFailureIT deliberately fails the batchmutation call via the installed coprocessor, but the update is not rolled back, which leaves the memstore inconsistent. In particular, I observed that getFlushableSize is updated before the coprocessor is called, and that update is not rolled back when the call fails. When the region is closed at some later point, the assert introduced in HBASE-10514 in HRegion.doClose() causes the RegionServer to shut down abnormally.
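
The failure mode described above can be illustrated with a small, self-contained sketch. It is not taken from HRegion or from the attached patches: MemstoreRollbackSketch, BatchHook and both apply methods are hypothetical names, and the single counter stands in for whatever getFlushableSize reports. The first method reproduces the reported ordering (size updated, then the hook fails, nothing undone); the second shows one possible way to undo the update when the hook throws.

```java
import java.io.IOException;

// Hypothetical sketch of the accounting problem, not HBase code.
final class MemstoreRollbackSketch {
    /** Hypothetical stand-in for a coprocessor callback that may fail. */
    public interface BatchHook {
        void call() throws IOException;
    }

    private long flushableSize = 0;

    public long getFlushableSize() {
        return flushableSize;
    }

    /** Reported behavior: the size is bumped first and never undone if the hook fails. */
    public void applyWithoutRollback(long delta, BatchHook hook) throws IOException {
        flushableSize += delta;
        hook.call();
    }

    /** One possible fix: undo the size update when the hook does not complete. */
    public void applyWithRollback(long delta, BatchHook hook) throws IOException {
        flushableSize += delta;
        boolean success = false;
        try {
            hook.call();
            success = true;
        } finally {
            if (!success) {
                flushableSize -= delta; // restore the pre-batch accounting
            }
        }
    }
}
```

With a hook that always throws, the first method leaves a non-zero size behind, which is the condition the close-time check later trips over; the second leaves the size at zero.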



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)