You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org> on 2011/08/02 00:08:54 UTC

[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

    [ https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073821#comment-13073821 ] 

jiraposter@reviews.apache.org commented on HBASE-4014:
------------------------------------------------------



bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > Some comments below Eugene.  This thing looks useful and almost done.  Lets get it in!

Thank you Michael for looking at this!


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java, line 54
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25643#file25643line54>
bq.  >
bq.  >     Do we need to add this?  Doesn't every object inherit Object and so have a toString?

Thanks, Michael; you are right. Removed this unnecessary declaration.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, line 578
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25644#file25644line578>
bq.  >
bq.  >     Do you think this needed Eugene?  Is coprocessors a List?  What if you toString'd it?  Maybe'll do right thing (with square bracket delimiters rather than curly's but that might be ok)

Hi Michael, coprocessors is an o.a.h.h.util.SortedCopyOnWriteSet, whose toString() returns :

"org.apache.hadoop.hbase.util.SortedCopyOnWriteSet@4d441b16" (or a similar memory location after the @-sign. 

Thus the need for a string serialization method like "coprocessorSetAsString()". 

Perhaps this should be moved to an (overriding) SortedCopyOnWriteSet's toString()? I am happy to make a sub-JIRA to HBASE-4014 if you think so.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 81
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25645#file25645line81>
bq.  >
bq.  >     Whats masterServices?  I think it subclasses Server?  If you do getServerName or something, that'll give you something better than 'master'.  It'll include port and startcode which could be important debugging (more important for the RS case than for Master but could be important if multimasters).

Correct; masterServices is an instance of MasterServices, which subclasses Server. 

I think you're right that the port and startcode are important. 

But note that abortServiceWithCoprocessorInfo() *does* show serverName.getServerName() in its output, (so it shows the port and startcode as you recomend).

On the other hand, getServerName() doesn't show what *role* the server plays: is it a master or a regionserver? It seems to me this should be part of the abort message, too.

So here's an example of an entire abort message on the master side:

Aborting service: master running on : 192.168.0.136,56238,1312228544878 because coprocessor: org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorException$BuggyMasterObserver@1658fe12 threw an exception: java.lang.NullPointerException. Loaded coprocessors are: {class org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorException$BuggyMasterObserver}

How does that look to you?


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 87
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25645#file25645line87>
bq.  >
bq.  >     Nice comment

Thank you! :)


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 112
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25645#file25645line112>
bq.  >
bq.  >     Abort seems radical

Hmm..is it? The point of HBASE-4014 is to catch buggy coprocessors by aborting the coprocessor host (the master or regionserver) with as much information as possible so that the bug in the coprocessor can be fixed. For example, BuggyMasterObserver (in TestMasterCoprocessorException.java) tries to dereference a null pointer.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 237
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25645#file25645line237>
bq.  >
bq.  >     The convention is to put } catch { on the one line rather than line break after the } (no biggie)

Thank you; fixed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java, line 26
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25647#file25647line26>
bq.  >
bq.  >     We are importing but we don't seem to use the imports, is that so?

You are right: removed unused "import o.a.h.h.coprocessor.Coprocessor;" as well as "import o.a.h.h.coprocessor.CoprocessorHost;".


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java, line 65
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25651#file25651line65>
bq.  >
bq.  >     No need for these extra new lines

Thank you for catching that; removed extra new lines.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java, line 36
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25653#file25653line36>
bq.  >
bq.  >     Unused import?

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java, line 279
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25653#file25653line279>
bq.  >
bq.  >     Added line we don't need?

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java, line 49
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25654#file25654line49>
bq.  >
bq.  >     Unused import

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java, line 34
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25655#file25655line34>
bq.  >
bq.  >     Unused import?

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java, line 37
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25656#file25656line37>
bq.  >
bq.  >     Unused import?

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java, line 39
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25657#file25657line39>
bq.  >
bq.  >     Unused import?

Thanks; removed. I should have caught all these before review; not sure how they crept in. Will try to be more careful.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java, line 219
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25657#file25657line219>
bq.  >
bq.  >     Addition of commented out code

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java, line 43
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25658#file25658line43>
bq.  >
bq.  >     Unused import

Thanks; removed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java, line 353
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25648#file25648line353>
bq.  >
bq.  >     Is there any diff in the above changes?

These are a couple of typos in ZooKeeperWatcher.java that I noticed incidentally - I will file a separate JIRA for these and remove from patch.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java, line 227
bq.  > <https://reviews.apache.org/r/969/diff/2/?file=25649#file25649line227>
bq.  >
bq.  >     Same here

Another incidental typo - will note & fix in the above separate JIRA and remove from this patch.


- Eugene


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1198
-----------------------------------------------------------


On 2011-07-14 23:39:07, Eugene Koontz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-07-14 23:39:07)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.      https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 54ccd6f 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 8ffa086 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 4fa82c0 
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6af0ecf 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 7f19c72 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 78e7d62 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java c571227 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java fc05e47 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java aa48c22 
bq.    src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java dd5dc3e 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



> Coprocessors: Flag the presence of coprocessors in logged exceptions
> --------------------------------------------------------------------
>
>                 Key: HBASE-4014
>                 URL: https://issues.apache.org/jira/browse/HBASE-4014
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: Andrew Purtell
>            Assignee: Eugene Koontz
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with loaded coprocessors, we need something like the Linux kernel's taint flag, and list of linked in modules that show up in the output of every OOPS, to appear above or below exceptions that appear in the logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira