Posted to common-dev@hadoop.apache.org by Steve Loughran <st...@apache.org> on 2008/05/08 14:14:21 UTC

Design Q: logging/nesting of exceptions


I have a design question, not a JIRA issue, so I'm asking it as a normal email.

1. Why do exceptions tend to stringify nested exceptions

  public InconsistentFSStateException(File dir, String descr, Throwable ex) {
    this(dir, descr + "\n" + StringUtils.stringifyException(ex));
  }

rather than retain the existing stack trace as a separate exception?

  public InconsistentFSStateException(File dir, String descr, Throwable ex) {
    this(dir, descr + "\n" + StringUtils.stringifyException(ex));
    initCause(ex);
  }
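For contrast, a minimal self-contained sketch (not Hadoop code; the class name is illustrative) of what initCause() buys you: the nested exception survives as an object, so callers and logging frameworks can walk the cause chain instead of parsing text:

```java
public class CauseChainDemo {
    static class InconsistentStateException extends Exception {
        InconsistentStateException(String descr, Throwable ex) {
            super(descr);
            initCause(ex);  // retain the original trace as a live object
        }
    }

    public static void main(String[] args) {
        Throwable root = new java.io.IOException("disk gone");
        Exception wrapped = new InconsistentStateException("bad dir", root);
        // The cause is still machine readable, not flattened into the message:
        System.out.println(wrapped.getCause() == root);  // true
        // printStackTrace() also renders it as a "Caused by:" section.
    }
}
```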

2. Why do exceptions tend to get stringified before logging, instead of 
leaving the logging framework to handle it (e.g. here in DataNode)?

LOG.error(dnRegistration + ":DataXceiver: " + StringUtils.stringifyException(t));

rather than

LOG.error(dnRegistration + ":DataXceiver: " + t.getMessage(), t);
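A small sketch of the difference, using java.util.logging as a stand-in for commons-logging/log4j (the logger name and messages are made up): when the Throwable is passed as its own argument it reaches the handler intact, so the framework, not the caller, decides how to render the trace:

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogThrowableDemo {
    // Captures the last record so we can inspect what the handler receives.
    static class CapturingHandler extends Handler {
        LogRecord last;
        @Override public void publish(LogRecord record) { last = record; }
        @Override public void flush() {}
        @Override public void close() {}
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("DataXceiver");
        log.setUseParentHandlers(false);
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        Throwable t = new java.io.IOException("connection reset");
        // Equivalent in spirit to LOG.error(msg, t) in commons-logging:
        log.log(Level.SEVERE, "DataXceiver: " + t.getMessage(), t);

        // The handler still holds the real exception, not a string copy:
        System.out.println(handler.last.getThrown() == t);  // true
    }
}
```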

I ask because I personally like to keep all those stack traces around 
in a machine-readable format right up into the XML test reports, and 
both practices run the risk of converting them to text too early. Is 
this all a deliberate decision, or just an accidental policy that can 
be changed if someone is prepared to go through the code and make the 
changes?

-steve

Re: Design Q: logging/nesting of exceptions

Posted by Doug Cutting <cu...@apache.org>.
Steve Loughran wrote:
> OK, that means when I encounter them I can change them.

Yes, please.  And if you're feeling ambitious, you could change a bunch 
of them wholesale as a distinct issue.

Doug

Re: Design Q: logging/nesting of exceptions

Posted by Steve Loughran <st...@apache.org>.
Nigel Daley wrote:
> 
> On May 9, 2008, at 2:03 AM, Steve Loughran wrote:
>> Owen O'Malley wrote:
>>> On May 8, 2008, at 10:21 AM, Doug Cutting wrote:
>>>> I'd go with something closer to an accidental policy.  My suspicion 
>>>> is that the logging framework didn't print nested exceptions well.  
>>>> Owen is the father of stringifyException and may have more insight.
>>> Yeah, it was my fault. The log4j was misconfigured, so we didn't get 
>>> the exception traces out of the messages. I didn't realize it was a 
>>> misconfiguration until much later and it had become standard practice 
>>> within Hadoop. *sigh* I've fixed a couple of them, but there are a 
>>> lot more.
>>
>> OK, that means when I encounter them I can change them.
>>
>> Somewhere on my todo list are way better JUnit reports, including logs 
>> from multiple machines and stack traces in machine-readable 
>> form... getting the exceptions out raw is one of the requirements for 
>> this to work
>>
>> http://wiki.apache.org/ant/Proposals/EnhancedTestReports
>>
>> -steve
> 
> Hey Steve,
> 
> Hope Hudson is on your list of CI servers to test with respect to the 
> Ant JUnit report changes :-)

I hope so too. One of the big problems is backwards compatibility: the 
original JUnit report saves a summary in the attributes of the root 
node, so you can't stream the results out; you have to buffer them, and 
then, if the JVM crashes, you get a zero-byte file.

I'm not sure what the ideal solution would be here; what I may do is 
generate the new format alongside the old, or add a backup format that 
can be used to post-mortem a JVM crash when it happens.
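The buffering problem comes from the shape of the Ant JUnit XML report, whose root element carries the summary counts (a representative fragment; names and values are illustrative):

```xml
<!-- The totals live on the root element, so the writer cannot emit
     this opening tag until every test has finished running. -->
<testsuite name="TestDFS" tests="12" failures="1" errors="0" time="4.2">
  <testcase name="testWrite" time="0.3"/>
  <testcase name="testReplication" time="1.1">
    <failure message="replication mismatch">stack trace here</failure>
  </testcase>
</testsuite>
```

Because `tests`, `failures`, and `errors` are only known at the end of the run, the whole document has to be held in memory and written at once, which is why a JVM crash mid-run leaves nothing on disk.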

> 
> On another note, have you used SmartFrog to config/deploy Hadoop?  If 
> so, I'm interested in your experiences.

I'm working on it in the part of my time I get to do interesting 
stuff; you can track the status here:
http://jira.smartfrog.org/jira/browse/SFOS-780

- I have the ability for Hadoop to get its state from our configuration 
files, not the existing XML files
- I can submit work to an existing cluster
- we can poll a cluster for being in a working condition (job tracker 
live, etc.) and block actions until that state is reached
- I'm just bringing up the namenode and datanodes
- I'm also adding components to do HDFS maintenance: formatting, 
balancing, etc.

What is nice so far is that it works with our testing components, so I 
can run a test that brings up a cluster with some parameters (such as 
JVM, replication options) and try something, then tear that down and do 
a different set. That should be good for testing interesting values, 
like what happens with different replication options, JVM tuning, etc. 
And I can get the logs back into one place. You don't necessarily want 
that in production (one logger = one point of failure), but it's good 
in small test runs.

By the end of the month I should have something that others can play 
with; we're going to talk about it at the UK Hadoop users meeting in 
August. I'm also taking notes of where I've had to do ugly things, 
where some changes to Hadoop core would make things a lot better. So far:
  - make it easier to get configuration information from some form of 
external factory
  - have the services (namenode, datanode, etc.) all implement a 
lifecycle interface, with a base class that provides stable, 
thread-safe startup/ping/shutdown
  - make package-scoped stuff in NameNode/DataNode private, maybe with 
protected accessors
  - find where exceptions are being stringified before nesting/logging 
and retain them in their raw form

These changes aren't SmartFrog-specific; they're just the things you 
need to do to manage Hadoop better from inside the JVM.
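A hedged sketch of the lifecycle idea from the list above (none of this is actual Hadoop API; the class name, states, and method names are illustrative): a base class the services could extend to get stable, thread-safe startup/ping/shutdown:

```java
public abstract class AbstractService {
    public enum State { CREATED, STARTED, STOPPED }

    private State state = State.CREATED;

    public final synchronized void start() throws Exception {
        if (state == State.CREATED) {
            innerStart();
            state = State.STARTED;
        }
    }

    public final synchronized void ping() throws Exception {
        if (state != State.STARTED) {
            throw new IllegalStateException("service not live: " + state);
        }
        innerPing();  // subclass-specific health check
    }

    public final synchronized void shutdown() {
        if (state == State.STARTED) {
            innerShutdown();
        }
        state = State.STOPPED;  // idempotent: repeated calls are no-ops
    }

    public final synchronized State getState() { return state; }

    // Subclasses (namenode, datanode, ...) fill in the specifics.
    protected abstract void innerStart() throws Exception;
    protected abstract void innerPing() throws Exception;
    protected abstract void innerShutdown();
}
```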

Currently my code is here:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/

with subclassed things in the org.apache packages to get at 
package-scoped content:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/dfs/


-steve

Re: Design Q: logging/nesting of exceptions

Posted by Nigel Daley <nd...@yahoo-inc.com>.
On May 9, 2008, at 2:03 AM, Steve Loughran wrote:
> Owen O'Malley wrote:
>> On May 8, 2008, at 10:21 AM, Doug Cutting wrote:
>>> I'd go with something closer to an accidental policy.  My 
>>> suspicion is that the logging framework didn't print nested 
>>> exceptions well.  Owen is the father of stringifyException and 
>>> may have more insight.
>> Yeah, it was my fault. The log4j was misconfigured, so we didn't  
>> get the exception traces out of the messages. I didn't realize it  
>> was a misconfiguration until much later and it had become standard  
>> practice within Hadoop. *sigh* I've fixed a couple of them, but  
>> there are a lot more.
>
> OK, that means when I encounter them I can change them.
>
> Somewhere on my todo list are way better JUnit reports, including 
> logs from multiple machines and stack traces in machine-readable 
> form... getting the exceptions out raw is one of the requirements 
> for this to work
>
> http://wiki.apache.org/ant/Proposals/EnhancedTestReports
>
> -steve

Hey Steve,

Hope Hudson is on your list of CI servers to test with respect to the 
Ant JUnit report changes :-)

On another note, have you used SmartFrog to config/deploy Hadoop?  If  
so, I'm interested in your experiences.

Cheers,
Nige
Y! Grid QE Manager


Re: Design Q: logging/nesting of exceptions

Posted by Steve Loughran <st...@apache.org>.
Owen O'Malley wrote:
> 
> On May 8, 2008, at 10:21 AM, Doug Cutting wrote:
> 
>> I'd go with something closer to an accidental policy.  My suspicion is 
>> that the logging framework didn't print nested exceptions well.  Owen 
>> is the father of stringifyException and may have more insight.
> 
> Yeah, it was my fault. The log4j was misconfigured, so we didn't get the 
> exception traces out of the messages. I didn't realize it was a 
> misconfiguration until much later and it had become standard practice 
> within Hadoop. *sigh* I've fixed a couple of them, but there are a lot 
> more.

OK, that means when I encounter them I can change them.

Somewhere on my todo list are way better JUnit reports, including logs 
from multiple machines and stack traces in machine-readable 
form... getting the exceptions out raw is one of the requirements for 
this to work

http://wiki.apache.org/ant/Proposals/EnhancedTestReports

-steve

Re: Design Q: logging/nesting of exceptions

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On May 8, 2008, at 10:21 AM, Doug Cutting wrote:

> I'd go with something closer to an accidental policy.  My suspicion 
> is that the logging framework didn't print nested exceptions well. 
> Owen is the father of stringifyException and may have more insight.

Yeah, it was my fault. The log4j was misconfigured, so we didn't get  
the exception traces out of the messages. I didn't realize it was a  
misconfiguration until much later and it had become standard practice  
within Hadoop. *sigh* I've fixed a couple of them, but there are a  
lot more.

-- Owen

Re: Design Q: logging/nesting of exceptions

Posted by Doug Cutting <cu...@apache.org>.
Steve Loughran wrote:
> Is this 
> all a deliberate decision, or just an accidental policy that can be 
> changed if someone is prepared to go through the code and make the changes?

I'd go with something closer to an accidental policy.  My suspicion is 
that the logging framework didn't print nested exceptions well.  Owen 
is the father of stringifyException and may have more insight.

Doug