You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Erik Steffl (JIRA)" <ji...@apache.org> on 2010/07/28 00:46:24 UTC

[jira] Created: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Add LOG.isDebugEnabled() guard for each LOG.debug("...")
--------------------------------------------------------

                 Key: HADOOP-6884
                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
             Project: Hadoop Common
          Issue Type: Improvement
    Affects Versions: 0.22.0
            Reporter: Erik Steffl
             Fix For: 0.22.0


Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897536#action_12897536 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

This bloats the code when the logged string is a constant.  Also, in the vast majority of cases, this is not a significant optimization.  I'd rather keep most code more readable and only perform the check in performance-critical code.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903164#action_12903164 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

This issue draw up an unbelievably long discussion.  Erik's patch is great.  Of course, there is a room for improvement as always, and as many of other patches.  Quite a few people have proposed "better" ideas but only Erik has provided patches.  I have fine with ideas like Doug's or Konstantin's as long as some one is willing to work on it.  IMO, trying to make the logging perfect is a kind of wasting resources in this community.

Konstantin and I have +1'ed earlier but no one has  -1'ed yet.  I will commit Erik's patch 24 hours later if there is no -1.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900377#action_12900377 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

> What does this show? ...

Your benchmark also shows that LOG.isDebugEnabled() provides the best results.  A patch using this approach is already there.  Do you really want to stop committing it?

There may be other comparable solutions out there but (1) we don't have a patch yet, (2) it may require further modifying the codes and (3) the performance is not as good as LOG.isDebugEnabled().

I forgot to mention that my benchmark only measures running time but it does not account the GC overhead induced by unnecessary object creation due to auto-boxing.  This is another reason that LOG.isDebugEnabled() is better than the other suggested solution.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899089#action_12899089 ] 

Suresh Srinivas commented on HADOOP-6884:
-----------------------------------------

I have additional concerns about AOP. Herriot plans to introduce AOP based interfaces for exposing system internals. This should not be deployed on production clusters. Using AOP to check isDebugEnabled(), results in two flavors of AOP code. One that must be deployed on the system and the other that should not be. Confusion here could result in serious issues.

Second, with AOP, if some one deploys the version without AOP based debug check, there is no way to know if the check is happening or not, by the system behavior alone.

My vote is to keep it simple for now and check in the code changes that Erik has. This is a widely used java coding practice and not a huge code bloat. If others still feel that this causes code bloat, we could further choose one of the choices proposed in the jira. At that time, we need to any way remove all debug enabled checks that are currently being made.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893277#action_12893277 ] 

Suresh Srinivas commented on HADOOP-6884:
-----------------------------------------

Alejandro, with the methods you are proposing, an array or Objects created to pass the varargs. It is still expensive.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900481#action_12900481 ] 

Suresh Srinivas commented on HADOOP-6884:
-----------------------------------------

I agree with Erik. I am actually surprised by how many comments this simple proposal has generated!

My +1 for committing the patch with a minor change - we do not need isDebugEnabled() check for tests.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko reassigned HADOOP-6884:
-------------------------------------------

    Assignee: Erik Steffl

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899228#action_12899228 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

bq. Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?

You can do this for info messages too for more performance when that is off.  There are no extra lines of code or maintenance; it is the simplest solution.  As long as the change is applied at build time, I don't see any real drawback.



> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899084#action_12899084 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

At most two of these can be true:

1. Code must remain as readable/simple as possible and maintainable (no wrapping isDebugEnabled() in all of source code).
2. Code must perform best (no varargs, autoboxing, string concatenation, etc for unused debug lines)
3. The build system can't re-write code (AOP, source modification, or bytecode modification)


There appear to be "-1"'s that require the above to all be true.  Therefore, this ticket will go nowhere until there is consensus to let one of the three be false.   Implicitly, #2 is false if no action is taken.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898967#action_12898967 ] 

Steve Loughran commented on HADOOP-6884:
----------------------------------------

-1 to ASM-style work as it just complicates build/test cycles
-0 to SLF4J. I don't see the point. I use commons-logging as the front end with either Log4J or other tools as the back end.

Now, we could improve commons-logging to add varargs...

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903536#action_12903536 ] 

Jakob Homan commented on HADOOP-6884:
-------------------------------------

bq. Konstantin and I have +1'ed earlier but no one has -1'ed yet. I will commit Erik's patch 24 hours later if there is no -1.
+1

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899352#action_12899352 ] 

Steve Loughran commented on HADOOP-6884:
----------------------------------------

Luke Lu _said Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j. _

We can fix that, if the project is idle it could be taken over, but I fear the #of open bugreps. The thought appeals though. Strip out avalon support, pare back the POM files and assume log4j and java.util are the two backends. 

Maybe I've just had too much negative experience of SLF4J the redistributable, rather than as an API.
# it's another JAR
# you also need to choose at redist time what your back end will be and include that JAR
# If you don't want to log things like Jetty through it you have to write your own bridge specifically for that app [1] and make sure it's on the classpath.

so: I get to deal with the downstream problems, things not logging, things not logging in forked JVMs, etc, the SLF4J api is something I have no knowledge of.

Of course, the other tactic is to say "no layers, go to log4j" and work with the log4j team on having log4j push out data to chukwa and scribus, feed it back into Hadoop for analysis. Just a thought.

[1] http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/jetty/src/org/smartfrog/services/jetty/log/JettyToCommonsLogger.java?revision=8363&view=markup


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899110#action_12899110 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

It must also be ensured that isDebugEnabled() is NOT wrapped around an info, warn, error, or fatal.  Someone might check in a change that turns a debug into an info without removing the guard.


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900390#action_12900390 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

>  I found at least two three-variable cases in Erik's latest patch.

I missed those, but it's still different from your benchmark, which focussed on boxed variables.  Those examples might look something like:

debug("Exception while invoking {} of {}. Retrying.", method.getName(), implementation.getClass(), StringUtils.stringifyException(e));

That last parameter would make things slow, but, I'd be surprised if you can show that optimizing this substantially improves Hadoop performance, since this only runs in the case that a retryable method fails with an exception.  If we ever found that such cases were problematic, we might add methods like:

static void debug(String message, Throwable t);
static void debug(String message, Throwable t, Object p1);
static void debug(String message, Throwable t, Object p1, Object p2);
static void debug(String message, Throwable t, Object... parameters);

Perhaps we could submit a patch to slf4j adding such methods there.

The second case could become something like:

debug("for protocol authorization compare ({}):{} with ", clientPrincipal, shortName, user.getShortUserName));

In this case all parameters are strings, so no boxing would occur.

> Your benchmark also shows that LOG.isDebugEnabled() provides the best results.

It provides significantly better results in a micro-benchmark.  But optimizations that increase code size should show significant improvement in real application performance, not just micro-benchmarks.  Until someone's shown that this optimization measurably improves end performance, I see no urgency to commit it.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899127#action_12899127 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

We can save ~296 lines from the patch if we exclude the test code:

{code}
[...hadoop]$ find */src/java -name \*.java | xargs grep '\.debug(' | wc -l
     410
[...hadoop]$ find */src/test -name \*.java | xargs grep '\.debug(' | wc -l
     148
{code}

Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903742#action_12903742 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

Konstantin, thanks very much for running this benchmark!  It sounds like that, or any other LOG.debug() call that might be called while servicing an otherwise inexpensive RPC request, should be guarded with an isDebugEnabled() or replaced with an slf4j or some other wrapper that avoids string concatenation.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899104#action_12899104 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

Luke> The problem with the current approach is that you'd have to keep patching people's new code manually.

I understand that Erik has a script, which automates the process of finding calls to {{debug()}} unguarded with {{isDebugEnabled()}}. If we run it as a target of ant build, we can enforce people to maintain the guards automatically.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899116#action_12899116 ] 

Jakob Homan commented on HADOOP-6884:
-------------------------------------

bq. If we run it as a target of ant build, we can enforce people to maintain the guards automatically.
I'd like to avoid any potential false positives the script may identify.  Checking for appropriate if blocks around costly debug logging should be a responsibility of the patch reviewer.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899461#action_12899461 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

Did some digging on the aspectj wrapping approach. Looks like it's a no go performance wise, as it cannot solve the arguments building cost issue without fixing aspectj itself, as someone else tried to do exactly the same thing here:

http://dev.eclipse.org/mhonarc/lists/aspectj-users/msg09294.html

Looks the most reasonable current course of action is just committing the patch as it looks correct, low risk and significantly lower gc stress (especially in namenode code, according to Suresh in offline discussions.) 

We should file a separate jira to explore the switching to slf4j api (still using log4j as backend) approach. One open issue even with slf4j API is that it doesn't solve the autoboxing cost issue for primitive types, which we use a lot the logs:

http://qos.ch/pipermail/slf4j-dev/2005-August/000241.html (old but still reflecting the current API design.)

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898287#action_12898287 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

One could use something like ASM to re-write log.debug() statements with the guard around it at class load time, or some build system trickery can do something similar at build time.  An AOP tool could do this sort of thing.

I wonder if any logging tool project has already done such a thing?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Steffl updated HADOOP-6884:
--------------------------------

    Attachment: HADOOP-6884-0.22-1.patch

Patch HADOOP-6884-0.22-1.patch adds more changes that were previosuly missed (or new .debug() calls).

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899145#action_12899145 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

> 1. Code must remain as readable/simple as possible and maintainable (no wrapping isDebugEnabled() in all of source code).

Does anyone believe that?  I think the guards should ideally be limited to performance sensitive parts of the code.  But if folks don't trust that can be maintained, then a warning when non-constant strings are logged might be acceptable.  Constant strings should be as fast as the guard.

> 2. Code must perform best (no varargs, autoboxing, string concatenation, etc for unused debug lines)

We could easily implement a guard that both performs well and does not bloat code, e.g.:
{code}
public static void debugLog(String message) { LOG.debug(message); }
public static void debugLog(String message, Object o1) {
  if (LOG.isDebugEnabled())
    LOG.debug(MessageFormat.format(message, new Object[] {o1}));
}
public static void debugLog(String message, Object o1, Object o2) {
  if (LOG.isDebugEnabled())
    LOG.debug(MessageFormat.format(message, new Object[] {o1, o2}));
}
public static void debugLog(String message, Object o1, Object o2, Object o3) {
  if (LOG.isDebugEnabled())
    LOG.debug(MessageFormat.format(message, new Object[] {o1, o2, o3}));
}
...
{code}

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899462#action_12899462 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

For aspectj, first of all, we currently don't use aspectj in Hadoop except for testing.  I beg there are people against using aspectj or other byte-code rewriting technique on the non-testing codes since it is hard to debug.

Let's be academic (as what I see in this issue) and ignore the statement above.  I cannot see a simple solution on using aspectj.  Clearly, if we define a point cut per LOG.debug(..), aspectj will works but this is too much of works and, in some cases, we need to change the codes in order to define a point cut.  For the people suggesting aspectj and saying that it is simple, could you please provide a simple example?

I heard one suggestion from Luke yesterday.  He claimed we could use the "around" advice.  However, it seems to me that around does not prevent parameter evaluation.  I have not tested it yet.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899484#action_12899484 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

bq. Thanks Luke for clarfying aspectj.

If aspectj can't do this, then its likely a no-go.  Spring or Guice cannot do it either (they both change methods by overriding them, not wrapping them).
That leaves ASM or low level bytecode changes which definitely will work but would not be easy at all and would require significant testing.  At this point, a feature like that should be part of the logging framework.

bq. If I understand the slf4j API correctly, slf4j does not really help for the codes above. It has to create Integer, Long and Double objects for the primitive data type and an array for vararg.

It helps a lot, even with the autoboxing and varargs.  A String has a hashCode and byte array member variable, a minimum memory overhead much larger than a boxed object.
The autoboxing and varargs is significantly less expensive than string concatenation.

{code}
int a = 0; long b = 1L; double c = 0.0d;
LOG.debug("a={}, b={}, c={}", a, b, c);
{code}

In slf4j this requires allocating a varargs array with 3 elements (~32 bytes), and one Integer, one Long, and one Double (16, 24, and 24 bytes).
With log4j, it does not require the vararg array, but creates a string out of each number -- which uses CPU and allocates more memory than just the boxed object, and then concatenates that into another String.  The end result is at minimum 15 characters (30 bytes) + String overhead (~48 bytes), and the intermediate result overhead is larger.  If any of the numbers are not so small, the overhead grows quickly.

The JVM is getting smarter with its Escape Analysis optimizations too -- which will either eliminate many autobox and vararg object allocations or push them on the stack.  I think Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on.  That flag becomes the default soon.  



> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899500#action_12899500 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

>The way you'd write this is:
>
>LOG.debug("a={} b={} c={}", a, b, c);

Dong, I have checked the [slf4j Logger API|http://www.slf4j.org/apidocs/org/slf4j/Logger.html] but does find a method like that.  There is one requiring creating an array, debug(String format, Object[] argArray).  From [the FAQ link|http://www.slf4j.org/faq.html#logging_performance] you provided earlier, it also suggests creating an array in this case.  Below is quoted from the FAQ.
{quote}
If three or more arguments need to be passed, you can make use of the Object[] variant. For example, you can write:

logger.debug("Value {} was inserted between {} and {}.", new Object[] {newVal, below, above});
{quote}


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892972#action_12892972 ] 

Alejandro Abdelnur commented on HADOOP-6884:
--------------------------------------------

While common practice, this result in lot of code uglyfication/noise.

I'd rather suggest the following (we are doing this in Oozie):

Create a Log wrapper, XLog which extends Log and provides the following additional methods:

  info(String msgTemplate, Object ... args)
  warn(String msgTemplate, Object ... args)
  debug(String msgTemplate, Object ... args)
  trace(String msgTemplate, Object ... args)

In each one of these methods, if the log is enabled use the JDK MessageFormat class to create the log message and call the corresponding Log method, else do nothing.

Caveat: the last args has to be tested for being a Throwable and if so the corresponding Log method with (String, Throwable) signature.


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899455#action_12899455 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

{code}
int a = ...;
long b = ...;
double c = ...;
LOG.debug("a=" + a + ", b=" + b + ", c=" + c);
{code}
If I understand the slf4j API correctly, slf4j does not really help for the codes above.  It has to create Integer, Long and  Double objects for the primitive data type and an array for vararg.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898018#action_12898018 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

I'd prefer fixing the tools instead of bloating the code.  Some simple heuristics that could work:

* Skip checking all the test code, which is trivial, as they live in different trees.
* Put some marker, say a trailing /// for verified debug calls that are not expensive, so a grep -v '///$' can filter them out etc.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-6884:
-------------------------------------------

    Attachment: FunAgain.java

FunAgain.java: Benchmarks comparing the following calls
- LOG.isDebugEnabled()
- static debug(..) as suggested by Doug
- LOG.debug(..), i.e. do nothing.

Result 1:
{noformat}
java.version = 1.6.0_10
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_10-b33
java.vm.version = 11.0-b15
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp

n=10000000
LOG.isDebugEnabled(): 82 ms
static debug(..)    : 502 ms
LOG.debug(..)       : 11644 ms
{noformat}

Result 2:
{noformat}
java.version = 1.6.0_16
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_16-b01
java.vm.version = 14.2-b01
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Client VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = x86
os.name = Windows XP
os.version = 5.1

n=10000000
LOG.isDebugEnabled(): 172 ms
static debug(..)    : 547 ms
LOG.debug(..)       : 40421 ms
{noformat}

+1 on committing the existing patch.  We may work on the static debug(..) in a separated JIRA.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902985#action_12902985 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

I am reluctant to migrate to SLF4J at the moment for 3 main reasons. 
# Yes, SLF4J does provide a way to log more efficiently, but it also has the regular method {{void debug(String msg)}} so people will just use it as they did before. In my approach I intentionally excluded it from the api.
# Migrating to SLF4J will mean changing logging for all levels in the entire project(s). This issue is targeting only debug messages.
Logging is a big deal. apache.commons worked for us for a long time, will SLF4J - I don't know. There are analytic tools out there relying on the format of the logs, will SLF4J break them? All this needs investigation. Just optimizing debug logging will not affect anything except improving the performance.
# In fact, people in this discussion mentioned that creating extra object[] is a performance problem. I tried to address this issue in my proposal. And I think this makes a big difference compared to SLF4J, because 90% of the calls will end up executing the optimized versions of logDebug().

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903747#action_12903747 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

I've got a feeling that Scott is very strong about moving Hadoop to SLF4J. This clearly does not fall within the scope of this issue as I explained before. But if you would like to invest in this subject you are welcome to create an issue and provide arguments for the transition, may be your experience with SLF4J in Avro, performance comparison, etc.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900533#action_12900533 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

The unanswered question is whether this patch does anything useful for the project.  Does it actually make Hadoop measurably faster?  Shouldn't we demonstrate that for optimizations that increase the code size, a cost/benefit analysis?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902930#action_12902930 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

This is not much different than just using SLF4J, the only difference is that there would be some extra Object[] creation with SLF4J (which is really not a big performance problem, about the same as auto-boxing).  But SLF4J would have a much faster and easier to use formatting implementation, and have performance benefits for INFO and other debug levels.   

Additionally, we can still create a wrappper around that to add signatures to prevent object array creation and/or autoboxing for a limited subset of cases.

In short, why not just use SLF4J's more featured and flexible wrapper and either extend that or push for adding a few extra signatures on that side?  It even has a tool to go through the source code and change stuff from log4j for you.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899091#action_12899091 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

Thanks Scott for a nice [summary|https://issues.apache.org/jira/browse/HADOOP-6884?focusedCommentId=12899084&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12899084].

Actually, #1 is subjective.  I don't think adding an if-statement per LOG.debug(..) make the codes hard to read or maintain.  On the other hand, #2 is very important in production systems.  Source-code/byte-code rewriting mentioned in #3 needs a lot of works.  Why don't we just commit Erik's patch?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-6884:
-------------------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

I have committed this.  Thanks, Erik!

Thanks everyone again.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-6884:
----------------------------------------

          Status: Patch Available  (was: Open)
    Hadoop Flags: [Reviewed]

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903956#action_12903956 ] 

Hudson commented on HADOOP-6884:
--------------------------------

Integrated in Hadoop-Common-trunk #437 (See [https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/437/])
    HADOOP-6884. Add LOG.isDebugEnabled() guard for each LOG.debug(..).  Contributed by Erik Steffl


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899147#action_12899147 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

@Steve & Suresh: slf4j support "{}" holders in the format string (which has a much faster implementation than jdk MessageFormat), so we don't have to create new strings objects with all these + operators and/or implicit toString() calls. One can create a few overloads (for number of args under n, and delegate to vargs a la slf4j (n=3) for larger n) to avoid vargs (hence the new Object array) overhead in common cases. Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899522#action_12899522 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

Nicholas>  I have checked the slf4j Logger API but does find a method like that.

That's right.  I said, such an API "need not be a varargs call, but can a normal, 4-arg method call".  With slf4j it would be varrargs.  Also note, however, that the current patch doesn't contain any log examples with three parameters.

Scott> The autoboxing and varargs is significantly less expensive than string concatenation.
Scott> Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on. That flag becomes the default soon.

Good points, Scott.

This issue lacks benchmarks.  Proposed optimizations should include benchmarks.  The following article has comments that indicate that, as Scott suggests, autoboxing and varargs are pretty fast.  With escape analysis they might be even faster.

http://www.nearinfinity.com/blogs/jeff_kunkle/performance_of_guarding_debug_log.html

If someone adds debug log lines in performance sensitive code, e.g., when calculating CRC32 or somesuch, then explicitly calling isDebugEnabled() would probably be a significant optimization, but such cases are rare, and performance sensitive code should be modified cautiously anyway, since other minor changes can have big effects.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903750#action_12903750 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

Thanks everyone for spending time on this issue.  Again, I do not opposite the other ideas but they are just outside the scope of this issue.  You are very welcome to work on them.

(Erik is running tests on his patch.)

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903534#action_12903534 ] 

Eli Collins commented on HADOOP-6884:
-------------------------------------

Seems like a version of the patch that only added the ifs statements when needed (when the log argument is not a constant string) would be a nice middle ground since it addresses both the performance and code readability concerns, and does not require a lot of effort. I think we should implement Konstantin/Doug's proposal in a separate jira and make forward progress on this one.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903877#action_12903877 ] 

Erik Steffl commented on HADOOP-6884:
-------------------------------------

HADOOP-6884-0.22-1.patch 'ant test' and 'ant test-patch' results (ran manually because Hudson is down).

Results of 'ant test':

BUILD SUCCESSFUL
Total time: 6 minutes 44 seconds

Results of 'ant test-patch':

     [exec] There appear to be 2 release audit warnings before the patch and 2 release audit warnings after applying the patch.
     [exec]
     [exec]
     [exec]
     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 18 new or modified tests.
     [exec]
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]

Javadoc warnings are unrelated to the patch:

     [exec]   [javadoc] Constructing Javadoc information...
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:39: warning: sun.security.jgss.krb5.Krb5Util is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc] import sun.security.jgss.krb5.Krb5Util;
     [exec]   [javadoc]                              ^
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:40: warning: sun.security.krb5.Credentials is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc] import sun.security.krb5.Credentials;
     [exec]   [javadoc]                         ^
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:41: warning: sun.security.krb5.PrincipalName is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc] import sun.security.krb5.PrincipalName;
     [exec]   [javadoc]                         ^
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:31: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc] import sun.security.krb5.Config;
     [exec]   [javadoc]                         ^
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:32: warning: sun.security.krb5.KrbException is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc] import sun.security.krb5.KrbException;
     [exec]   [javadoc]                         ^
     [exec]   [javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:81: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release
     [exec]   [javadoc]   private static Config kerbConf;
     [exec]   [javadoc]                  ^
     [exec]   [javadoc] ExcludePrivateAnnotationsStandardDoclet
     [exec]   [javadoc] Standard Doclet version 1.6.0_06
     [exec]   [javadoc] Building tree for all the packages and classes...
     [exec]   [javadoc] Building index for all the packages and classes...
     [exec]   [javadoc] Building index for all classes...
     [exec]   [javadoc] Generating /home/steffl/work/svn.isDebugEnabled/common-trunk/build/docs/api/stylesheet.css...
     [exec]   [javadoc] 6 warnings


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903473#action_12903473 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

-0.  This patch is not well motivated.  Reasonable alternatives have been proposed.  No urgency has been demonstrated.  No great effort was required in creating this patch: if it goes stale it would be easy to reconstruct.  Further effort to reach a solution acceptable to all before anything is committed seems well warranted.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899102#action_12899102 ] 

Jakob Homan commented on HADOOP-6884:
-------------------------------------

The current patch seems to be a bit of a sledge hammer where a ball-peen might be more appropriate.  However, since Java doesn't give us any way of lazily evaluating the arguments to a method, this check does seem to be a standard way of working around this limitation.  I'm a bit surprised at the debug logs in the test cases.  Doesn't it seem more reasonable to do all logging in tests at an always-on level?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899501#action_12899501 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

Doug, sorry that I misspelled you name.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902994#action_12902994 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

> SLF4J [ ...]  has the regular method void debug(String msg) so people will just use it as they did before.

Not if we have an automated test that looks for string concatenation in log statements.  That's required for your proposal too, no?

> Migrating to SLF4J will mean changing logging for all levels in the entire project(s).

SLF4J wraps log4j, just like commons logging.  So existing log configuration & format should be unaffected.

> creating extra object[] is a performance problem

No performance problems have actually been identified.  Some have been suggested, but none shown to be significant in real operations.

That said, micro-benchmarks show that, were there a performance problem, getting rid of string concatenation would reduce the impact of logging by around 90%.  Removing Object[] allocation would improve it further, but it's not clear, given that even that the string concatenation has not shown to be significant, that this would be a significant improvement.  Plus, the majority of log statements have a single parameter and would not require an Object[] allocation.  So, unless someone can show that Object[] allocation in multi-parameter log statements is slowing things significantly, slf4j would suffice.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898437#action_12898437 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

> code bloat is mostly relevant for people reading the code

Bloat hinders maintainability, a perennial goal.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900373#action_12900373 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

> You've benchmarked a case that isn't in the existing patch (3 variables) and certainly isn't typical. ...

Which patch are you talking about?  I found at least two three-variable cases in Erik's latest patch.

{code}
-        LOG.debug("Exception while invoking " + method.getName()
-                 + " of " + implementation.getClass() + ". Retrying."
-                 + StringUtils.stringifyException(e));
+        if(LOG.isDebugEnabled()) {
+          LOG.debug("Exception while invoking " + method.getName()
+              + " of " + implementation.getClass() + ". Retrying."
+              + StringUtils.stringifyException(e));
+        }
{code}


{code}
-    LOG.debug("for protocol authorization compare (" + clientPrincipal + "): " 
-        + shortName + " with " + user.getShortUserName());
+    if(LOG.isDebugEnabled()) {
+      LOG.debug("for protocol authorization compare (" + clientPrincipal +
+          "): " + shortName + " with " + user.getShortUserName());
{code}

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6884:
---------------------------------

    Attachment: FunAgain.java

You've benchmarked a case that isn't in the existing patch (3 variables) and certainly isn't typical.  Your 'static' case still boxes.

So your benchmark mostly shows that boxing costs, and string concatenation costs even more yet.  I don't see how it shows that any of these costs are significant in the Hadoop codebase.

I've attached a version that benchmarks the more typical single-parameter using slf4j and a static version that avoids boxing.

The output this gives is:

java.version = 1.6.0_20
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_20-b02
java.vm.version = 16.3-b01
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = i386
os.name = Linux
os.version = 2.6.32-24-generic-pae

n=10000000
LOG.isDebugEnabled(): 13 ms
static debug(..)    : 78 ms
LOG.debug(..)       : 472 ms

What does this show?  It shows that simply switching to slf4j using format strings would remove string concatenation costs speeding log statements by around 10x without any bloat.  It does not show whether that improvement would be significant in any larger Hadoop benchmark, but at least the cost in code readability would be null.  Further it shows that, if we find that logging costs are significant somewhere due to boxing, we could optimize that with static methods, gaining another 5x without losing any readability.  Finally, if we find that logging costs are still somewhere significant, we can improve them yet another 6x with some impairment to readability.  This is a bit like manual loop unrolling.  We'll do it in certain performance-critical areas when it's shown to provide a significant advantage, but we shouldn't do it blindly for every loop.


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902917#action_12902917 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

Konstantin, this sounds like it could be a workable approach.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903150#action_12903150 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

What has been shown is that over 80% of the performance problem, if it exists (I believe it does, but no one has actually shown that here or in the sibling tickets), is corrected by removing String concatenation in favor of varargs and Object[] creation.

So therefore Object[] creation is at least 5x faster than string concatenation.  Generally speaking, autoboxing and transient creation of Object[] would be similar in cost, so it is probably closer to 10x faster.

Pushing on the SLF4J project (which is actively developed, unlike Log4j and commons-logging, etc) to add a few method signatures that reduce Object[] creation and autoboxing feels like a better idea to me than re-inventing SLF4J -- which was created in the first place to solve this problem along with a few others.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903912#action_12903912 ] 

Hudson commented on HADOOP-6884:
--------------------------------

Integrated in Hadoop-Common-trunk-Commit #367 (See [https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/367/])
    HADOOP-6884. Add LOG.isDebugEnabled() guard for each LOG.debug(..).  Contributed by Erik Steffl


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899034#action_12899034 ] 

Suresh Srinivas commented on HADOOP-6884:
-----------------------------------------

As I indicated in my previous patch, java varargs is not a good idea. It creates object array to pass as varargs.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903586#action_12903586 ] 

Suresh Srinivas commented on HADOOP-6884:
-----------------------------------------

I had posted my +1 for the patch earlier. But echoing what Eli has said already, we should open another jira and explore some of the good ideas that this jira has generated.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893258#action_12893258 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

The debug log check is critical for performance sensitive paths. Due to lack of macros in Java, log wrappers doesn't really help if additional objects need to be created for the messages (the template stuff helps but not in many cases.)

-1 for yet another home brew log wrappers.
+1 for slf4j.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898426#action_12898426 ] 

Erik Steffl commented on HADOOP-6884:
-------------------------------------

Why not committing a simple patch now and work on a good solution later?

Luke: we all prefer fixing the tools. Consider the timeframe and effort required. In the meantime we can have proposed patch as an intermediate solution. Leaving out test code doesn't change the situation. Adding comments is also code bloat (code bloat is mostly relevant for people reading the code) plus it's not guaranteed to work (isDebugEnabled() is guaranteed to work but people will add a marker more or less randomly without considering price of debug call).

Scott: build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.

I appreciate comments but how about separating a short term quick fix and long term proper solution? At this point it's not clear what a proper solution is. If you want I'll file a jira for long term solution so that we don't forget about it? I mean we probably spent more time discussing this that it take to both create a patch and reverse it (once we have the proper solution).

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899441#action_12899441 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

bq. Of course, the other tactic is to say "no layers, go to log4j" and work with the log4j team on having log4j push out data to chukwa and scribus, feed it back into Hadoop for analysis. Just a thought.

slf4j is log4j's successor, the 'founder' of slf4j is the same person that started log4j.  Like commons logging,  log4j is in maintenance mode.  http://logging.apache.org/log4j/1.2/changes-report.html

One maintenance release in 2.5 years.  Thats good, its stable and simple.  But new features or adapters? Unlikely.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902709#action_12902709 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

h3. This issue

Based on what is said in this jira and discussions with developers I'd like to make the following proposal.
There are 3 main goals the change should be targeting for:
# *debug() statement optimization*: do not calculate the arguments if they are not going to be logged.
# *code readability*: avoid if-debug-enabled statement before logging.
# *new code verification*: provide an automatic procedure or simple rules to verify that the newly introduced code does not introduces that same inefficiencies.

My approach is a variant of Doug's proposal and SLF4J formatting.
# Let's create a new class {{org.apache.hadoop.log.Logger}}
We will have only static methods in the class for now:
{code}
class Logger {
  // Optimized logging
  static void logDebug(Log log, Object a1);
  static void logDebug(Log log, Object a1, Object a2);
  static void logDebug(Log log, Object a1, Object a2, Object a3);
  static void logDebug(Log log, Object a1, Object a2, Object a3, Object a4);

  // not efficient, do not use on performance critical paths
  static void logDebug(Log log, Object... args);
  static void logDebug(Log log, String formatter, Object... args);
  static void logDebug(Log log, Throwable e, Object... args);
}
{code}
# Each statement
{{Log.debug(a1 + a2 + ...)}} 
in current code is replaced by one of the methods above:
{{Logger.logDebug(Log, a1, a2, ...)}}
Looking at our code, about 90% of debug calls will be replaced by the first 4 methods,
which do not compute the arguments and do not instantiate an array object.
Most of the rest debug calls will be converted into a {{logDebug(Log log, Object... args)}} call.
Which is less efficient, but still faster than building a string.
The remaining calls can be transformed into "formatter" version. We should make sure these are not
performance-critical parts of the code, or otherwise rewrite the messages to be able to call one 
of the optimized versions of {{logDebug()}}.
Autoboxing of primitive types should be ok with these less efficient versions of {{logDebug()}}.
# The implementation of {{logDebug()}} should use optimized string building tools:
{code}
  static void logDebug(Log log, Object a1, Object a2) {
    if(!log.isDebugEnabled())
      return;
    StringBuilder str = new StringBuilder().append(a1).append(a2);
    log.debug(str);
  }
{code}
# The JavaDoc for {{logDebug()}} methods should state which methods are optimized, and 
what to avoid when calling them, like primitive type autoboxing, string concatenation, etc.

This will give us all three goals above, up to a point. Efficiency and readability should be obvious. 
The tool to verify should be checking that there are no "naked" calls of ".debug" anywhere but 
in the Logger. And that calls of {{logDebug()}} do not contain "+" operations.

This is not the ideal solution, but it is practical imo.

h3. Future directions

In the future we can turn {{org.apache.hadoop.log.Logger}} into a log wrapper by converting
the static methods into non static, and adding methods logInfo(), logError(), logWarn().
This should optimize logging in Hadoop in general. Non optimal string building in info-level
logging hearts the performance right now as I write it.
The idea is to replace {{org.apache.commons.logging.Log}} with {{org.apache.hadoop.log.Logger}}.
{{logging.Log}} will become a member of the {{Logger}}.
This is a bigger change and should be done in a different jira.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903734#action_12903734 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

I think Doug's request to provide benchmarks is reasonable. We should benchmark more.
So I modified {{NNThroughputBenchmark}}, to benchmark {{getFileStatus()}}. In current trunk {{getFileStatus()}} calls {{Groups.getGroups()}}, which has a LOG.debug() in it. In {{NNThroughputBenchmark}} you can specify a logLevel, so if I run it with {{logLevel = DEBUG}} the debug message from {{Groups.getGroups()}} is printed for every {{getFileStatus()}} call, and no other messages are printed.

I compared two versions of the code with {{logLevel = INFO}}:
# Unmodified {{Groups.getGroups()}}, where the argument for the debug call is calculated as a sum of strings.
# Modified {{Groups.getGroups()}} with {{if(LOG.isDebugEnabled())}} before the logging statement.

Results:
# 10,000 calls of {{getFileStatus()}} in (1) yield 22,905 ops/sec
 10,000 calls of {{getFileStatus()}} in (2) yield 24,610 ops/sec
 This about 7% difference.
# I increased the namespace to 100,000 files and performed that many calls of {{getFileStatus()}} for both setups. I still see 3-4% improvement in case (2).
  The gain is less because it is absorbed by the increased cost of navigating down the namespace tree.
# Did the same with 1,000 files. (2) is better about 12-13%.

{{Groups.getGroups()}} is a very popular method it is a part of user authentication, so this log message is a part of each and every name-node call. In many cases the cost of concatenating strings in it will be absorbed by disk io operations, but it is still good to have this improvement.

I'll post the patch for {{NNThroughputBenchmark}} in another jira shortly.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898462#action_12898462 ] 

Luke Lu commented on HADOOP-6884:
---------------------------------

bq. Leaving out test code doesn't change the situation.

Why? It would make the patches smaller and quicker to review.

bq. build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.

AOP can insert if LOG.isDebugEnabled()) guards at build time, so you can definitely enable debug at runtime.

The problem with the current approach is that you'd have to keep patching people's new code manually.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899485#action_12899485 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

> LOG.debug("a=" + a + ", b=" + b + ", c=" + c);

The way you'd write this is:

LOG.debug("a={} b={} c={}", a, b, c);

This need not be a varargs call, but can a normal, 4-arg method call.

That said, numeric values would get boxed.  In the attached patch there are some calls that pass a single numeric value that would get boxed.  If we're concerned about the performance cost of the boxing in these cases, then we could define methods like:

public static void debug(String format, int i) {  if (log.isDebugEnabled())  log.debug(format, i); }
public static void debug(String format, long l) {  if (log.isDebugEnabled())  log.debug(format, l); }
public static void debug(String format, float f) {  if (log.isDebugEnabled())  log.debug(format, f); }
public static void debug(String format, double d) {  if (log.isDebugEnabled())  log.debug(format, d); }

If benchmarks show this to be a significant optimization, this might argue for creating a Hadoop-specific logging wrapper, as slf4j does not make this optimization.

Here's a random idea: use assertions for logging!  So folks could write:
  assert LOG.debug(....);
where LOG.debug() would need to return 'true'.


    

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903759#action_12903759 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

bq.  I've got a feeling that Scott is very strong about moving Hadoop to SLF4J.

Moreso that if there was going to be a lot of work to build custom wrappers and change logging signatures across the code-base, that SLF4J should be a strong contender since that is what it does.   
If building such a thing is outside the scope here, then guard statements surely work.

In a blue-sky world, SLF4J (or some framework) would modify bytecode to push its guard statement from inside its log method to around it at call sites.  But now I'm just dreaming.

I have had good experience with a proprietary logging wrapper that lowered string concatenation churn caused by method invocations in the past -- many years before SLF4J did it -- and with an additional 200 or so method signatures to avoid autoboxing in 99% of use cases.  A bit ugly, but very effective and we didn't have to teach developers when or why to guard their logging statements.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903003#action_12903003 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

> ...  , given that even that the string concatenation has not shown to be significant, ...

Doug, have you done any benchmark to support the above statement?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Steffl updated HADOOP-6884:
--------------------------------

    Attachment: HADOOP-6884-0.22.patch

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897106#action_12897106 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

+1 the patch looks good.
I'll commit both debug-logging related patches if there are no other opinions.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902943#action_12902943 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

Scott> This is not much different than just using SLF4J [ ... ]

I too would prefer we simply switch to slf4j.  In performance-critical code that requires complex log messages we might still ocasionally use isDebugEnabled(), as we might make other local code optimizations in critical sections.  To be clear, I find Konstantin's approach acceptable, but feel a move to slf4j would be simpler with essentially the same benefits. 

String concatenation is the biggest potential performance problem.  Adding an automated test that detects string concatenation in log statements would be a good addition.  Before we can add that test though, we need a logging API that does not require string concatenation and does not bloat code.  Slf4j and Konstantin's approach both provide one.


> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898461#action_12898461 ] 

Scott Carey commented on HADOOP-6884:
-------------------------------------

bq. Scott: build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.

Thats not what I meant.   I mean that at build time, the source code:
{code}
log.debug("hello " + "world" + "!");
{code}

can be transformed into:
{code}
if (log.isDebugEnabled()) {
  log.debug("hello " + "world" + "!");
}
{code}

The transformation can be at the source code level prior to compilation OR at byte code level post compilation.
Obviously, that is a longer term option.

Do we have any information about how much improved the performance is with the patch here?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899442#action_12899442 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

As Luke implies, slf4j already implements the API style I proposed above:

http://www.slf4j.org/faq.html#logging_performance

As for commons logging, there are a large number of articles suggesting that it should not be revived:

http://articles.qos.ch/thinkAgain.html
http://tapestryjava.blogspot.com/2007/08/so-long-commons-logging-hello-slf4j.html

So, to my thinking, a good approach might be:
 - switch to slf4j
 - replace string concatenation with slf4j format strings
 - add a checkstyle or other test that warns about string concatenation in log expressions
 
> you also need to choose at redist time what your back end will be and include that JAR

Unless we want to break things incompatibly, Hadoop's standard logging backend will remain log4j for the forseeable future.  So slf4j is only being proposed here as a front end API.  To my thinking, the question is whether we should invent our own improved front-end API for log4j that makes it easier to be efficient, or whether we should just use an existing efficient API like slf4j.  I lean towards the latter.



> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903524#action_12903524 ] 

Erik Steffl commented on HADOOP-6884:
-------------------------------------

Doug, I don't see your point - the patch only makes the code consistent, as I mentioned previously lot of debug() calls are already guarded by isDebugEnabled().

If I started a new practice or tried to bring in new coding style/practice I would understand your point. But defending inconsistent code?

Patch might not be nice to look at but:

  - it's best from performance viewpoint

  - even though it adds code still LESS code is executed

  - code is less readable on small scale (individual statement is longer) but more readable on big scale - no surprises (all debug messages are same), easy to maintain (easy to find debug messages that are not guarded thus potential performance problems), it's obvious that debug message is optimal (don't need to investigate whether it's on critical path, whether it's good enough - in some cases the arguments themselves might be expensive, i.e. it's a function call or a new object is created)

  - it's relatively standard practice (i.e. java language doesn't offer much better solution, as seen by this long discussion)

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898007#action_12898007 ] 

Erik Steffl commented on HADOOP-6884:
-------------------------------------

Re https://issues.apache.org/jira/browse/HADOOP-6884?focusedCommentId=12897536&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12897536

Me too however then there's no way to automatically or at least semi-automatically check for the cases where the optimisation would be helpful.

If all debug() logging calls have isDebugEnabled() guard it's easy to check the whole codebase in few minutes (I have a simple script that only produces few false positives that need to be eyeballed). If the guard is used in minority of cases where it's actually useful then verification whether it's used everywhere where it's needed is very expensive.

Possibly better solutions but I don't think they are realistic in short/medium term:

  - use scala and lazy values (arguments for debug() are not constructed unless used)

  - use extensive profiling and see where debug() calls matter (this could be part of automatic builds/tests)

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Erik Steffl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900480#action_12900480 ] 

Erik Steffl commented on HADOOP-6884:
-------------------------------------

To put things in perspective:

  - the practice of using isDebugEnabled() is already common in Hadoop code, this patch just makes it more consistent

  - HDFS 187 in patch, 86 before patch

  - Common 63 in patch, 52 before patch

Note: patch adds isDebugEnabled to tests as well so the counts that matter are even lower.

None of the proposed solutions is solving all the problems and none of them is a short term solution. The only complaint against the patch is that it looks inelegant but from the discussion in this bug (and looking up others' effort to do the same) there's really no elegant solution for this problem in Java.

Claiming that patch adds code is a bit tricky. Yes it adds to total code written but in production runs it means less code being executed and less objects created (in alternative proposals even if the objects are not stringified they are still created). Given that we already have problems with GC that seems like a good idea.

We can still work on a long term better solution, once we have one we would have to convert all the code to the new solution anyway (at least remove already existing isDebugEnabled calls). Patch does not add significant (possibly none) additional work for that case.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903009#action_12903009 ] 

Doug Cutting commented on HADOOP-6884:
--------------------------------------

No, I have not demonstrated that the cost string concatenation in debug existing log messages is insignificant, but neither has anyone else to my knowledge shown that this cost *is* significant.  I'm willing to accept optimizations that don't make the code any larger without proof of their significance, but optimizations that increase code size should first be shown to significantly improve performance, no?

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899464#action_12899464 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6884:
------------------------------------------------

Thanks Luke for clarfying aspectj.

Let's be academic again: one more fantasy idea is to write our own java compiler or JVM.  It definitely will work.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.