Posted to common-issues@hadoop.apache.org by "Ari Rabkin (JIRA)" <ji...@apache.org> on 2011/08/23 22:06:29 UTC

[jira] [Created] (HADOOP-7573) hadoop should log configuration reads

hadoop should log configuration reads
-------------------------------------

                 Key: HADOOP-7573
                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
             Project: Hadoop Common
          Issue Type: Improvement
          Components: conf
    Affects Versions: 0.20.203.0
            Reporter: Ari Rabkin
            Assignee: Ari Rabkin
            Priority: Minor
         Attachments: HADOOP-7573.patch

For debugging, it would often be valuable to know which configuration options were actually read out of the Configuration into the rest of the program, since an option that was never read cannot have caused a problem. This patch logs each option the first time it is read.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104389#comment-13104389 ] 

Steve Loughran commented on HADOOP-7573:
----------------------------------------

I like the idea, but not this initial implementation:
# Synchronising everything adds a bottleneck. You could check for presence outside the synchronized block and then do the synchronized add/trace iff that first {{contains()}} operation fails.
# Having a static method stops subclasses doing useful stuff. I think I'm the only person to have done the subclassing other than the ubiquitous JobConf. Subclassing would let you write the tests that this current patch lacks.
# What is your test plan?
# The condition for logging is {{isInfoEnabled()}}, but then log.trace() is called; the check should be downgraded to isTraceEnabled().
# The optsRead hash set ought to be final, as should OPT_READ_LOG.
# This would be a 0.23+ feature.
If you can send us a draft of the paper to show the value, then we could look at it some more. I think it could be useful.
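
The locking, log-level and final-field points in this comment could be sketched roughly as follows. This is not the attached patch: the class ReadTracker, the method recordRead and the logger name are hypothetical, java.util.logging stands in for Hadoop's logging (with FINEST playing the role of trace), and a concurrent set is assumed in place of the plain hash set so that the unsynchronized presence check is itself safe.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; not the code from HADOOP-7573.patch. */
final class ReadTracker {
  // Both fields are final, as the review suggests.
  static final Logger OPT_READ_LOG = Logger.getLogger("conf.reads.sketch");
  private static final Set<String> optsRead = ConcurrentHashMap.newKeySet();

  private ReadTracker() {}

  /**
   * When trace-level (FINEST) logging is enabled, logs the option the
   * first time it is read. Returns true only if this call logged it.
   */
  static boolean recordRead(String name, String value) {
    // Guard on the same level that is actually used for logging.
    if (!OPT_READ_LOG.isLoggable(Level.FINEST)) {
      return false;
    }
    // Cheap presence check first: repeat reads return without contention.
    if (optsRead.contains(name)) {
      return false;
    }
    // add() is atomic on a concurrent set, so exactly one thread wins.
    if (optsRead.add(name)) {
      OPT_READ_LOG.log(Level.FINEST, "first read of {0} = {1}",
          new Object[] {name, value});
      return true;
    }
    return false;
  }
}
```

A getter would call ReadTracker.recordRead(name, value) before returning. With a plain HashSet the add would need a synchronized block, and the unsynchronized contains() would technically be a data race, which is why the sketch swaps in a concurrent set.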


> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.20.206.0
>
>         Attachments: HADOOP-7573.patch, HADOOP-7573.patch, HADOOP-7573.patch, HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089715#comment-13089715 ] 

Ari Rabkin commented on HADOOP-7573:
------------------------------------

I've done some measurements that suggest that for configuration errors that cause failures at startup, the most-recently-read option is the one most likely at fault. Hence, this patch could help users diagnose configuration errors.

 (This work will be presented at the IEEE/ACM Conference On Automated Software Engineering in November).


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104392#comment-13104392 ] 

Steve Loughran commented on HADOOP-7573:
----------------------------------------

One more thing: whatever the outcome, this should not be a substitute for improving error messages. The user base of Hadoop spreads from people who set their clusters up in a private /16 subnet with caching DNS servers on every machine to prevent DNS overload and datacentre-grade CM tooling, all the way down to people with a laptop who don't know what a TCP Connection Refused error means.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096196#comment-13096196 ] 

Hadoop QA commented on HADOOP-7573:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492787/HADOOP-7573.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/126//console

This message is automatically generated.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089780#comment-13089780 ] 

Hadoop QA commented on HADOOP-7573:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12491391/HADOOP-7573.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/70//console

This message is automatically generated.


        

[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Attachment: HADOOP-7573.patch

Revised as per Todd. Priority is trace, and there's a commented-out line in log4j.properties to enable this. 

I notice that hardly anything else in Configuration is logged as debug, and nothing else is trace. Perhaps ditch the second Logger, and just log through base logger? 

I'd rather have these messages ON by default: Hadoop should be easy to debug and troubleshoot out of the box. But apparently this isn't the consensus view?


        

[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Attachment: HADOOP-7573.patch

Re-generated patch with --no-prefix.
This patch should be used against 0.20.2-, not against trunk.  I changed the fix-version; will that cause the QA system to test against the appropriate version?


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104390#comment-13104390 ] 

Steve Loughran commented on HADOOP-7573:
----------------------------------------

Ignore my comment about the static; I see why it is there. For testing you'd therefore need to get access to that log and flip it into trace mode, then grab the log output. Tricky but not impossible.
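
One way such a test could get at the log is sketched below. It assumes java.util.logging as a stand-in for whichever logging backend Hadoop is configured with; the LogCapture class is a hypothetical test helper, not part of Hadoop or of the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical test helper: collects messages sent to one logger. */
final class LogCapture extends Handler implements AutoCloseable {
  private final Logger logger;
  private final Level oldLevel;
  private final boolean oldUseParent;
  private final List<String> messages = new ArrayList<>();

  /** Switches the logger to the given level and starts capturing. */
  LogCapture(Logger logger, Level level) {
    this.logger = logger;
    this.oldLevel = logger.getLevel();
    this.oldUseParent = logger.getUseParentHandlers();
    logger.setLevel(level);
    logger.setUseParentHandlers(false); // keep test output quiet
    logger.addHandler(this);
  }

  @Override
  public synchronized void publish(LogRecord record) {
    messages.add(record.getMessage());
  }

  @Override
  public void flush() {}

  /** Detaches the handler and restores the logger's previous settings. */
  @Override
  public void close() {
    logger.removeHandler(this);
    logger.setLevel(oldLevel);
    logger.setUseParentHandlers(oldUseParent);
  }

  /** Returns a copy of the raw (unformatted) messages captured so far. */
  synchronized List<String> messages() {
    return new ArrayList<>(messages);
  }
}
```

A test would open a LogCapture on the option-read logger in a try-with-resources block, read a few options, and then assert that each option name appears exactly once in messages().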


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097711#comment-13097711 ] 

Chris Douglas commented on HADOOP-7573:
---------------------------------------

I'm \-0 on the patch. I have never wished for this hook, but it's out of the way and I won't block it if another's experience is otherwise.

bq. I'd rather have these messages ON by default: Hadoop should be easy to debug and troubleshoot out of the box. But apparently this isn't the consensus view?

Without presuming to speak for Todd, I think the consensus view is that this change produces significantly more noise than signal. All accurate debug output is helpful for troubleshooting, but reasonable people will differ in their tolerance for spurious messages.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103850#comment-13103850 ] 

Ari Rabkin commented on HADOOP-7573:
------------------------------------

Hi all. Checking back on this. Are there more changes I should make? I had thought I'd addressed everything...


        

[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Attachment: HADOOP-7573.patch

-1 overall.  

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.




        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089717#comment-13089717 ] 

Hadoop QA commented on HADOOP-7573:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12491384/HADOOP-7573.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/68//console

This message is automatically generated.


        

[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Attachment: HADOOP-7573.patch

Patch is against git branch "branch-0.20-security"


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090524#comment-13090524 ] 

Hadoop QA commented on HADOOP-7573:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12491554/HADOOP-7573.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/75//console

This message is automatically generated.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089779#comment-13089779 ] 

Aaron T. Myers commented on HADOOP-7573:
----------------------------------------

bq. I changed the fix-version; will that cause the QA system to test against the appropriate version?

Nope, there's currently no way of doing that. See this JIRA for more details: HADOOP-7435

You should manually run the test-patch.sh script against the correct branch on your machine and paste the output in a comment on this JIRA.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Matei Zaharia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090541#comment-13090541 ] 

Matei Zaharia commented on HADOOP-7573:
---------------------------------------

I'm not sure why this would be considered unhelpful even for users. As a user, I've often had cases when I saw a weird exception due to a URL being wrong or something like that. Seeing the name and value of a config option before the crash can help with narrowing in on it. From a support point of view, having a dump of the config that was in use when a problem occurred is of course invaluable, because users might change stuff around before they report an issue.


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089900#comment-13089900 ] 

Ari Rabkin commented on HADOOP-7573:
------------------------------------

The intended audience for these log messages is neither users nor developers -- it's supporters who are brought in once a problem appears. That means it'll be experts able to interpret the messages, but who won't have the opportunity to add debug statements.  

Having a separate logger makes sense; I will do that.

The per-instanceness is a problem; thanks for highlighting that. Open to suggestions here. Is it crazy to make it static? Once-per-program is the desired functionality. But I'd like to have it on-by-default; assuming once per program, the volume of messages will be low. And reproducing problems is a significant source of time and trouble, so getting the right debug info the first time is important.
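
The difference between per-instance and once-per-program tracking can be seen in a toy model. The two classes below are hypothetical stand-ins, not Hadoop's Configuration; each appends to a caller-supplied list wherever a "first read" message would be logged.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model: every instance remembers its own reads. */
final class PerInstanceTracker {
  private final Set<String> seen = new HashSet<>();
  private final List<String> log;

  PerInstanceTracker(List<String> log) {
    this.log = log;
  }

  synchronized void read(String key) {
    if (seen.add(key)) {
      log.add(key); // one message per key, per instance
    }
  }
}

/** Toy model: all instances share one set for the whole JVM. */
final class StaticTracker {
  private static final Set<String> SEEN = new HashSet<>();
  private final List<String> log;

  StaticTracker(List<String> log) {
    this.log = log;
  }

  void read(String key) {
    synchronized (SEEN) {
      if (SEEN.add(key)) {
        log.add(key); // one message per key, per program
      }
    }
  }
}
```

If three PerInstanceTracker objects each read the same key, three messages are recorded; three StaticTracker objects record one. That matches the once-per-program behaviour asked for here, at the cost of shared static state.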


        

[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090469#comment-13090469 ] 

Chris Douglas commented on HADOOP-7573:
---------------------------------------

Experts should be able to diagnose a misconfiguration without this aid, or they wouldn't be experts. But your point about reproducing the configuration that produced the problem is salient. Since the Apache release isn't oriented to a support org, I'd lean toward disabling/leaving it at DEBUG/TRACE. A support org can enable it as policy or request a run with this enabled.

Without auditing its use in the servers it's hard to say whether the per-instanceness is actually wrong. Particularly in TaskTrackers, it's common practice to set "final" on stuff the user config shouldn't change. While the TT/user configs are reasonably sorted out internally, it'd be useful to know if a job wrote over a value that should be immutable. Or if something is immutable that shouldn't be.

But to be honest, I remain skeptical. "Debugging" is an impossibly pervasive use case justifying infinitely many hooks, but installing a println in the configuration system to cast the widest net is awfully inexact.

> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.20.206.0
>
>         Attachments: HADOOP-7573.patch, HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-7573:
------------------------------------

    Target Version/s: 1.1.0
       Fix Version/s:     (was: 1.1.0)
    
> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>         Attachments: HADOOP-7573.patch, HADOOP-7573.patch, HADOOP-7573.patch, HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Fix Version/s: 0.20.206.0

> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.20.206.0
>
>         Attachments: HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095003#comment-13095003 ] 

Todd Lipcon commented on HADOOP-7573:
-------------------------------------

I agree that this should be at TRACE level. You'd probably want to add a commented-out entry for it to log4j.properties, so that it is easy to switch on.

Specific patch problems:
- the logger suffix should be ".options_read" - otherwise we get Configurationoptions_read as the logger category
- The HashSet is unsynchronized - multiple writers to an unsynchronized HashSet can cause runtime exceptions or infinite loops
- why are you importing NetUtils?
- extra indentation in the javadoc text
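
A minimal sketch of the first two points, not the code from the attached patch. The class name {{ReadTracker}} and the full category string are assumptions made for illustration, and {{java.util.logging}} stands in for Hadoop's logging so the example stays self-contained. It shows a concurrent set replacing the bare HashSet, and the leading "." that keeps the category name from running together.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the read-tracking part of Configuration. */
class ReadTracker {
    // Assumed category name. Without the leading "." on the suffix, the two
    // parts would fuse into "...Configurationoptions_read".
    static final String CATEGORY =
        "org.apache.hadoop.conf.Configuration" + ".options_read";

    private static final Logger LOG = Logger.getLogger(CATEGORY);

    // Concurrent set: safe for many threads adding at once, unlike a bare HashSet.
    private final Set<String> optionsRead = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first read of the given option name. */
    boolean reportOptionRead(String name) {
        boolean first = optionsRead.add(name); // atomic check-and-insert
        if (first) {
            // FINEST is the java.util.logging level closest to TRACE.
            LOG.log(Level.FINEST, "read option {0}", name);
        }
        return first;
    }
}
```

Because {{Set.add}} on the concurrent set reports whether the element was newly inserted, no separate {{contains()}} check or synchronized block is needed in this sketch.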

> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.20.206.0
>
>         Attachments: HADOOP-7573.patch, HADOOP-7573.patch, HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


[jira] [Commented] (HADOOP-7573) hadoop should log configuration reads

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089802#comment-13089802 ] 

Chris Douglas commented on HADOOP-7573:
---------------------------------------

I'm not sure this is going to be helpful given common usage of that class... the patch tracks the first time each _instance_ of a {{Configuration}} accesses a property; these instances are often duplicated from a template, etc.

At a minimum, adding a separate logger and/or decreasing the level to DEBUG or even TRACE, then guarding the call to {{reportOptionRead}} behind {{isDebugEnabled}}, would decrease the noise in the servers and user Tasks.
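
The guard described above could look roughly like the following. This is a sketch under stated assumptions, not the patch itself: {{GuardedReads}}, the logger name "conf.options_read", the {{hookCalls}} counter, and the Map-backed {{get}} are all hypothetical, and {{java.util.logging}} is used so the example is self-contained, with {{isLoggable(Level.FINEST)}} playing the role of {{isDebugEnabled}}/{{isTraceEnabled}}.

```java
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of a guarded read hook. */
class GuardedReads {
    // Separate logger, so read tracing can be enabled without changing other output.
    static final Logger READ_LOG = Logger.getLogger("conf.options_read");

    // Demo-only counter (not thread-safe) showing how often the hook actually ran.
    static int hookCalls = 0;

    static void reportOptionRead(String name) {
        hookCalls++;
        READ_LOG.log(Level.FINEST, "read option {0}", name);
    }

    /** Looks up a property; the tracking hook runs only when tracing is enabled. */
    static String get(Map<String, String> props, String name) {
        if (READ_LOG.isLoggable(Level.FINEST)) {
            reportOptionRead(name);
        }
        return props.get(name);
    }
}
```

With the logger left at a coarser level, {{get}} skips the hook entirely, so the normal read path pays only for the level check.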

I'm skeptical that this change should be embedded so deeply in the framework, even with a switch disabling it. It will be unintelligible to most users, and developers roll their own {{println}} debugging...

> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.20.206.0
>
>         Attachments: HADOOP-7573.patch, HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.


[jira] [Updated] (HADOOP-7573) hadoop should log configuration reads

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated HADOOP-7573:
-------------------------------

    Status: Patch Available  (was: Open)

No new unit tests; patch only affects logging and does not cause visible functional changes. 

> hadoop should log configuration reads
> -------------------------------------
>
>                 Key: HADOOP-7573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7573
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.203.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>         Attachments: HADOOP-7573.patch
>
>
> For debugging, it would often be valuable to know which configuration options ever got read out of the Configuration into the rest of the program -- an unread option didn't cause a problem. This patch logs the first time each option is read.
