Posted to commits@samza.apache.org by "Yan Fang (JIRA)" <ji...@apache.org> on 2014/06/17 08:01:11 UTC

[jira] [Commented] (SAMZA-215) Better logging for interactive command-line tools

    [ https://issues.apache.org/jira/browse/SAMZA-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033465#comment-14033465 ] 

Yan Fang commented on SAMZA-215:
--------------------------------

I can think of two approaches for implementing the suggested solution:
1. Pass a parameter, such as consoleLog, from run-job.sh/kill-yarn-job.sh/checkpoint-tool.sh, and have run-class.sh choose which log configuration to use based on that parameter.
2. Set a log4j properties file variable in each script, such as _export SAMZA_LOG4J_FILE=path_, and have run-class.sh add this variable to JAVA_OPTS.
I am not sure whether there is a better way to do this. If not, I will go with the second approach.
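
To make the second approach concrete, here is a rough sketch of the piece that would live in run-class.sh. It is only an illustration, not a patch: the function name add_log4j_opt and the example path are made up, and I am assuming log4j 1.x, which reads its configuration location from the log4j.configuration system property.

```shell
#!/bin/bash
# Sketch only. SAMZA_LOG4J_FILE is the variable proposed above;
# add_log4j_opt and the example path are hypothetical names.

# An interactive script (run-job.sh, kill-yarn-job.sh, checkpoint-tool.sh)
# would set the variable before calling run-class.sh, for example:
#   export SAMZA_LOG4J_FILE="$base_dir/lib/log4j-console.xml"

# In run-class.sh: append the log4j setting to JAVA_OPTS only when the
# caller set SAMZA_LOG4J_FILE, so run-container.sh and run-am.sh keep
# whatever logging configuration the job itself provides.
add_log4j_opt() {
  if [ -n "${SAMZA_LOG4J_FILE:-}" ]; then
    # ${JAVA_OPTS:+...} avoids a leading space when JAVA_OPTS is empty.
    JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }-Dlog4j.configuration=file:$SAMZA_LOG4J_FILE"
  fi
}
```

run-class.sh would call add_log4j_opt once before it builds the java command line. Scripts that never export the variable are unaffected, which is why I prefer this over passing an extra parameter through every script.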

Also, where should we put the default log4j.xml? Maybe samza-shell?



> Better logging for interactive command-line tools
> -------------------------------------------------
>
>                 Key: SAMZA-215
>                 URL: https://issues.apache.org/jira/browse/SAMZA-215
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Martin Kleppmann
>
> At the moment, if you use run-job.sh, it prints out a very long JVM invocation (which is arguably not very useful for most users) but no information about what has actually happened (e.g. connecting to YARN RM, etc). Where the progress messages get logged to depends on the configuration of the user project using Samza.
> For example, hello-samza supplies {{samza-job-package/src/main/resources/log4j.xml}} which sends the logs to a file called {{deploy/samza/undefined-samza-container-name.log}} by default. That is not a great experience for new users — if the job won't start up, users need to know to look in an obscurely-named log file to see any errors that occurred in run-job.sh (e.g. could not connect to YARN RM).
> It's good that jobs can supply their own configuration for logging within a container. However, for interactive tools like run-job.sh, kill-yarn-job.sh and checkpoint-tool.sh (SAMZA-180) it would be much better if the logs just went to the console (stdout or stderr).
> Suggested solution: we include a default log4j configuration that sends logs to the console, and use it in the interactive shell scripts (e.g. run-job.sh). We don't use it in run-container.sh and run-am.sh, as those should be configured by the job.
> This will be especially relevant when we make binary releases of Samza. A user should be able to download the tgz of a release and immediately use the shell scripts for managing jobs, without having to worry about configuring log4j.


