Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2007/12/07 18:41:43 UTC
[jira] Updated: (PIG-12) Please add timestamps to pig map/reduce progress messages
[ https://issues.apache.org/jira/browse/PIG-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated PIG-12:
--------------------------
Attachment: timestamps.diff
Patch uploaded on behalf of Patrick Hunt. Comments from his email:
Here is the patch, some things to note:
1) -b/--brief gives brief logging, with no timestamps.
2) -4/--log4jconf lets the user specify a properties file that will be "the" log4j configuration (it overrides anything we might do).
3) I tried to keep the semantics of -v and -d the same; see the changes to Main.java. The main difference is that they now apply to the root logger (everyone: pig, hadoop, etc.), rather than just pig as they did previously. You should verify backward compatibility (if you care about such things).
4) Some of the code uses System.out.println (like POMapreduce.java). As a result, these messages won't have timestamps. You may or may not want to clean this up (35 matches in the src hierarchy).
Patrick
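For readers unfamiliar with log4j 1.x, a properties file of the kind that could be passed via -4/--log4jconf might look like the sketch below. It is not taken from timestamps.diff; the appender name "console", the log level, and the conversion patterns are assumptions chosen for illustration.

```properties
# Hypothetical log4j 1.x configuration; not part of the attached patch.
# The level is set on the root logger so that pig, hadoop, etc. are all covered.
log4j.rootLogger=INFO, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# %d prepends a timestamp to every message.
log4j.appender.console.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
# A "brief" layout would simply omit %d, for example: %-5p %m%n
```

Messages printed directly with System.out.println bypass log4j entirely, which is why a configuration like this cannot add timestamps to them.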
> Please add timestamps to pig map/reduce progress messages
> ---------------------------------------------------------
>
> Key: PIG-12
> URL: https://issues.apache.org/jira/browse/PIG-12
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Reporter: Olga Natkovich
> Attachments: timestamps.diff
>
>
> From one of the users:
> ------------------------------
> I'm spending a lot of time trying to optimize my pig queries for short
> run-times. This process would be much easier if, in the progress output
> from pig (currently on stdout, but hopefully soon moving to stderr?!), the
> initiation and completion of each map/reduce job could be timestamped. Pig
> already spits out messages of the form "----- MapReduce Job -----",
> "Input: ...", "Combine: ...", etc; could you just add a "Timestamp: ..."
> field as well? Or ideally, both "Starting timestamp: ..." and
> "Finishing timestamp ...".
> Additional comments from another user:
> ------------------------------------------------------
> I'm adding my vote for this as well.
> I'd like to know the timestamp and "running time" in seconds or D:H:M:S:
> Thu Oct 25 10:06:01 GMT 2007 (0:00:12:56): 56% done
> Starting and stopping timestamps in the log would also be valuable.
> Unfortunately, there's no "workaround" such as putting a date command before and after the pig command in logging:
> queuing times can range from seconds to hours and completely mess up any notion of job execution time.
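The progress line requested above (wall-clock timestamp plus running time as D:H:M:S) could be produced along the following lines. This is a minimal sketch using only java.time, not code from Pig or from the attached patch; the class name ProgressStamp and its methods are hypothetical, and the start time would have to be captured when the job actually begins executing rather than when it is queued.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Pig.
final class ProgressStamp {
    // Mirrors the style of the example line in the request, e.g. "Thu Oct 25 10:06:01 GMT 2007".
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss 'GMT' yyyy", Locale.ENGLISH);

    // Formats a duration as D:HH:MM:SS.
    static String elapsed(Duration d) {
        long s = d.getSeconds();
        return String.format(Locale.ROOT, "%d:%02d:%02d:%02d",
            s / 86400, (s % 86400) / 3600, (s % 3600) / 60, s % 60);
    }

    // Builds one progress line from the job start time, the current time, and percent done.
    static String line(Instant start, Instant now, int percent) {
        return String.format(Locale.ROOT, "%s (%s): %d%% done",
            STAMP.format(now.atZone(ZoneOffset.UTC)),
            elapsed(Duration.between(start, now)),
            percent);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2007-10-25T10:06:01Z");
        Instant start = now.minusSeconds(12 * 60 + 56);
        // Progress output goes to stderr, as the request suggests.
        System.err.println(line(start, now, 56));
    }
}
```

Because the elapsed time is measured from the recorded start of execution, it is not distorted by queuing delays the way an external date command would be.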