Posted to commits@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/09/08 06:31:09 UTC

[jira] [Commented] (OOZIE-118) GH-83: Integrate Pig stats into oozie.

    [ https://issues.apache.org/jira/browse/OOZIE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099814#comment-13099814 ] 

Hadoop QA commented on OOZIE-118:
---------------------------------

brookwc remarked:
This stats support is only available starting from Pig 0.8. 

To get this stats information, we need to start the Pig script through PigRunner.run(String args[], ProgressNotificationListener listener). It returns a PigStats object that gives us access to the job hierarchy, the Hadoop counters from each job, and so on. We can optionally implement ProgressNotificationListener if we want to watch the job as it progresses; the listener is notified as the individual component jobs start and finish. However, listener support will be a future task for Oozie.

We need to support both Pig 0.8 and earlier versions. One approach we are considering is to determine Pig's version at run time and then invoke the matching entry point.
More specifically, if the version is 0.8 or later, we invoke PigRunner.run(); otherwise we invoke Main.main().
The following snippet is one way to determine Pig's version.

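    // Needs java.util.Map and java.util.jar.{JarFile, Manifest, Attributes},
    // plus Pig's JarManager and ScriptState classes.
    // Locate the jar that provides ScriptState and read the version recorded
    // in its manifest. pigVersion keeps its previous value on any failure.
    String pigJarPath = JarManager.findContainingJar(ScriptState.class);
    try {
        JarFile jar = new JarFile(pigJarPath);
        try {
            Manifest manifest = jar.getManifest();
            Map<String, Attributes> entries = manifest.getEntries();
            Attributes attr = entries.get("org/apache/pig");
            pigVersion = attr.getValue("Implementation-Version");
        } finally {
            jar.close(); // release the file handle even if the lookup fails
        }
    } catch (Exception e) {
        LOG.warn("Unable to read Pig's manifest file", e);
    }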
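
To make the version-based dispatch concrete, here is a minimal, self-contained sketch of the decision step. It does not depend on Pig: the class name PigVersionCheck, its helper methods, and the sample manifest text are hypothetical and exist only for illustration. It assumes the version string has the form "major.minor[...]" and that an unknown or unparsable version should fall back to the Main.main() path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class PigVersionCheck {

    /** Parses manifest text held in memory (hypothetical helper for this sketch). */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Implementation-Version of a named manifest section, or null if absent. */
    static String readVersion(Manifest manifest, String section) {
        Attributes attrs = manifest.getEntries().get(section);
        return attrs == null ? null : attrs.getValue("Implementation-Version");
    }

    /** True if the version is 0.8 or later; unknown or unparsable versions yield false. */
    static boolean supportsPigRunner(String version) {
        if (version == null) {
            return false;
        }
        String[] parts = version.trim().split("[.\\-]");
        if (parts.length < 2) {
            return false;
        }
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            // Compare numerically so that 0.10 sorts after 0.8.
            return major > 0 || minor >= 8;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample manifest text, made up for this example.
        String text = "Manifest-Version: 1.0\n\n"
                + "Name: org/apache/pig\n"
                + "Implementation-Version: 0.8.0\n\n";
        String version = readVersion(parse(text), "org/apache/pig");
        String entryPoint = supportsPigRunner(version) ? "PigRunner.run()" : "Main.main()";
        System.out.println("Pig version " + version + ": would invoke " + entryPoint);
    }
}
```

The components are compared as integers rather than as strings because a plain string comparison would rank "0.10" below "0.8". Treating a missing version as "older than 0.8" keeps the launcher on the Main.main() path whenever the manifest cannot be read.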

Another open question is how to expose this stats information to Oozie jobs: through a log file, standard output, or some other channel?

Let us know your comments and thoughts.

Thanks.

> GH-83: Integrate Pig stats into oozie.
> --------------------------------------
>
>                 Key: OOZIE-118
>                 URL: https://issues.apache.org/jira/browse/OOZIE-118
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Hadoop QA
>
> There is a new feature in Pig that returns statistics about Hadoop jobs to Oozie. Oozie needs to make a subset of these stats available to Oozie users.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira