Posted to mapreduce-issues@hadoop.apache.org by "Hong Tang (JIRA)" <ji...@apache.org> on 2009/09/10 00:41:57 UTC

[jira] Commented: (MAPREDUCE-966) Rumen interface improvement

    [ https://issues.apache.org/jira/browse/MAPREDUCE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753309#action_12753309 ] 

Hong Tang commented on MAPREDUCE-966:
-------------------------------------

Proposed changes:
- Isolate tools that depend on Rumen behind three simple interfaces:
## JobStory (describing a MapReduce job).
## ClusterStory (describing the cluster setup, topology, etc.).
## JobStoryProducer (producing a sequence of jobs).
 
Accordingly, ZombieJob adapts a LoggedJob to JobStory, and ZombieCluster adapts a LoggedNetworkTopology to ClusterStory (indirectly, through AbstractClusterStory). Finally, ZombieJobProducer reads Rumen traces and produces a sequence of JobStory instances.
- Encapsulate the JSON parsing logic within Rumen and remove the Parser class. Two reader classes, JobTraceReader and ClusterTopologyReader, are added to parse JSON-encoded LoggedJob and LoggedNetworkTopology objects respectively. The interface no longer throws JSON-specific exceptions.
- Improve sanity checking in ZombieJob, and fill in made-up data when the source data are missing or invalid.
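
To make the adapter arrangement above concrete, here is a minimal sketch in Java. It is not the actual Rumen code: the method names (getName, getNumberMaps, getNextJob), the two LoggedJob fields, and the default values are assumptions chosen for illustration, and the producer takes an in-memory list where the real one would read a trace through JobTraceReader. ClusterStory is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, heavily simplified version of the job-facing interface.
interface JobStory {
    String getName();
    int getNumberMaps();
}

// Hypothetical producer interface: returns null when the trace is exhausted.
interface JobStoryProducer {
    JobStory getNextJob();
}

// Stand-in for a parsed trace record; fields may be missing or invalid.
final class LoggedJob {
    final String jobName; // may be null if the log lacked a name
    final int totalMaps;  // may be negative if the log was corrupt

    LoggedJob(String jobName, int totalMaps) {
        this.jobName = jobName;
        this.totalMaps = totalMaps;
    }
}

// Adapter from LoggedJob to JobStory, with sanity checks and made-up defaults.
final class ZombieJob implements JobStory {
    private static final String DEFAULT_NAME = "(name unknown)"; // assumed default
    private final LoggedJob job;

    ZombieJob(LoggedJob job) {
        this.job = job;
    }

    @Override
    public String getName() {
        return job.jobName == null ? DEFAULT_NAME : job.jobName;
    }

    @Override
    public int getNumberMaps() {
        // Clamp invalid (negative) counts to zero instead of propagating them.
        return Math.max(job.totalMaps, 0);
    }
}

// Produces JobStory instances from an in-memory trace (assumption: the real
// producer would obtain LoggedJob records from a trace reader instead).
final class ZombieJobProducer implements JobStoryProducer {
    private final Iterator<LoggedJob> records;

    ZombieJobProducer(List<LoggedJob> trace) {
        this.records = trace.iterator();
    }

    @Override
    public JobStory getNextJob() {
        return records.hasNext() ? new ZombieJob(records.next()) : null;
    }
}

public class RumenSketch {
    // A client tool that sees only the interfaces, never LoggedJob or JSON.
    static List<String> describeAll(JobStoryProducer producer) {
        List<String> out = new ArrayList<>();
        for (JobStory j = producer.getNextJob(); j != null; j = producer.getNextJob()) {
            out.add(j.getName() + ":" + j.getNumberMaps());
        }
        return out;
    }

    public static void main(String[] args) {
        List<LoggedJob> trace = Arrays.asList(
            new LoggedJob("wordcount", 4),
            new LoggedJob(null, -1)); // missing name, invalid map count
        System.out.println(describeAll(new ZombieJobProducer(trace)));
    }
}
```

The point of the sketch is that describeAll depends only on JobStory and JobStoryProducer, so the trace format and the parsing code can change without touching the client tool.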


> Rumen interface improvement
> ---------------------------
>
>                 Key: MAPREDUCE-966
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-966
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Hong Tang
>            Assignee: Hong Tang
>
> Rumen could expose a cleaner interface to simplify the integration with other tools.
