Posted to common-dev@hadoop.apache.org by "Amir Youssefi (JIRA)" <ji...@apache.org> on 2008/08/15 02:39:44 UTC

[jira] Commented: (HADOOP-3956) map-reduce doctor (Mr Doctor)

    [ https://issues.apache.org/jira/browse/HADOOP-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622760#action_12622760 ] 

Amir Youssefi commented on HADOOP-3956:
---------------------------------------

Proposed Solutions:

Hopefully, Hadoop will eventually develop dynamic configuration capabilities. Given the complexity of the issue, it may take a long time to get there. 

Meanwhile, we can attack this problem from different angles or levels: 

 1) Having metrics: Provide understandable metrics on the Web UI to raise user awareness. We can expand the counters web page (or add another page) to show more understandable and actionable metrics (e.g. a cluster utilization number) and more flow diagrams.
 2) Detecting issues: Have an agent interpret logs, then highlight an issue or trigger a process. Example: a rule-based agent loads a set of extensible rules and follows Hadoop logs. When a rule applies, it creates a message/highlight in the UI or triggers a separate process.
 3) Notification: The user gets a notification (e.g. email) from a process triggered by the rule-based agent above. This way, the user doesn't need to be pinned to a monitor watching the web UI all the time.
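
To make item 2 concrete, here is a minimal sketch of how a rule-based agent might match rules against log lines. Everything in it is an assumption for illustration: the Rule class, the diagnose function, the rule names, the regular expressions and the advice strings are all hypothetical, and the patterns do not reflect actual Hadoop log formats.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Rule:
    """One diagnostic rule: a log pattern and the advice it triggers."""
    name: str
    pattern: str  # regular expression tested against each log line
    advice: str   # message to highlight in a UI or send as a notification


# Hypothetical rules; real rules would be loaded from an extensible rule set.
EXAMPLE_RULES = [
    Rule("task-timeout", r"timed out", "Consider raising the task timeout."),
    Rule("blacklisted-tracker", r"blacklisted",
         "Check the health of the listed task-tracker."),
]


def diagnose(lines: Iterable[str], rules: List[Rule]) -> Iterator[str]:
    """Yield one message for every (line, rule) pair that matches."""
    compiled = [(rule, re.compile(rule.pattern, re.IGNORECASE)) for rule in rules]
    for line in lines:
        for rule, regex in compiled:
            if regex.search(line):
                yield f"[{rule.name}] {rule.advice}"
```

The messages yielded by diagnose could feed either the UI highlight of item 2 or the notification process of item 3; new rules are added by extending the list, without touching the matching loop.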

The focus of this JIRA is the development of the rule-based agent in item 2 above, which we call Mr Doctor (map-reduce doctor, aka Hadoop Doctor). It simply processes Hadoop logs and will be part of contrib. Mr Doctor will provide recommendations/prescriptions while following the live log of a running process or reading postmortem logs. 
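
The two input modes mentioned here (a live log versus a postmortem log) could be sketched as follows. This is only an illustration under assumptions: the function names read_postmortem and poll_new_lines are hypothetical, and the sketch assumes a newline-terminated text log that is only ever appended to.

```python
from typing import IO, List


def read_postmortem(handle: IO[str]) -> List[str]:
    """Read a finished log from start to end."""
    return [line.rstrip("\n") for line in handle]


def poll_new_lines(handle: IO[str]) -> List[str]:
    """Return the complete lines appended since the previous call.

    Call this periodically on a log that a running process is still writing.
    """
    lines = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line.endswith("\n"):
            # Either no new data or a partially written line:
            # rewind so the next call re-reads it once it is complete.
            handle.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines
```

A live-mode agent would call poll_new_lines in a loop with a short sleep between calls, while postmortem mode needs a single call to read_postmortem; both produce plain lists of lines, so the same rules can be applied in either mode.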

> map-reduce doctor (Mr Doctor)
> -----------------------------
>
>                 Key: HADOOP-3956
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3956
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Amir Youssefi
>
> Problem Description: 
>  Users typically submit jobs with sub-optimal parameters, resulting in under-utilization, black-listed task-trackers, time-outs, re-tries, etc.
>  The issue can be mitigated by submitting the job with custom Hadoop parameters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.