
Posted to mapreduce-issues@hadoop.apache.org by "Devaraj K (JIRA)" <ji...@apache.org> on 2011/07/06 16:43:16 UTC

[jira] [Created] (MAPREDUCE-2648) High Availability for JobTracker

High Availability for JobTracker
--------------------------------

                 Key: MAPREDUCE-2648
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2648
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Devaraj K


In a Hadoop cluster, the JobTracker is responsible for managing the life cycle of MapReduce jobs. If the JobTracker fails, the MapReduce service is unavailable until the JobTracker is restarted. We propose an automatic failover solution for the JobTracker to address this single point of failure. It is based on the Leader Election Framework suggested in ZOOKEEPER-1080.

Please refer to attached document.
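
As a rough illustration of the leader-election idea only (not the design in the attached document, and not the ZOOKEEPER-1080 API), the sketch below uses a hypothetical in-memory ElectionRegistry in place of ZooKeeper. It assumes each JobTracker registers as a candidate with an increasing sequence number, much like an ephemeral sequential znode, and that the lowest surviving number is ACTIVE. All class and method names are made up for this example.

```python
import itertools


class ElectionRegistry:
    """Hypothetical in-memory stand-in for a coordination service.

    This is NOT the ZooKeeper client API; it only mimics the behaviour of
    ephemeral sequential nodes for illustration.
    """

    def __init__(self):
        self._seq = itertools.count()
        self._candidates = {}  # sequence number -> candidate name

    def join(self, name):
        """Register a candidate and return its sequence number."""
        seq = next(self._seq)
        self._candidates[seq] = name
        return seq

    def leave(self, seq):
        """Remove a candidate, as if its session had expired."""
        self._candidates.pop(seq, None)

    def leader(self):
        """Return the candidate with the lowest sequence number, or None."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


registry = ElectionRegistry()
jt1 = registry.join("jobtracker-1")  # joins first, so it becomes ACTIVE
jt2 = registry.join("jobtracker-2")  # joins second, so it is STANDBY

active_before = registry.leader()
registry.leave(jt1)                  # simulate failure of the ACTIVE JobTracker
active_after = registry.leader()     # the STANDBY is promoted

print(active_before, "->", active_after)
```

In a real deployment the removal step would be triggered by the coordination service when the failed JobTracker's session times out, so the failover time depends on that timeout plus the time the new ACTIVE needs to start serving requests.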

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469010#comment-13469010 ] 

Alejandro Abdelnur commented on MAPREDUCE-2648:
-----------------------------------------------

Devaraj, Abhijit,

It has been more than a year since your last comments, and a patch was never uploaded. What is the status of this on your end?
                
> High Availability for JobTracker
> --------------------------------
>
>                 Key: MAPREDUCE-2648
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2648
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>         Attachments: High Availability for JobTracker.pdf
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> In a Hadoop cluster, the JobTracker is responsible for managing the life cycle of MapReduce jobs. If the JobTracker fails, the MapReduce service is unavailable until the JobTracker is restarted. We propose an automatic failover solution for the JobTracker to address this single point of failure. It is based on the Leader Election Framework suggested in ZOOKEEPER-1080.
> Please refer to attached document.


[jira] [Commented] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060667#comment-13060667 ] 

Mahadev konar commented on MAPREDUCE-2648:
------------------------------------------

Devaraj,
I am not sure whether you already know about the MR-279 branch (the next version of the MR framework). We've been trying to integrate ZK into the framework from the beginning. For now, we are just doing restart with ZK, but we should soon have an HA solution with ZK.


[jira] [Commented] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061321#comment-13061321 ] 

Devaraj K commented on MAPREDUCE-2648:
--------------------------------------

Small mistake in above comment!!
It should be

Thanks & Regards,
Devaraj & Abhijit


[jira] [Updated] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-2648:
---------------------------------

    Attachment: High Availability for JobTracker.pdf


[jira] [Assigned] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K reassigned MAPREDUCE-2648:
------------------------------------

    Assignee: Devaraj K


[jira] [Commented] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Abhijit Suresh Shingate (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061326#comment-13061326 ] 

Abhijit Suresh Shingate commented on MAPREDUCE-2648:
----------------------------------------------------

To add,

We tested this solution on a 100-node cluster with 1000 jobs. We measured that the STANDBY JobTracker detects the failure of the ACTIVE JobTracker, becomes ACTIVE, and starts serving requests in less than 1 minute. This includes the failure detection time.

It will be useful for organizations that are already using Hadoop in production environments.


[jira] [Commented] (MAPREDUCE-2648) High Availability for JobTracker

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061228#comment-13061228 ] 

Devaraj K commented on MAPREDUCE-2648:
--------------------------------------

Hi Mahadev,

Sorry for the delay in response.

Yes. I am aware of MapRed NextGen.

From my understanding, it might take some time for MapRed NextGen to stabilize and become production ready.

So I was considering the following points.

ZOOKEEPER-1080 provides a very simple, generic solution to support the HA scenario.

This solution tries to incorporate it for JobTracker.


Thanks & Regards,
Abhijit
