Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/09/06 01:39:28 UTC

[jira] [Commented] (SAMZA-406) Hot standby containers

    [ https://issues.apache.org/jira/browse/SAMZA-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123818#comment-14123818 ] 

Chris Riccomini commented on SAMZA-406:
---------------------------------------

This design seems sound to me, at a high level.

A related use case that might overlap somewhat with this one is zero-downtime deployment (of either binary or config changes). The same jobs that want to avoid extended downtime during changelog restoration when a container fails will also prefer to avoid it when a new version of their Samza job (or its config) is deployed. This would involve:

# Shutting down the standby.
# Starting it back up again with the new configuration for the job (with a new yarn.package.path if binaries were upgraded; see the config sketch after this list).
# Making the standby the leader.
# Shutting down the new standby (formerly the active container).
# Bringing the new standby back up again with the new configuration (or binary) for the job.
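
For step 2, a minimal sketch of the kind of config change involved (the job name and package path below are invented for illustration; yarn.package.path is the existing config key used to locate the job package on YARN):

    # Hypothetical job properties for the restarted standby in step 2.
    # Only the package path would change when the binaries are upgraded.
    job.name=my-stateful-job
    yarn.package.path=hdfs://namenode/samza/my-stateful-job-1.2.0-dist.tar.gz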

Following these steps, Samza would be able to support both zero-downtime job upgrades as well as zero-downtime failover when a SamzaContainer fails.

I think the zero-downtime deployment would also involve some of the changes discussed in SAMZA-348, since we'd need a way to dynamically change a config at runtime, and reactively have Samza's YARN AM reconfigure the containers.

> Hot standby containers
> ----------------------
>
>                 Key: SAMZA-406
>                 URL: https://issues.apache.org/jira/browse/SAMZA-406
>             Project: Samza
>          Issue Type: New Feature
>          Components: container
>            Reporter: Martin Kleppmann
>
> If a container dies, Samza currently suspends processing of the input stream partitions assigned to that container until a new container has been brought up (which then resumes processing from the last checkpoint). That downtime can be substantial if the job has a lot of local state which needs to be restored.
> If the application can tolerate such processing latency, that's not a problem. However, some jobs may have an SLA that requires them to always process messages with low latency, even if a container fails. For such jobs, it would be good to have the option of enabling "hot standby" containers, which can take over from a failed container as soon as a failure is detected.
> The proposed implementation is for each active container to have a standby container (thus doubling the number of containers required). The standby container consumes the checkpoint stream and any changelog streams produced by its active counterpart. The standby looks quite like a container that is being restored after a failure, except that it is constantly in restoration mode, and doesn't consume any messages directly from the input streams. This is similar to leader-based replication (master-slave replication) found in many databases: a follower/slave is constantly mirroring changes on the leader/master, but does not process any writes from clients.
> When an active container fails, its standby can be promoted to active (like failover in database replication). When thus instructed, the standby stops consuming the checkpoint and changelog streams, starts consuming the input streams from the most recent checkpoint, and starts producing output streams and changelogs. In the background, a new standby container can be fired up.
> There will need to be some care to avoid split-brain problems (two containers simultaneously believe that they are active, leading to input messages being consumed twice and output messages being duplicated). Perhaps a container must stop processing if it has not been able to successfully check in with a central controller node (e.g. YARN AM) for some amount of time, and the controller must wait at least that amount of time before promoting a standby to active. Therefore this feature will probably require some direct RPC between containers and YARN AM (or equivalent).
> This feature probably doesn't require any new user-facing APIs (from application code's point of view, a standby container looks like a container that is being restored), and just one boolean configuration flag to enable hot standby.
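
As a strawman for the split-brain avoidance described above, the lease/fencing contract might look roughly like the sketch below. This is not Samza code; the class and method names are invented, and a real implementation would sit behind the direct RPC between containers and the YARN AM (or equivalent) mentioned in the description:

    import java.time.Duration;

    /**
     * Illustrative sketch only (hypothetical names, not a Samza API): the active
     * container holds a time-based lease with a central controller (e.g. the
     * YARN AM) and voluntarily stops processing once the lease lapses, while
     * the controller waits at least one full lease period before promoting the
     * standby, so the two containers can never both believe they are active.
     */
    public class ContainerLease {
      private final long leaseNanos;
      private volatile long lastAckNanos;

      public ContainerLease(Duration lease) {
        this.leaseNanos = lease.toNanos();
        this.lastAckNanos = System.nanoTime();
      }

      /** Called by the active container whenever a check-in with the controller succeeds. */
      public void onHeartbeatAck() {
        lastAckNanos = System.nanoTime();
      }

      /**
       * Checked by the active container before consuming each batch of input
       * messages: once the lease has expired it must stop consuming input and
       * producing output, because the controller may already be promoting the standby.
       */
      public boolean mayProcess() {
        return System.nanoTime() - lastAckNanos < leaseNanos;
      }

      /**
       * The controller's side of the same contract: after the last heartbeat it
       * received from the active container, it must wait at least the full lease
       * period before instructing the standby to take over.
       */
      public static boolean safeToPromote(long lastHeartbeatFromActiveNanos, Duration lease) {
        return System.nanoTime() - lastHeartbeatFromActiveNanos >= lease.toNanos();
      }
    }

The key property is just that the container's self-imposed timeout is no longer than the controller's promotion delay; the actual durations would presumably be configurable.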



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)