Posted to user@accumulo.apache.org by David Medinets <da...@gmail.com> on 2012/09/26 14:00:16 UTC

Accumulo Between Two Centers (DR - disaster recovery)

I recall a conversation in which people were pointed to Cassandra for
its ability to replicate between data centers. I have forgotten what
Accumulo offers on this topic. And does latency matter? If latency
matters, what is the highest acceptable latency?

Re: Accumulo Between Two Centers (DR - disaster recovery)

Posted by Adam Fuchs <af...@apache.org>.
Another way to say this is that cross-data-center replication for Accumulo
is left to a layer on top of Accumulo (or to the application space). Cassandra
supports a mode in which the write replication factor is larger than the
write quorum, allowing writes to propagate eventually and reads to happen on
stale versions of the data. This increases availability at the cost of
consistency, which is important when dealing with links that are less
reliable or have higher latency (but it does nothing special for
lower-bandwidth links). Cassandra, running in this mode, leaves dealing with
eventual consistency to the application space, which might be only slightly
less challenging than implementing a cross-data-center replication scheme.
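
A minimal sketch of the replication-versus-quorum trade-off, assuming a toy
in-memory model. The names `Replica`, `quorum_write`, and `read_newest` are
hypothetical and are not part of the Cassandra or Accumulo APIs; the sketch
only illustrates how a write acknowledged by fewer replicas than the
replication factor can leave some readers with stale data.

```python
class Replica:
    """One copy of the data: maps key -> (version, value)."""

    def __init__(self):
        self.store = {}

    def write(self, key, version, value):
        # Keep a write only if it is newer than what this replica holds.
        if version > self.store.get(key, (0, None))[0]:
            self.store[key] = (version, value)

    def read(self, key):
        return self.store.get(key, (0, None))


def quorum_write(replicas, write_quorum, key, version, value):
    """Write synchronously to `write_quorum` replicas and return the rest.

    The returned replicas have not seen the write yet; a real system
    would deliver it to them later in the background.
    """
    for replica in replicas[:write_quorum]:
        replica.write(key, version, value)
    return replicas[write_quorum:]


def read_newest(consulted, key):
    """Return the newest (version, value) among the replicas consulted."""
    return max((r.read(key) for r in consulted), key=lambda entry: entry[0])


replicas = [Replica() for _ in range(3)]  # replication factor of 3
for r in replicas:
    r.write("row1", 1, "old")

# Acknowledge the write after a single replica (write quorum of 1).
pending = quorum_write(replicas, 1, "row1", 2, "new")

stale = read_newest(replicas[2:], "row1")   # reader hits a lagging replica
fresh = read_newest(replicas[:1], "row1")   # reader hits the updated replica

# Background propagation catches the lagging replicas up.
for r in pending:
    r.write("row1", 2, "new")
converged = read_newest(replicas[2:], "row1")
```

In this sketch `stale` still holds version 1 while `fresh` holds version 2;
the two agree only after propagation, and coping with the window in between
is the part that falls to the application.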

Adam


On Wed, Sep 26, 2012 at 9:46 AM, Eric Newton <er...@gmail.com> wrote:

> I think you're talking about 2 different things.
>
> Accumulo is architected to run on fast connections.  If you add one
> slowly connected computer, generally speaking, it will make everything
> run slowly.
>
> Replication is typically used to send copies from one data center to
> another, so that each has a local copy.  Typically, the trick is to accept
> extra latency in updates to the copies, which compensates for the
> relatively slow connections between data centers.
>
> Accumulo does not presently support replication.  See ACCUMULO-378.
>
> -Eric
>
> On Wed, Sep 26, 2012 at 8:08 AM, Christopher Tubbs <ct...@gmail.com>
> wrote:
> > I believe Accumulo can work across data centers, if the underlying DFS
> > spans data centers. I also believe the latency tolerance is
> > configurable, and matters for servers holding locks in ZooKeeper and
> > for heartbeat messages to the Master. I'm not sure what the defaults for
> > these are, though.
> >
> > On Wed, Sep 26, 2012 at 8:00 AM, David Medinets
> > <da...@gmail.com> wrote:
> >> I recall a conversation in which people were pointed to Cassandra for
> >> its ability to replicate between data centers. I have forgotten what
> >> Accumulo offers on this topic. And does latency matter? If latency
> >> matters, what is the highest acceptable latency?
>

Re: Accumulo Between Two Centers (DR - disaster recovery)

Posted by Eric Newton <er...@gmail.com>.
I think you're talking about 2 different things.

Accumulo is architected to run on fast connections.  If you add one
slowly connected computer, generally speaking, it will make everything
run slowly.
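
One simplified way to picture this, assuming an operation that must hear
back from every server before it completes. The function name and the
latency figures below are hypothetical, chosen only for illustration, and
are not measurements of Accumulo.

```python
def fan_out_latency_ms(server_latencies_ms):
    """Latency of a request that waits for a reply from every server.

    The request finishes only when the slowest server has answered,
    so its latency is the maximum of the individual latencies.
    """
    return max(server_latencies_ms)


local_cluster = [2, 3, 2, 4]              # hypothetical round trips in ms
with_remote_node = local_cluster + [80]   # add one node behind a slow link

local_ms = fan_out_latency_ms(local_cluster)
mixed_ms = fan_out_latency_ms(with_remote_node)
```

Under this model the four fast servers do not help once the slow one joins:
every fan-out request now takes as long as the slow link.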

Replication is typically used to send copies from one data center to
another, so that each has a local copy.  Typically, the trick is to accept
extra latency in updates to the copies, which compensates for the
relatively slow connections between data centers.
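
A minimal sketch of that idea, assuming a toy in-memory store. The class
`AsyncReplicator` and its methods are hypothetical and do not correspond to
any Accumulo feature: writes are acknowledged locally and queued, and the
queue is drained over the slow link at whatever rate the link allows.

```python
from collections import deque


class AsyncReplicator:
    """Primary store plus a backlog of updates not yet sent to the remote copy."""

    def __init__(self):
        self.primary = {}
        self.remote = {}
        self.backlog = deque()

    def put(self, key, value):
        # The write completes as soon as the local copy is updated;
        # the remote copy will see it later.
        self.primary[key] = value
        self.backlog.append((key, value))

    def ship(self, max_updates):
        """Send up to `max_updates` queued updates, oldest first."""
        shipped = 0
        while self.backlog and shipped < max_updates:
            key, value = self.backlog.popleft()
            self.remote[key] = value
            shipped += 1
        return shipped


store = AsyncReplicator()
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    store.put(key, value)

first_batch = store.ship(2)       # the slow link carries two updates
lagging = dict(store.remote)      # remote copy is behind at this point
second_batch = store.ship(2)      # the remaining update goes out
```

After the first batch the remote copy is missing one update; after the
second it matches the primary. The delay between the two is the extra
latency traded for not blocking local writes on the slow link.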

Accumulo does not presently support replication.  See ACCUMULO-378.

-Eric

On Wed, Sep 26, 2012 at 8:08 AM, Christopher Tubbs <ct...@gmail.com> wrote:
> I believe Accumulo can work across data centers, if the underlying DFS
> spans data centers. I also believe the latency tolerance is
> configurable, and matters for servers holding locks in ZooKeeper and
> for heartbeat messages to the Master. I'm not sure what the defaults for
> these are, though.
>
> On Wed, Sep 26, 2012 at 8:00 AM, David Medinets
> <da...@gmail.com> wrote:
>> I recall a conversation in which people were pointed to Cassandra for
>> its ability to replicate between data centers. I have forgotten what
>> Accumulo offers on this topic. And does latency matter? If latency
>> matters, what is the highest acceptable latency?

Re: Accumulo Between Two Centers (DR - disaster recovery)

Posted by Christopher Tubbs <ct...@gmail.com>.
I believe Accumulo can work across data centers, if the underlying DFS
spans data centers. I also believe the latency tolerance is
configurable, and matters for servers holding locks in ZooKeeper and
for heartbeat messages to the Master. I'm not sure what the defaults for
these are, though.

On Wed, Sep 26, 2012 at 8:00 AM, David Medinets
<da...@gmail.com> wrote:
> I recall a conversation in which people were pointed to Cassandra for
> its ability to replicate between data centers. I have forgotten what
> Accumulo offers on this topic. And does latency matter? If latency
> matters, what is the highest acceptable latency?