
Posted to users@cloudstack.apache.org by John Kinsella <jl...@stratosec.co> on 2013/03/13 02:32:59 UTC

what are folks using for replication?

Thought I'd throw this question out on the list - we're working on geographically (1500 miles) replicated databases for customers who really don't want their stuff to go down.

In this particular case, how they've architected their DB schemas means they're really not friendly to the standard transactional replication (this is MSSQL server) so we're looking at replicating the whole block device the db is stored on.

We've tried using…
 * Ceph - I know it's not meant for geo-rep… loving it locally, though!
 * Gluster - When clustering across the WAN, performance was weak. Don't have enough disks/nodes to create a gluster cluster to replicate with gluster-geo-replicate
 * drbd in active/standby - Fairly decent performance. I hear it'd be better with the drbd-proxy, but don't feel like spending the $$ yet. I've been previously shot in the foot enough times with drbd active/active that I won't try that again.

Currently using drbd (I'm pondering writing management of the primary/secondary stuff into ACS) but curious if others have found ways of doing this with ACS that they like?

John

Stratosec - Secure Infrastructure as a Service
o: 415.315.9385
@johnlkinsella

Re: what are folks using for replication?

Posted by John Kinsella <jl...@stratosec.co>.

On Mar 12, 2013, at 6:42 PM, Outback Dingo <ou...@gmail.com>
 wrote:

> On Tue, Mar 12, 2013 at 9:32 PM, John Kinsella <jl...@stratosec.co> wrote:
> 
>> Thought I'd throw this question out on the list - we're working on
>> geographically (1500 miles) replicated databases for customers who really
>> don't want their stuff to go down.
>> 
>> In this particular case, how they've architected their DB schemas means
>> they're really not friendly to the standard transactional replication (this
>> is MSSQL server) so we're looking at replicating the whole block device the
>> db is stored on.
>> 
>> We've tried using…
>> * Ceph -  know it's not meant for geo-rep…loving it locally, though!
>> * Gluster - When clustering across the WAN, performance was weak. Don't
>> have enough disks/nodes to create a gluster cluster to replicate with
>> gluster-geo-replicate
>> * drbd in active/standby - Fairly decent performance. I hear it'd be
>> better with the drbd-proxy, but don't feel like spending the $$ yet. I've
>> been previously shot in the foot enough times with drbd active/active that
>> I won't try that again.
>> 
>> Currently using drbd (I'm pondering writing management of the
>> primary/secondary stuff into ACS) but curious if others have found ways of
>> doing this with ACS that they like?
>> 
>> John
>> 
>> Stratosec - Secure Infrastructure as a Service
>> o: 415.315.9385
>> @johnlkinsella
>> 
>> 
> one word, one filesystem, ZFS


Ah yes, that's on our list, as well. Might have to set up a testbed and see…

Re: what are folks using for replication?

Posted by Jason Davis <sc...@gmail.com>.

Maybe MSSQL AlwaysOn?

http://msdn.microsoft.com/en-us/sqlserver/gg490638


On Wed, Mar 13, 2013 at 2:21 PM, Nux! <nu...@li.nux.ro> wrote:

> On 13.03.2013 19:09, Outback Dingo wrote:
>
>> On Wed, Mar 13, 2013 at 2:29 PM, Nux! <nu...@li.nux.ro> wrote:
>>
>>  On 13.03.2013 01:42, Outback Dingo wrote:
>>>
>>>  one word, one filesystem, ZFS
>>>>
>>>>
>>> If you want to scale you need Ceph/GlusterFS/XtreemFS/etc. ZFS is for
>>> when you only have one NFS server and you don't want to grow.
>>>
>>>
>> Well, that's not exactly true. ZFS can also be used in environments where
>> scalability is required. Many people don't realize that stacking Swift on
>> top of ZFS provides one such environment. The original question was about
>> replication, but if you're going to throw scalability into play, that can
>> also be accomplished with ZFS in the mix alongside a clustering solution.
>>
>
> Agreed on the Swift idea, but in this particular case he needs to
> replicate MS SQL. He needs a distributed filesystem of some sort, or
> something like Galera, but for MS SQL. I'm sure Microsoft has developed
> something like this, considering the amazing prices they charge for it. ;-)
>
> Anyway, 1500 miles will add a lot of latency ... I'm curious how this will
> end up. :)
>
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>

Re: what are folks using for replication?

Posted by Nux! <nu...@li.nux.ro>.

On 13.03.2013 19:09, Outback Dingo wrote:
> On Wed, Mar 13, 2013 at 2:29 PM, Nux! <nu...@li.nux.ro> wrote:
> 
>> On 13.03.2013 01:42, Outback Dingo wrote:
>> 
>>> one word, one filesystem, ZFS
>>> 
>> 
>> If you want to scale you need Ceph/GlusterFS/XtreemFS/etc. ZFS is for
>> when you only have one NFS server and you don't want to grow.
>> 
> 
> Well, that's not exactly true. ZFS can also be used in environments where
> scalability is required. Many people don't realize that stacking Swift on
> top of ZFS provides one such environment. The original question was about
> replication, but if you're going to throw scalability into play, that can
> also be accomplished with ZFS in the mix alongside a clustering solution.

Agreed on the Swift idea, but in this particular case he needs to
replicate MS SQL. He needs a distributed filesystem of some sort, or
something like Galera, but for MS SQL. I'm sure Microsoft has developed
something like this, considering the amazing prices they charge for it.
;-)

Anyway, 1500 miles will add a lot of latency ... I'm curious how this 
will end up. :)
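
To put a rough number on that latency, here is a minimal back-of-the-envelope sketch. It is not from the thread: it assumes a straight-line fiber path and a refractive index of about 1.5 (light at roughly two thirds of its vacuum speed), and the function names are made up for illustration. Real routes are longer and add switching delay, so treat the result as a floor, not a measurement.

```python
# Back-of-the-envelope latency floor for a 1500-mile replication link.
# Assumptions (not from the thread): straight-line fiber path and a
# refractive index of about 1.5, so light travels at roughly 2/3 of c.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.5        # assumed typical value for optical fiber
KM_PER_MILE = 1.609344


def min_round_trip_ms(distance_miles: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    speed_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    one_way_s = distance_km / speed_km_s
    return 2 * one_way_s * 1000


def max_serial_sync_writes_per_s(rtt_ms: float) -> float:
    """Ceiling on back-to-back writes when each one waits for a remote ack."""
    return 1000 / rtt_ms


rtt = min_round_trip_ms(1500)
print(f"minimum RTT: {rtt:.1f} ms")
print(f"serial synchronous writes/s: {max_serial_sync_writes_per_s(rtt):.0f}")
```

Under those assumptions the floor comes out at roughly 24 ms per round trip, so a single writer that waits for the remote site to acknowledge every write tops out at around 40 writes per second, whatever replication layer sits underneath. Asynchronous modes avoid that wait at the cost of possibly losing the most recent writes on failover.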

-- 
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

Re: what are folks using for replication?

Posted by Outback Dingo <ou...@gmail.com>.

On Wed, Mar 13, 2013 at 2:29 PM, Nux! <nu...@li.nux.ro> wrote:

> On 13.03.2013 01:42, Outback Dingo wrote:
>
>> one word, one filesystem, ZFS
>>
>
> If you want to scale you need Ceph/GlusterFS/XtreemFS/etc. ZFS is for
> when you only have one NFS server and you don't want to grow.
>

Well, that's not exactly true. ZFS can also be used in environments where
scalability is required. Many people don't realize that stacking Swift on
top of ZFS provides one such environment. The original question was about
replication, but if you're going to throw scalability into play, that can
also be accomplished with ZFS in the mix alongside a clustering solution.

> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>

Re: what are folks using for replication?

Posted by Nux! <nu...@li.nux.ro>.

On 13.03.2013 01:42, Outback Dingo wrote:
> one word, one filesystem, ZFS

If you want to scale you need Ceph/GlusterFS/XtreemFS/etc. ZFS is for
when you only have one NFS server and you don't want to grow.

Lucian

-- 
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

Re: what are folks using for replication?

Posted by Outback Dingo <ou...@gmail.com>.

On Tue, Mar 12, 2013 at 9:32 PM, John Kinsella <jl...@stratosec.co> wrote:

> Thought I'd throw this question out on the list - we're working on
> geographically (1500 miles) replicated databases for customers who really
> don't want their stuff to go down.
>
> In this particular case, how they've architected their DB schemas means
> they're really not friendly to the standard transactional replication (this
> is MSSQL server) so we're looking at replicating the whole block device the
> db is stored on.
>
> We've tried using…
>  * Ceph -  know it's not meant for geo-rep…loving it locally, though!
>  * Gluster - When clustering across the WAN, performance was weak. Don't
> have enough disks/nodes to create a gluster cluster to replicate with
> gluster-geo-replicate
>  * drbd in active/standby - Fairly decent performance. I hear it'd be
> better with the drbd-proxy, but don't feel like spending the $$ yet. I've
> been previously shot in the foot enough times with drbd active/active that
> I won't try that again.
>
> Currently using drbd (I'm pondering writing management of the
> primary/secondary stuff into ACS) but curious if others have found ways of
> doing this with ACS that they like?
>
> John
>
> Stratosec - Secure Infrastructure as a Service
> o: 415.315.9385
> @johnlkinsella
>
>
one word, one filesystem, ZFS

Re: what are folks using for replication?

Posted by John Kinsella <jl...@stratosec.co>.

On Mar 13, 2013, at 9:14 AM, David Nalley <da...@gnsa.us> wrote:

> So not to come off all BOFH-ish, but it seems like there needs to be
> give and take here.

Ah, so you're Simon! ;)

> From your post:
> $entity wants no downtime.
> $DB provides several replication strategies that work to varying
> degrees to get one closer to no downtime.
> $entity doesn't want to design a schema that works well with the proven $DB
> replication strategies.
> $entity seeks magic silver bullet to provide the above.
> 
> While things like DRBD might be reasonable, if it's something they
> care enough about to really want no downtime, why are they not willing
> to modify their schema to use $DB-approved solutions for maintaining
> availability?

That'd be how I'd do it as well, and in the future we'll be guiding them into that type of thought process. For now we're keeping them happy, and pondering how we'll respond to similar requests in the future.

This is part of the "migrating to the cloud" thing - enterprises have a "traditional" workload that they want to cloudify (sorry). They can either re-architect and then move the new pieces to a cloud environment, or move and then re-architect. In this case it's the latter, partially due to internal IT telling the division they were taking up too much compute space. And I suspect at the IaaS level, you'll see more of the "move and then re-architect" model, which is partially why I brought this up on the users list.

-j

Re: what are folks using for replication?

Posted by David Nalley <da...@gnsa.us>.

On Wed, Mar 13, 2013 at 11:55 AM, John Kinsella <jl...@stratosec.co> wrote:
> Looks good, alas this particular use case needs MSSQL not MY :)
>
> On Mar 12, 2013, at 7:21 PM, Jason Davis <sc...@gmail.com> wrote:
>
>> Galera MySQL might be an option, depending on how fast the round trip is
>> between the two sites.
>>
>> I will confess that I am NOT a fan of Linux HA clustering. I would
>> recommend doing replication at the MySQL layer. Linux HA clustering is
>> complex and fragile... Our big LMS at the university runs on such a cluster
>> (RHEL cluster + DRBD) and it's more trouble than it's worth. I am working to
>> hopefully move this to Galera. In our testing it has been way easier and
>> more resistant to fault.
>> On Mar 12, 2013 9:00 PM, "Mathias Mullins" <ma...@citrix.com>
>> wrote:
>>
>>> Hi John,
>>>
>>> So in a lot of the installations that we have worked with, we are seeing a
>>> lot of work with DRBD (Active/Standby, Active/Active w/Pacemaker). So far
>>> it's been one of the better solutions that we have seen installations
>>> stable with. I think a lot of people would like to see this functionality
>>> written into ACS to support something in addition to MySQL Master/Slave
>>> replication.
>>>
>>> Personally I like the DRBD solution because it deals with the replication
>>> at a block level and gives us the flexibility of active/passive and
>>> active/active ability versus having to do a lot of manual work to revert
>>> the master/slave relationship.
>>>
>>> Thanks,
>>> Matt
>>>
>>>
>>> On 3/12/13 9:32 PM, "John Kinsella" <jl...@stratosec.co> wrote:
>>>
>>>> Thought I'd throw this question out on the list - we're working on
>>>> geographically (1500 miles) replicated databases for customers who really
>>>> don't want their stuff to go down.
>>>>
>>>> In this particular case, how they've architected their DB schemas means
>>>> they're really not friendly to the standard transactional replication
>>>> (this is MSSQL server) so we're looking at replicating the whole block
>>>> device the db is stored on.
>>>>
>>>> We've tried using…
>>>> * Ceph -  know it's not meant for geo-rep…loving it locally, though!
>>>> * Gluster - When clustering across the WAN, performance was weak. Don't
>>>> have enough disks/nodes to create a gluster cluster to replicate with
>>>> gluster-geo-replicate
>>>> * drbd in active/standby - Fairly decent performance. I hear it'd be
>>>> better with the drbd-proxy, but don't feel like spending the $$ yet. I've
>>>> been previously shot in the foot enough times with drbd active/active
>>>> that I won't try that again.
>>>>
>>>> Currently using drbd (I'm pondering writing management of the
>>>> primary/secondary stuff into ACS) but curious if others have found ways
>>>> of doing this with ACS that they like?
>>>>
>>>> John
>>>>
>>>> Stratosec - Secure Infrastructure as a Service
>>>> o: 415.315.9385
>>>> @johnlkinsella
>>>>
>>>
>>>
>
> Stratosec - Secure Infrastructure as a Service
> o: 415.315.9385
> @johnlkinsella
>


So not to come off all BOFH-ish, but it seems like there needs to be
give and take here.

From your post:
$entity wants no downtime.
$DB provides several replication strategies that work to varying
degrees to get one closer to no downtime.
$entity doesn't want to design a schema that works well with the proven $DB
replication strategies.
$entity seeks magic silver bullet to provide the above.

While things like DRBD might be reasonable, if it's something they
care enough about to really want no downtime, why are they not willing
to modify their schema to use $DB-approved solutions for maintaining
availability?

--David

Re: what are folks using for replication?

Posted by John Kinsella <jl...@stratosec.co>.

Looks good, alas this particular use case needs MSSQL not MY :)

On Mar 12, 2013, at 7:21 PM, Jason Davis <sc...@gmail.com> wrote:

> Galera MySQL might be an option, depending on how fast the round trip is
> between the two sites.
> 
> I will confess that I am NOT a fan of Linux HA clustering. I would
> recommend doing replication at the MySQL layer. Linux HA clustering is
> complex and fragile... Our big LMS at the university runs on such a cluster
> (RHEL cluster + DRBD) and it's more trouble than it's worth. I am working to
> hopefully move this to Galera. In our testing it has been way easier and
> more resistant to fault.
> On Mar 12, 2013 9:00 PM, "Mathias Mullins" <ma...@citrix.com>
> wrote:
> 
>> Hi John,
>> 
>> So in a lot of the installations that we have worked with, we are seeing a
>> lot of work with DRBD (Active/Standby, Active/Active w/Pacemaker). So far
>> it's been one of the better solutions that we have seen installations
>> stable with. I think a lot of people would like to see this functionality
>> written into ACS to support something in addition to MySQL Master/Slave
>> replication.
>> 
>> Personally I like the DRBD solution because it deals with the replication
>> at a block level and gives us the flexibility of active/passive and
>> active/active ability versus having to do a lot of manual work to revert
>> the master/slave relationship.
>> 
>> Thanks,
>> Matt
>> 
>> 
>> On 3/12/13 9:32 PM, "John Kinsella" <jl...@stratosec.co> wrote:
>> 
>>> Thought I'd throw this question out on the list - we're working on
>>> geographically (1500 miles) replicated databases for customers who really
>>> don't want their stuff to go down.
>>> 
>>> In this particular case, how they've architected their DB schemas means
>>> they're really not friendly to the standard transactional replication
>>> (this is MSSQL server) so we're looking at replicating the whole block
>>> device the db is stored on.
>>> 
>>> We've tried using…
>>> * Ceph -  know it's not meant for geo-rep…loving it locally, though!
>>> * Gluster - When clustering across the WAN, performance was weak. Don't
>>> have enough disks/nodes to create a gluster cluster to replicate with
>>> gluster-geo-replicate
>>> * drbd in active/standby - Fairly decent performance. I hear it'd be
>>> better with the drbd-proxy, but don't feel like spending the $$ yet. I've
>>> been previously shot in the foot enough times with drbd active/active
>>> that I won't try that again.
>>> 
>>> Currently using drbd (I'm pondering writing management of the
>>> primary/secondary stuff into ACS) but curious if others have found ways
>>> of doing this with ACS that they like?
>>> 
>>> John
>>> 
>>> Stratosec - Secure Infrastructure as a Service
>>> o: 415.315.9385
>>> @johnlkinsella
>>> 
>> 
>> 

Stratosec - Secure Infrastructure as a Service
o: 415.315.9385
@johnlkinsella

Re: what are folks using for replication?

Posted by Jason Davis <sc...@gmail.com>.

Good article explaining my argument:
http://openlife.cc/blogs/2012/september/failover-evil

TL;DR

Cluster failover sucks.
On Mar 12, 2013 9:21 PM, "Jason Davis" <sc...@gmail.com> wrote:

> Galera MySQL might be an option, depending on how fast the round trip is
> between the two sites.
>
> I will confess that I am NOT a fan of Linux HA clustering. I would
> recommend doing replication at the MySQL layer. Linux HA clustering is
> complex and fragile... Our big LMS at the university runs on such a cluster
> (RHEL cluster + DRBD) and it's more trouble than it's worth. I am working to
> hopefully move this to Galera. In our testing it has been way easier and
> more resistant to fault.
> On Mar 12, 2013 9:00 PM, "Mathias Mullins" <ma...@citrix.com>
> wrote:
>
>> Hi John,
>>
>> So in a lot of the installations that we have worked with, we are seeing a
>> lot of work with DRBD (Active/Standby, Active/Active w/Pacemaker). So far
>> it's been one of the better solutions that we have seen installations
>> stable with. I think a lot of people would like to see this functionality
>> written into ACS to support something in addition to MySQL Master/Slave
>> replication.
>>
>> Personally I like the DRBD solution because it deals with the replication
>> at a block level and gives us the flexibility of active/passive and
>> active/active ability versus having to do a lot of manual work to revert
>> the master/slave relationship.
>>
>> Thanks,
>> Matt
>>
>>
>> On 3/12/13 9:32 PM, "John Kinsella" <jl...@stratosec.co> wrote:
>>
>> >Thought I'd throw this question out on the list - we're working on
>> >geographically (1500 miles) replicated databases for customers who really
>> >don't want their stuff to go down.
>> >
>> >In this particular case, how they've architected their DB schemas means
>> >they're really not friendly to the standard transactional replication
>> >(this is MSSQL server) so we're looking at replicating the whole block
>> >device the db is stored on.
>> >
>> >We've tried using…
>> > * Ceph -  know it's not meant for geo-rep…loving it locally, though!
>> > * Gluster - When clustering across the WAN, performance was weak. Don't
>> >have enough disks/nodes to create a gluster cluster to replicate with
>> >gluster-geo-replicate
>> > * drbd in active/standby - Fairly decent performance. I hear it'd be
>> >better with the drbd-proxy, but don't feel like spending the $$ yet. I've
>> >been previously shot in the foot enough times with drbd active/active
>> >that I won't try that again.
>> >
>> >Currently using drbd (I'm pondering writing management of the
>> >primary/secondary stuff into ACS) but curious if others have found ways
>> >of doing this with ACS that they like?
>> >
>> >John
>> >
>> >Stratosec - Secure Infrastructure as a Service
>> >o: 415.315.9385
>> >@johnlkinsella
>> >
>>
>>

Re: what are folks using for replication?

Posted by Jason Davis <sc...@gmail.com>.

Galera MySQL might be an option, depending on how fast the round trip is
between the two sites.

I will confess that I am NOT a fan of Linux HA clustering. I would
recommend doing replication at the MySQL layer. Linux HA clustering is
complex and fragile... Our big LMS at the university runs on such a cluster
(RHEL cluster + DRBD) and it's more trouble than it's worth. I am working to
hopefully move this to Galera. In our testing it has been way easier and
more resistant to fault.
On Mar 12, 2013 9:00 PM, "Mathias Mullins" <ma...@citrix.com>
wrote:

> Hi John,
>
> So in a lot of the installations that we have worked with, we are seeing a
> lot of work with DRBD (Active/Standby, Active/Active w/Pacemaker). So far
> it's been one of the better solutions that we have seen installations
> stable with. I think a lot of people would like to see this functionality
> written into ACS to support something in addition to MySQL Master/Slave
> replication.
>
> Personally I like the DRBD solution because it deals with the replication
> at a block level and gives us the flexibility of active/passive and
> active/active ability versus having to do a lot of manual work to revert
> the master/slave relationship.
>
> Thanks,
> Matt
>
>
> On 3/12/13 9:32 PM, "John Kinsella" <jl...@stratosec.co> wrote:
>
> >Thought I'd throw this question out on the list - we're working on
> >geographically (1500 miles) replicated databases for customers who really
> >don't want their stuff to go down.
> >
> >In this particular case, how they've architected their DB schemas means
> >they're really not friendly to the standard transactional replication
> >(this is MSSQL server) so we're looking at replicating the whole block
> >device the db is stored on.
> >
> >We've tried using…
> > * Ceph -  know it's not meant for geo-rep…loving it locally, though!
> > * Gluster - When clustering across the WAN, performance was weak. Don't
> >have enough disks/nodes to create a gluster cluster to replicate with
> >gluster-geo-replicate
> > * drbd in active/standby - Fairly decent performance. I hear it'd be
> >better with the drbd-proxy, but don't feel like spending the $$ yet. I've
> >been previously shot in the foot enough times with drbd active/active
> >that I won't try that again.
> >
> >Currently using drbd (I'm pondering writing management of the
> >primary/secondary stuff into ACS) but curious if others have found ways
> >of doing this with ACS that they like?
> >
> >John
> >
> >Stratosec - Secure Infrastructure as a Service
> >o: 415.315.9385
> >@johnlkinsella
> >
>
>

Re: what are folks using for replication?

Posted by Mathias Mullins <ma...@citrix.com>.

Hi John, 

In a lot of the installations that we have worked with, we are seeing a
lot of work with DRBD (Active/Standby, Active/Active w/Pacemaker). So far
it's been one of the more stable solutions we have seen in those
installations. I think a lot of people would like to see this functionality
written into ACS to support something in addition to MySQL Master/Slave
replication.

Personally I like the DRBD solution because it deals with the replication
at a block level and gives us the flexibility of active/passive and
active/active ability versus having to do a lot of manual work to revert
the master/slave relationship.

Thanks,
Matt 

On 3/12/13 9:32 PM, "John Kinsella" <jl...@stratosec.co> wrote:

>Thought I'd throw this question out on the list - we're working on
>geographically (1500 miles) replicated databases for customers who really
>don't want their stuff to go down.
>
>In this particular case, how they've architected their DB schemas means
>they're really not friendly to the standard transactional replication
>(this is MSSQL server) so we're looking at replicating the whole block
>device the db is stored on.
>
>We've tried using…
> * Ceph -  know it's not meant for geo-rep…loving it locally, though!
> * Gluster - When clustering across the WAN, performance was weak. Don't
>have enough disks/nodes to create a gluster cluster to replicate with
>gluster-geo-replicate
> * drbd in active/standby - Fairly decent performance. I hear it'd be
>better with the drbd-proxy, but don't feel like spending the $$ yet. I've
>been previously shot in the foot enough times with drbd active/active
>that I won't try that again.
>
>Currently using drbd (I'm pondering writing management of the
>primary/secondary stuff into ACS) but curious if others have found ways
>of doing this with ACS that they like?
>
>John
>
>Stratosec - Secure Infrastructure as a Service
>o: 415.315.9385
>@johnlkinsella
>