Posted to dev@cloudstack.apache.org by Alex Ough <al...@sungard.com> on 2014/02/06 15:29:28 UTC

Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

All,

I just sent a review request, so please take a look at it and let me know
if you have any comments or suggestions.

https://reviews.apache.org/r/17790/

Thanks
Alex Ough


On Mon, Jan 13, 2014 at 11:17 AM, Alex Ough <al...@sungard.com> wrote:

> All,
>
> I'd like some suggestions on two things related to this.
>
> 1. The 'Full Scan' management
> Currently, I have it run every time a user logs in to the UI, but I think
> it will also need to run at a regular interval.
> I'm not familiar with the config file, so can anyone give me some
> directions on how to manage the time interval in the config file and the
> best way to run the scan at that interval?
>
> 2. Repository of regions with their login information
> To send/receive requests to/from other regions using the API interfaces,
> we need the region information, including the login info of each region.
> I was planning to use a table as the repository, but I think it is better
> to store it in the config file to make access a little lighter.
> Any recommendations on this?
>
> Your replies with directions and comments would be much appreciated.
> Thanks
> Alex Ough
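
On the interval question in item 1 above, one possible approach is sketched
below using only the JDK's ScheduledExecutorService. The class name
FullScanScheduler and the intervalMillis parameter are made up for
illustration; the thread does not say how CloudStack itself would expose such
a setting, so treat this as a sketch under those assumptions rather than the
way the patch does it.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a "full scan" task repeatedly with a fixed delay
// between the end of one run and the start of the next.
final class FullScanScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // intervalMillis is assumed to be read from configuration by the caller.
    void start(Runnable fullScan, long intervalMillis) {
        executor.scheduleWithFixedDelay(() -> {
            try {
                fullScan.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later runs,
                // so it is caught here (real code would log it).
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    // Stops the background thread so the JVM can exit.
    void stop() {
        executor.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) keeps a slow scan from overlapping
with or immediately following the next one.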
>
>
> On Wed, Jan 8, 2014 at 2:17 PM, Alex Ough <al...@sungard.com> wrote:
>
>> All,
>>
>> A small update after a long vacation:
>> I'm currently creating automated test scripts that randomly
>> create/delete/update domain/account/user objects in random regions to
>> trigger the sync-up and full scans regularly.
>> Once they are complete, I'll post them on GitHub as well and submit the
>> review requests for this implementation.
>>
>> Let me know if you have any comments.
>> Thanks
>> Alex Ough
>>
>>
>> On Wed, Dec 18, 2013 at 3:39 PM, Alex Ough <al...@sungard.com> wrote:
>>
>>> All,
>>>
>>> I updated the wiki after some logic changes, so please review them,
>>> especially "Full Scan", which is newly introduced.
>>>
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Domain-Account-User+Sync+Up+Among+Multiple+Regions
>>>
>>> I also implemented this functionality in Java; the pull request is
>>> here.
>>> (It does not include the 'full scan' yet; I'm currently finalizing
>>> that.)
>>> https://github.com/alexoughsg/Albatross/pull/1
>>>
>>> In particular, I would really like your review of the "Full Scan" logic
>>> to confirm that it does not miss any cases.
>>> Thanks for your interest; your feedback will be very helpful.
>>> Alex Ough
>>>
>>> On Tue, Nov 12, 2013 at 6:00 PM, Alex Ough <al...@sungard.com> wrote:
>>> > Good point, Chiradeep,
>>> >
>>> > I'm not sure if you reviewed my design doc in the wiki, but my design is
>>> > to just skip any actions for target resources that already took place by
>>> > any means.
>>> > The issue is when conflicting actions on the same resource (like create
>>> > and delete of the same user) are enqueued in reversed order, which is
>>> > hopefully rare.
>>> >
>>> > And to support consistency in the AP system, I'd like to provide a full
>>> > sync-up, which will sync all data across all region servers by selecting
>>> > one region as the master and forcing its data onto the other regions.
>>> >
>>> > Let me know what you think.
>>> > Thanks
>>> > Alex Ough
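
The "skip actions that already took place" rule, and the reversed-order case
it does not cover, can be illustrated with a small sketch. RegionUserStore is
a hypothetical in-memory stand-in for one region's user table, not a class
from the patch; each apply method returns false when the action is skipped.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the users known to one region.
final class RegionUserStore {
    private final Set<String> users = new HashSet<>();

    // Applies a replicated "create"; returns false (skipped) if the user
    // already exists in this region.
    boolean applyCreate(String username) {
        return users.add(username);
    }

    // Applies a replicated "delete"; returns false (skipped) if the user
    // is already absent from this region.
    boolean applyDelete(String username) {
        return users.remove(username);
    }

    boolean contains(String username) {
        return users.contains(username);
    }
}
```

Replaying the same create twice is harmless because the second one is
skipped. But if a region receives "delete bob" before "create bob", the
delete is skipped and the create is applied, so that region keeps a user the
originating region has already removed. That is the kind of divergence a
periodic full scan would have to repair.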
>>> >
>>> >
>>> > On Tue, Nov 12, 2013 at 1:22 PM, Chiradeep Vittal
>>> > <Ch...@citrix.com> wrote:
>>> >>
>>> >> Missed this one. In a single region, the CloudStack DB is the master for
>>> >> most operations. If the infra is not in the state the DB says it should
>>> >> be, generally the approach is to whack it and make it conform. For some
>>> >> exceptions (live migration and related use cases) the DB is the
>>> >> slave -- the point is that the inconsistencies that inevitably arise in
>>> >> an AP system need a conflict resolution system. In a single region, the
>>> >> default is to assume that the MySQL DB is correct and handle other cases
>>> >> carefully.
>>> >>
>>> >> In a multi-region case, there is no master: disable an account in one
>>> >> region, and it may not propagate to the other regions for many
>>> >> hours/days. You could add a user in one region and then add the same
>>> >> user in a different region and conflict before the sync happens.
>>> >>
>>> >> This is of course not a problem unique to CloudStack -- people pay mucho
>>> >> dinero for Global AD/LDAP sync. I don't think this is a problem for
>>> >> CloudStack core to solve: I support the event-based mechanism for those
>>> >> who want this facility.
>>> >>
>>> >> Distributed systems are hard and research continues to try and make
>>> >> building these systems easier, but there are very few solutions for
>>> >> global state synchronization (Google Spanner comes to mind).
>>> >>
>>> >> --
>>> >> Chiradeep
>>> >>
>>> >>
>>> >> On 11/8/13 4:53 PM, "Chip Childers" <ch...@gmail.com> wrote:
>>> >>
>>> >> >We are already (generally) AP for most infra changes really. I'd use
>>> >> >that model. Eventual consistency is better in this scenario.
>>> >> >
>>> >> >> On Nov 8, 2013, at 6:49 PM, Chiradeep Vittal
>>> >> >><Ch...@citrix.com> wrote:
>>> >> >>
>>> >> >> I'd also like to highlight that it isn't a trivial problem.
>>> >> >> Let's say there are 3 regions: this means there are 3 copies of the
>>> >> >> user database that are geographically separated by network links that
>>> >> >> fail quite often (orders of magnitude more than intra-DC networks).
>>> >> >>
>>> >> >> Here we run into the consequences of the CAP theorem [1].
>>> >> >> We can have either a CP or an AP system; either approach makes some
>>> >> >> tradeoffs:
>>> >> >> 1. If we run an AP system, then the challenge is to resolve
>>> >> >> conflicting updates.
>>> >> >> 2. If we run a CP system, then the challenge is to detect partitions
>>> >> >> reliably and disallow updates during partitions.
>>> >> >>
>>> >> >> [1] http://en.wikipedia.org/wiki/CAP_theorem
>>> >> >>
>>> >> >>> On 11/7/13 11:58 AM, "Chip Childers" <ch...@apache.org> wrote:
>>> >> >>>
>>> >> >>> On Thu, Nov 7, 2013 at 2:37 PM, Chiradeep Vittal
>>> >> >>> <Ch...@citrix.com> wrote:
>>> >> >>>> It may be an admin burden, but it has to be optional. There are other
>>> >> >>>> ways to achieve global sync (e.g., LDAP/AD/OAuth).
>>> >> >>>> A lot of service providers who run CloudStack have their own user
>>> >> >>>> database / portal. In their implementations the CloudStack database is
>>> >> >>>> not the master source of user records, but a slave.
>>> >> >>>
>>> >> >>> +1 to it being optional.
>>> >> >>
>>> >>
>>> >>
>>> >
>>>
>>
>>
>

Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

Posted by David Grizzanti <da...@sungard.com>.
Alex,

That sounds reasonable. Can we add the change to the API create calls for domain/account/user to the scope of this work?

Thanks

-- 
David Grizzanti
Software Engineer
Sungard Availability Services
e: david.grizzanti@sungard.com
w: 215.446.1431
c: 570.575.0315

On February 19, 2014 at 6:32:07 PM, Alex Ough (alex.ough@sungard.com) wrote:

Hi Dave,

"resources have different uuids in different regions even if they are identical" is not a restriction but the normal case, because maintaining the same UUID can be difficult in some cases, such as when 2 different users create the same resource in different regions at the same time.

But it will not be an issue if you still want to maintain the same UUIDs, because the resource map table will have the same UUID for each resource across all regions. The only change needed is to set the resource UUID in the API create calls that send the create requests to remote regions during real-time synchronization.

Let me know if this is not clear.
Thanks
Alex Ough


On Wed, Feb 19, 2014 at 1:47 PM, David Grizzanti <da...@sungard.com> wrote:
Hi Alex,

One thing I wanted to ask about/mention on this was regarding the restriction you have mentioned in the wiki on "resources have different uuids in different regions even if they are identical".  We discussed this a bit offline, but I think it would be beneficial to allow the UUIDs to be carried over to the other regions when you're replicating resources. I'm not sure how others feel about this feature, but I know from our discussions that we will need it if we are to rely on the sync to create domains/accounts/users across all the regions, as the UUID is the identifying factor of uniqueness.

Let me know your thoughts on this and whether or not this can be added, at least as an optional item if needed.

Thanks

-- 
David Grizzanti
Software Engineer
Sungard Availability Services
e: david.grizzanti@sungard.com
w: 215.446.1431
c: 570.575.0315

On February 6, 2014 at 3:19:52 PM, Alex Ough (alex.ough@sungard.com) wrote:

Hi Chiradeep,
Thanks for your reply.

The change just adds timestamps recording when a record was changed, to
decide the time order when the same resource has been changed independently
in different regions.
The changes are minimal and purely additive, so I don't think they will
cause any side effects.

Thanks
Alex Ough
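
As an illustration of how a modification timestamp can decide between two
independently changed copies of the same resource, here is a minimal
"latest change wins" sketch. VersionedRecord and its fields are hypothetical
names, not taken from the review request, and the comparison is only
meaningful if the regions' clocks are reasonably well synchronized.

```java
import java.time.Instant;

// Hypothetical value holder: a record's content plus the time it was last changed.
final class VersionedRecord {
    final String value;
    final Instant modifiedAt;

    VersionedRecord(String value, Instant modifiedAt) {
        this.value = value;
        this.modifiedAt = modifiedAt;
    }

    // Returns the copy with the later modification time.
    // On a tie the local copy is kept, so nothing changes needlessly.
    static VersionedRecord newer(VersionedRecord local, VersionedRecord remote) {
        return remote.modifiedAt.isAfter(local.modifiedAt) ? remote : local;
    }
}
```

For example, if an account was set to "enabled" locally at 13:35 and to
"disabled" in another region at 15:19 the same day, newer(local, remote)
returns the remote copy regardless of which region runs the comparison.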


On Thu, Feb 6, 2014 at 1:35 PM, Chiradeep Vittal <
Chiradeep.Vittal@citrix.com> wrote:

> I am uncomfortable with changes to GenericDaoBase. Was this really
> necessary? This feature was supposed to be "outside" CloudStack as much as
> possible and optional. Yet it touches the most sensitive code in CloudStack.
>


Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

Posted by Alex Ough <al...@sungard.com>.
Hi Dave,

"resources have different uuids in different regions even if they are
identical" is not a restriction but the normal case, because maintaining the
same UUID can be difficult in some cases, such as when 2 different users
create the same resource in different regions at the same time.

But it will not be an issue if you still want to maintain the same UUIDs,
because the resource map table will have the same UUID for each resource
across all regions. The only change needed is to set the resource UUID in
the API create calls that send the create requests to remote regions during
real-time synchronization.

Let me know if this is not clear.
Thanks
Alex Ough
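
The thread does not describe the layout of the resource map table, so the
following is only one guess at the idea: a single shared UUID per logical
resource, mapped to whatever UUID that resource has in each region. The
class ResourceMap and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical resource map: shared UUID -> (region name -> local UUID).
final class ResourceMap {
    private final Map<UUID, Map<String, UUID>> localIdsBySharedId = new HashMap<>();

    // Records the UUID that a shared resource has in the given region.
    void register(UUID sharedId, String region, UUID localId) {
        localIdsBySharedId
                .computeIfAbsent(sharedId, k -> new HashMap<>())
                .put(region, localId);
    }

    // Looks up the region-local UUID, if that region knows the resource.
    Optional<UUID> localId(UUID sharedId, String region) {
        Map<String, UUID> perRegion = localIdsBySharedId.get(sharedId);
        if (perRegion == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(perRegion.get(region));
    }
}
```

If the create calls carried the UUID to remote regions as Dave asks, every
region would register the same value and the shared and local UUIDs would
coincide; with independently generated UUIDs the map is what ties the copies
together.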



Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

Posted by David Grizzanti <da...@sungard.com>.
Hi Alex,

One thing I wanted to ask about/mention on this was regarding the restriction you have mentioned in the wiki on "resources have different uuids in different regions even if they are identical".  We discussed this a bit offline, but I think it would be beneficial to allow the UUIDs to be carried over to the other regions when you're replicating resources. I'm not sure how others feel about this feature, but I know from our discussions that we will need it if we are to rely on the sync to create domains/accounts/users across all the regions, as the UUID is the identifying factor of uniqueness.

Let me know your thoughts on this and whether or not this can be added, at least as an optional item if needed.

Thanks

-- 
David Grizzanti
Software Engineer
Sungard Availability Services
e: david.grizzanti@sungard.com
w: 215.446.1431
c: 570.575.0315

On February 6, 2014 at 3:19:52 PM, Alex Ough (alex.ough@sungard.com) wrote:

Hi Chiradeep,  
Thanks for your reply.  

The change just adds timestamps recording when a record was changed, to
decide the time order when the same resource has been changed independently
in different regions.
The changes are minimal and purely additive, so I don't think they will
cause any side effects.

Thanks  
Alex Ough  


On Thu, Feb 6, 2014 at 1:35 PM, Chiradeep Vittal <  
Chiradeep.Vittal@citrix.com> wrote:  

> I am uncomfortable with changes to GenericDaoBase. Was this really  
> necessary? This feature was supposed to be "outside" CloudStack as much as  
> possible and optional. Yet it touches the most sensitive code in CloudStack.  
>  
> From: Alex Ough <al...@sungard.com>  
> Date: Thursday, February 6, 2014 6:29 AM  
> To: "dev@cloudstack.apache.org" <de...@cloudstack.apache.org>  
> Cc: Chip Childers <ch...@gmail.org>, Daan Hoogland <  
> daan.hoogland@gmail.com>, Chiradeep Vittal <ch...@citrix.com>,  
> Kishan Kavala <Ki...@citrix.com>  
> Subject: Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions  
>  
> All,  
>  
> I just sent a review request, so please take a look at it and let me  
> know if you have any comments/suggests.  
>  
> https://reviews.apache.org/r/17790/  
>  
> Thanks  
> Alex Ough  
>  
>  
> On Mon, Jan 13, 2014 at 11:17 AM, Alex Ough <al...@sungard.com> wrote:  
>  
>> All,  
>>  
>> I'd like to have some suggestion about 2 things related with this.  
>>  
>> 1. The 'Full Scan' management  
>> Now, I set it running every time a user logs in to the UI, but I think it  
>> will be necessary to make it run with some interval also.  
>> But I'm not familiar with the config file, so can anyone give some  
>> directions how to manage the time interval in the config file and the best  
>> way to run it with the time interval?  
>>  
>> 2. Repository of regions with their login information.  
>> To send/receive requests to/from other regions using API interfaces, we  
>> need the region information including login info of each region.  
>> I was planning to use a table as a repository, but I think it is better  
>> to store it in the config file to make the access a little lighter.  
>> Any recommendation on this?  
>>  
>> Your reply with directions & comments will be very appreciated.  
>> Thanks  
>> Alex Ough  
>>  
>>  
>> On Wed, Jan 8, 2014 at 2:17 PM, Alex Ough <al...@sungard.com> wrote:  
>>  
>>> All,  
>>>  
>>> A little bit of updates after a long vacation,  
>>> I'm currently creating automated test scripts that randomly  
>>> create/delete/update domain/account/user objects in random regions to  
>>> trigger the sync-up and full scans regularly.  
>>> Once they are completed, I'll post it in the github also and submit the  
>>> review requests for this implementation.  
>>>  
>>> Let me know if you have any comments.  
>>> Thanks  
>>> Alex Ough  
>>>  
>>>  
>>> On Wed, Dec 18, 2013 at 3:39 PM, Alex Ough <al...@sungard.com>wrote:  
>>>  
>>>> All,  
>>>>  
>>>> I updated the wiki after some logic changes, so please review them,  
>>>> especially "Full Scan", which is newly introduced.  
>>>>  
>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Domain-Account-User+Sync+Up+Among+Multiple+Regions  
>>>>  
>>>> And I implemented this functionality in Java and you can get the pull  
>>>> request of it here.  
>>>> (This does not include the 'full scan' yet and I'm currently working  
>>>> on this to finalize.)  
>>>> https://github.com/alexoughsg/Albatross/pull/1  
>>>>  
>>>> Especially, I really want to have your review on the "Full Scan" logic  
>>>> to confirm if it does not miss any cases.  
>>>> Thanks for your interest and your feedback will be very helpful.  
>>>> Alex Ough  
>>>>  
>>>> On Tue, Nov 12, 2013 at 6:00 PM, Alex Ough <al...@sungard.com>  
>>>> wrote:  
>>>> > Good point, Chiradeep,  
>>>> >  
>>>> > I'm not sure if you reviewed my design doc in the wiki, but my design  
>>>> is to  
>>>> > just skip any actions for target resources that already took place by  
>>>> any  
>>>> > means.  
>>>> > But the issue is when conflict actions in the same resources (like  
>>>> create &  
>>>> > delete the same users) are enqueued in reversed orders, which is  
>>>> hopefully  
>>>> > rare.  
>>>> >  
>>>> > And to support consistency in the AP system, I'd like to provide a  
>>>> full sync  
>>>> > up, which will sync up all data in all region servers by selecting a  
>>>> region  
>>>> > as a master and force its data to other regions.  
>>>> >  
>>>> > Let me know what you think.  
>>>> > Thanks  
>>>> > Alex Ough  
>>>> >  
>>>> >  
>>>> > On Tue, Nov 12, 2013 at 1:22 PM, Chiradeep Vittal  
>>>> > <Ch...@citrix.com> wrote:  
>>>> >>  
>>>> >> Missed this one. In a single region, the CloudStack DB is the master  
>>>> for  
>>>> >> most operations. If the infra is not in the state the DB says it  
>>>> should  
>>>> >> be, generally the approach is to whack it and make it conform. For  
>>>> some  
>>>> >> exceptions (live migration/related use cases are exceptions) the DB  
>>>> is the  
>>>> >> slave -- the point is that the inconsistency that inevitably arise  
>>>> in an  
>>>> >> AP system need a conflict resolution system. In a single region, the  
>>>> >> default is to assume that the MySQL DB is correct and handle other  
>>>> cases  
>>>> >> carefully.  
>>>> >>  
>>>> >> In a multi-region case, there is no master: disable an account in one  
>>>> >> region, and it may not propagate to the other regions for many  
>>>> hours/days.  
>>>> >> You could add a user in one region and then add the same user in a  
>>>> >> different region and conflict before the sync happens.  
>>>> >>  
>>>> >> This is of course not a problem unique to CloudStack -- people pay  
>>>> mucho  
>>>> >> dinero for Global AD/LDAP sync. I don't think this is a problem for  
>>>> >> CloudStack core to solve: I support the event-based mechanism for  
>>>> those  
>>>> >> who want this facility.  
>>>> >>  
>>>> >> Distributed systems are hard and research continues to try and make  
>>>> >> building these systems easier, but there are very few solutions for  
>>>> global  
>>>> >> state synchronization (Google Spanner comes to mind)  
>>>> >>  
>>>> >> --  
>>>> >> Chiradeep  
>>>> >>  
>>>> >>  
>>>> >> On 11/8/13 4:53 PM, "Chip Childers" <ch...@gmail.com> wrote:  
>>>> >>  
>>>> >> >We are already (generally) AP for most infra changes really. I'd  
>>>> use that  
>>>> >> >model. Eventual consistency is better in this scenario.  
>>>> >> >  
>>>> >> >> On Nov 8, 2013, at 6:49 PM, Chiradeep Vittal  
>>>> >> >><Ch...@citrix.com> wrote:  
>>>> >> >>  
>>>> >> >> I'd also like to highlight that it isn't a trivial problem.  
>>>> >> >> Let's say there's 3 regions: this means there are 3 copies of the  
>>>> user  
>>>> >> >> database that are geographically separated by network links that  
>>>> fail  
>>>> >> >> quite often (orders of magnitude more than intra-DC networks).  
>>>> >> >>  
>>>> >> >> Here we run into the consequences of the CAP theorem [1].  
>>>> >> >> We can either have a CP or AP system: either approach makes some  
>>>> >> >>tradeoffs:  
>>>> >> >> 1. If we run a AP system, then the challenge is to resolve  
>>>> conflicting  
>>>> >> >> updates  
>>>> >> >> 2. If we run a CP system, then the challenge is to detect  
>>>> partitions  
>>>> >> >> reliably and disallow updates during partitions.  
>>>> >> >>  
>>>> >> >> [1] http://en.wikipedia.org/wiki/CAP_theorem  
>>>> >> >>  
>>>> >> >>> On 11/7/13 11:58 AM, "Chip Childers" <ch...@apache.org>  
>>>> wrote:  
>>>> >> >>>  
>>>> >> >>> On Thu, Nov 7, 2013 at 2:37 PM, Chiradeep Vittal  
>>>> >> >>> <Ch...@citrix.com> wrote:  
>>>> >> >>>> It may be an admin burden, but it has to be optional. There are  
>>>> other  
>>>> >> >>>> ways  
>>>> >> >>>> to achieve global sync (e.g., LDAP/AD/Oauth).  
>>>> >> >>>> A lot of service providers who run cloudstack have their own  
>>>> user  
>>>> >> >>>> database  
>>>> >> >>>> / portal. In their implementations the CloudStack database is  
>>>> not the  
>>>> >> >>>> master source of user records, but a slave.  
>>>> >> >>>  
>>>> >> >>> +1 to it being optional.  
>>>> >> >>  
>>>> >>  
>>>> >>  
>>>> >  
>>>>  
>>>  
>>>  
>>  
>  

Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

Posted by Alex Ough <al...@sungard.com>.
Hi Chiradeep,
Thanks for your reply.

The change just adds a timestamp whenever a record is changed, so that we can
decide the order of the changes when the same resource has been modified
independently in different regions.
The changes are minimal and purely additive, so I don't think they will cause
any side effects.
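
As a rough illustration of the idea (not the code in the review request), the
timestamp comparison could look like the sketch below. The class and field
names (SyncedRecord, TimestampResolver, modified) are hypothetical and are not
CloudStack classes. It also assumes the clocks of the regions are reasonably
well synchronized.

```java
import java.time.Instant;

/** Hypothetical view of a synced domain/account/user record (not a CloudStack class). */
final class SyncedRecord {
    final String uuid;      // identity of the resource across regions
    final String state;     // whatever payload is being synced
    final Instant modified; // set every time the record is changed

    SyncedRecord(String uuid, String state, Instant modified) {
        this.uuid = uuid;
        this.state = state;
        this.modified = modified;
    }
}

final class TimestampResolver {
    /**
     * Returns the copy that was changed most recently.
     * On a tie the local copy is kept, so no write happens.
     */
    static SyncedRecord resolve(SyncedRecord local, SyncedRecord remote) {
        if (!local.uuid.equals(remote.uuid)) {
            throw new IllegalArgumentException("records describe different resources");
        }
        return remote.modified.isAfter(local.modified) ? remote : local;
    }
}
```

With this rule, two regions that changed the same resource independently both
end up keeping the later change once they have exchanged their records.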

Thanks
Alex Ough


On Thu, Feb 6, 2014 at 1:35 PM, Chiradeep Vittal <
Chiradeep.Vittal@citrix.com> wrote:

>  I am uncomfortable with changes to GenericDaoBase. Was this really
> necessary? This feature was supposed to be "outside" CloudStack as much as
> possible and optional. Yet it touches the most sensitive code in CloudStack.
>
>   From: Alex Ough <al...@sungard.com>
> Date: Thursday, February 6, 2014 6:29 AM
> To: "dev@cloudstack.apache.org" <de...@cloudstack.apache.org>
> Cc: Chip Childers <ch...@gmail.org>, Daan Hoogland <
> daan.hoogland@gmail.com>, Chiradeep Vittal <ch...@citrix.com>,
> Kishan Kavala <Ki...@citrix.com>
> Subject: Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions
>
>   All,
>
>  I just sent a review request, so please take a look at it and let me
> know if you have any comments/suggests.
>
>  https://reviews.apache.org/r/17790/
>
>  Thanks
> Alex Ough
>
>
> On Mon, Jan 13, 2014 at 11:17 AM, Alex Ough <al...@sungard.com> wrote:
>
>> All,
>>
>>  I'd like to have some suggestion about 2 things related with this.
>>
>>  1. The 'Full Scan' management
>> Now, I set it running every time a user logs in to the UI, but I think it
>> will be necessary to make it run with some interval also.
>> But I'm not familiar with the config file, so can anyone give some
>> directions how to manage the time interval in the config file and the best
>> way to run it with the time interval?
>>
>>  2. Repository of regions with their login information.
>> To send/receive requests to/from other regions using API interfaces, we
>> need the region information including login info of each region.
>> I was planning to use a table as a repository, but I think it is better
>> to store it in the config file to make the access a little lighter.
>> Any recommendation on this?
>>
>>  Your reply with directions & comments will be very appreciated.
>> Thanks
>>  Alex Ough
>>
>>
>> On Wed, Jan 8, 2014 at 2:17 PM, Alex Ough <al...@sungard.com> wrote:
>>
>>> All,
>>>
>>>  A little bit of updates after a long vacation,
>>> I'm currently creating automated test scripts that randomly
>>> create/delete/update domain/account/user objects in random regions to
>>> trigger the sync-up and full scans regularly.
>>> Once they are completed, I'll post it in the github also and submit the
>>> review requests for this implementation.
>>>
>>>  Let me know if you have any comments.
>>> Thanks
>>>  Alex Ough
>>>
>>>
>>> On Wed, Dec 18, 2013 at 3:39 PM, Alex Ough <al...@sungard.com>wrote:
>>>
>>>> All,
>>>>
>>>> I updated the wiki after some logic changes, so please review them,
>>>> especially "Full Scan", which is newly introduced.
>>>>
>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Domain-Account-User+Sync+Up+Among+Multiple+Regions
>>>>
>>>> And I implemented this functionality in Java and you can get the pull
>>>> request of it here.
>>>> (This does not include the 'full scan' yet and I'm currently working
>>>> on this to finalize.)
>>>> https://github.com/alexoughsg/Albatross/pull/1
>>>>
>>>> Especially, I really want to have your review on the "Full Scan" logic
>>>> to confirm if it does not miss any cases.
>>>> Thanks for your interest and your feedback will be very helpful.
>>>> Alex Ough
>>>>
>>>> On Tue, Nov 12, 2013 at 6:00 PM, Alex Ough <al...@sungard.com>
>>>> wrote:
>>>> > Good point, Chiradeep,
>>>> >
>>>> > I'm not sure if you reviewed my design doc in the wiki, but my design
>>>> is to
>>>> > just skip any actions for target resources that already took place by
>>>> any
>>>> > means.
>>>> > But the issue is when conflict actions in the same resources (like
>>>> create &
>>>> > delete the same users) are enqueued in reversed orders, which is
>>>> hopefully
>>>> > rare.
>>>> >
>>>> > And to support consistency in the AP system, I'd like to provide a
>>>> full sync
>>>> > up, which will sync up all data in all region servers by selecting a
>>>> region
>>>> > as a master and force its data to other regions.
>>>> >
>>>> > Let me know what you think.
>>>> > Thanks
>>>> > Alex Ough
>>>> >
>>>> >
>>>> > On Tue, Nov 12, 2013 at 1:22 PM, Chiradeep Vittal
>>>> > <Ch...@citrix.com> wrote:
>>>> >>
>>>> >> Missed this one. In a single region, the CloudStack DB is the master
>>>> for
>>>> >> most operations. If the infra is not in the state the DB says it
>>>> should
>>>> >> be, generally the approach is to whack it and make it conform. For
>>>> some
>>>> >> exceptions (live migration/related use cases are exceptions) the DB
>>>> is the
>>>> >> slave -- the point is that the inconsistency that inevitably arise
>>>> in an
>>>> >> AP system need a conflict resolution system. In a single region, the
>>>> >> default is to assume that the MySQL DB is correct and handle other
>>>> cases
>>>> >> carefully.
>>>> >>
>>>> >> In a multi-region case, there is no master: disable an account in one
>>>> >> region, and it may not propagate to the other regions for many
>>>> hours/days.
>>>> >> You could add a user in one region and then add the same user in a
>>>> >> different region and conflict before the sync happens.
>>>> >>
>>>> >> This is of course not a problem unique to CloudStack -- people pay
>>>> mucho
>>>> >> dinero for Global AD/LDAP sync. I don't think this is a problem for
>>>> >> CloudStack core to solve: I support the event-based mechanism for
>>>> those
>>>> >> who want this facility.
>>>> >>
>>>> >> Distributed systems are hard and research continues to try and make
>>>> >> building these systems easier, but there are very few solutions for
>>>> global
>>>> >> state synchronization (Google Spanner comes to mind)
>>>> >>
>>>> >> --
>>>> >> Chiradeep
>>>> >>
>>>> >>
>>>> >> On 11/8/13 4:53 PM, "Chip Childers" <ch...@gmail.com> wrote:
>>>> >>
>>>> >> >We are already (generally) AP for most infra changes really. I'd
>>>> use that
>>>> >> >model. Eventual consistency is better in this scenario.
>>>> >> >
>>>> >> >> On Nov 8, 2013, at 6:49 PM, Chiradeep Vittal
>>>> >> >><Ch...@citrix.com> wrote:
>>>> >> >>
>>>> >> >> I'd also like to highlight that it isn't a trivial problem.
>>>> >> >> Let's say there's 3 regions: this means there are 3 copies of the
>>>> user
>>>> >> >> database that are geographically separated by network links that
>>>> fail
>>>> >> >> quite often (orders of magnitude more than intra-DC networks).
>>>> >> >>
>>>> >> >> Here we run into the consequences of the CAP theorem [1].
>>>> >> >> We can either have a CP or AP system: either approach makes some
>>>> >> >>tradeoffs:
>>>> >> >> 1. If we run a AP system, then the challenge is to resolve
>>>> conflicting
>>>> >> >> updates
>>>> >> >> 2. If we run a CP system, then the challenge is to detect
>>>> partitions
>>>> >> >> reliably and disallow updates during partitions.
>>>> >> >>
>>>> >> >> [1] http://en.wikipedia.org/wiki/CAP_theorem
>>>> >> >>
>>>> >> >>> On 11/7/13 11:58 AM, "Chip Childers" <ch...@apache.org>
>>>> wrote:
>>>> >> >>>
>>>> >> >>> On Thu, Nov 7, 2013 at 2:37 PM, Chiradeep Vittal
>>>> >> >>> <Ch...@citrix.com> wrote:
>>>> >> >>>> It may be an admin burden, but it has to be optional. There are
>>>> other
>>>> >> >>>> ways
>>>> >> >>>> to achieve global sync (e.g., LDAP/AD/Oauth).
>>>> >> >>>> A lot of service providers who run cloudstack have their own
>>>> user
>>>> >> >>>> database
>>>> >> >>>> / portal. In their implementations the CloudStack database is
>>>> not the
>>>> >> >>>> master source of user records, but a slave.
>>>> >> >>>
>>>> >> >>> +1 to it being optional.
>>>> >> >>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>

Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

Posted by Chiradeep Vittal <Ch...@citrix.com>.
I am uncomfortable with changes to GenericDaoBase. Was this really necessary? This feature was supposed to be "outside" CloudStack as much as possible and optional. Yet it touches the most sensitive code in CloudStack.

From: Alex Ough <al...@sungard.com>
Date: Thursday, February 6, 2014 6:29 AM
To: "dev@cloudstack.apache.org" <de...@cloudstack.apache.org>
Cc: Chip Childers <ch...@gmail.org>, Daan Hoogland <da...@gmail.com>, Chiradeep Vittal <ch...@citrix.com>, Kishan Kavala <Ki...@citrix.com>
Subject: Re: [DISCUSS] Domain/Account/User Sync Up Among Multiple Regions

All,

I just sent a review request, so please take a look at it and let me know if you have any comments/suggestions.

https://reviews.apache.org/r/17790/

Thanks
Alex Ough


On Mon, Jan 13, 2014 at 11:17 AM, Alex Ough <al...@sungard.com> wrote:
All,

I'd like some suggestions on two things related to this.

1. The 'Full Scan' management
Currently it runs every time a user logs in to the UI, but I think it will also need to run at a regular interval.
I'm not familiar with the config file, so can anyone point me to how the time interval should be managed there and the best way to run the scan at that interval?

2. Repository of regions with their login information.
To send/receive requests to/from other regions using API interfaces, we need the region information including login info of each region.
I was planning to use a table as a repository, but I think it is better to store it in the config file to make the access a little lighter.
Any recommendation on this?
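
For the interval question in item 1, one possible approach is a plain JDK
scheduler, sketched below. This is only an illustration under assumptions:
FullScanScheduler is a hypothetical name, the fullScan Runnable stands in for
the real scan, and the interval is assumed to be read from whichever
configuration mechanism is finally chosen.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that runs a full scan repeatedly (not a CloudStack class). */
final class FullScanScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Runs fullScan immediately and then again intervalMillis after each run finishes.
     * Note: if fullScan throws, the executor cancels all later runs, so the
     * scan should catch and log its own exceptions.
     */
    ScheduledFuture<?> start(Runnable fullScan, long intervalMillis) {
        return executor.scheduleWithFixedDelay(
                fullScan, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the scheduler; a scan that is currently running is interrupted. */
    void stop() {
        executor.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) is used here so that a slow scan
never overlaps with the next one.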

Your reply with directions & comments will be very appreciated.
Thanks
Alex Ough


On Wed, Jan 8, 2014 at 2:17 PM, Alex Ough <al...@sungard.com> wrote:
All,

A small update after a long vacation:
I'm currently creating automated test scripts that randomly create/delete/update domain/account/user objects in random regions to trigger the sync-up and full scans regularly.
Once they are complete, I'll also post them on GitHub and submit the review requests for this implementation.

Let me know if you have any comments.
Thanks
Alex Ough


On Wed, Dec 18, 2013 at 3:39 PM, Alex Ough <al...@sungard.com> wrote:
All,

I updated the wiki after some logic changes, so please review them,
especially "Full Scan", which is newly introduced.
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Domain-Account-User+Sync+Up+Among+Multiple+Regions

And I implemented this functionality in Java and you can get the pull
request of it here.
(This does not include the 'full scan' yet and I'm currently working
on this to finalize.)
https://github.com/alexoughsg/Albatross/pull/1

Especially, I really want to have your review on the "Full Scan" logic
to confirm if it does not miss any cases.
Thanks for your interest and your feedback will be very helpful.
Alex Ough

On Tue, Nov 12, 2013 at 6:00 PM, Alex Ough <al...@sungard.com> wrote:
> Good point, Chiradeep,
>
> I'm not sure if you reviewed my design doc in the wiki, but my design is to
> just skip any actions for target resources that already took place by any
> means.
> But the issue is when conflict actions in the same resources (like create &
> delete the same users) are enqueued in reversed orders, which is hopefully
> rare.
>
> And to support consistency in the AP system, I'd like to provide a full sync
> up, which will sync up all data in all region servers by selecting a region
> as a master and force its data to other regions.
>
> Let me know what you think.
> Thanks
> Alex Ough
>
>
> On Tue, Nov 12, 2013 at 1:22 PM, Chiradeep Vittal
> <Ch...@citrix.com>> wrote:
>>
>> Missed this one. In a single region, the CloudStack DB is the master for
>> most operations. If the infra is not in the state the DB says it should
>> be, generally the approach is to whack it and make it conform. For some
>> exceptions (live migration/related use cases are exceptions) the DB is the
>> slave -- the point is that the inconsistency that inevitably arise in an
>> AP system need a conflict resolution system. In a single region, the
>> default is to assume that the MySQL DB is correct and handle other cases
>> carefully.
>>
>> In a multi-region case, there is no master: disable an account in one
>> region, and it may not propagate to the other regions for many hours/days.
>> You could add a user in one region and then add the same user in a
>> different region and conflict before the sync happens.
>>
>> This is of course not a problem unique to CloudStack -- people pay mucho
>> dinero for Global AD/LDAP sync. I don't think this is a problem for
>> CloudStack core to solve: I support the event-based mechanism for those
>> who want this facility.
>>
>> Distributed systems are hard and research continues to try and make
>> building these systems easier, but there are very few solutions for global
>> state synchronization (Google Spanner comes to mind)
>>
>> --
>> Chiradeep
>>
>>
>> On 11/8/13 4:53 PM, "Chip Childers" <ch...@gmail.com>> wrote:
>>
>> >We are already (generally) AP for most infra changes really. I'd use that
>> >model. Eventual consistency is better in this scenario.
>> >
>> >> On Nov 8, 2013, at 6:49 PM, Chiradeep Vittal
>> >><Ch...@citrix.com>> wrote:
>> >>
>> >> I'd also like to highlight that it isn't a trivial problem.
>> >> Let's say there's 3 regions: this means there are 3 copies of the user
>> >> database that are geographically separated by network links that fail
>> >> quite often (orders of magnitude more than intra-DC networks).
>> >>
>> >> Here we run into the consequences of the CAP theorem [1].
>> >> We can either have a CP or AP system: either approach makes some
>> >>tradeoffs:
>> >> 1. If we run a AP system, then the challenge is to resolve conflicting
>> >> updates
>> >> 2. If we run a CP system, then the challenge is to detect partitions
>> >> reliably and disallow updates during partitions.
>> >>
>> >> [1] http://en.wikipedia.org/wiki/CAP_theorem
>> >>
>> >>> On 11/7/13 11:58 AM, "Chip Childers" <ch...@apache.org>> wrote:
>> >>>
>> >>> On Thu, Nov 7, 2013 at 2:37 PM, Chiradeep Vittal
>> >>> <Ch...@citrix.com>> wrote:
>> >>>> It may be an admin burden, but it has to be optional. There are other
>> >>>> ways
>> >>>> to achieve global sync (e.g., LDAP/AD/Oauth).
>> >>>> A lot of service providers who run cloudstack have their own user
>> >>>> database
>> >>>> / portal. In their implementations the CloudStack database is not the
>> >>>> master source of user records, but a slave.
>> >>>
>> >>> +1 to it being optional.
>> >>
>>
>>
>