Posted to dev@cloudstack.apache.org by Rafael Weingärtner <ra...@gmail.com> on 2016/01/03 13:19:53 UTC

Re: Let’s discuss database upgrades

Sorry for the delay in answering your inquiries; I was AFK over the New
Year’s Eve period.

Thanks to all for the contributions.
I will comment on your questions and suggestions as follows:

Ron, I understand your point that some projects do not allow database
changes (schema changes) in minor version releases. We could define that as
a standard; I do not see a problem with that, as long as we have consensus.
What we have to keep in mind is that we could still have scripts that do
not change the DB’s schema, but add some data to a table in a minor
version.

Having said that, we are looking for a way to make the upgrade process
smoother and to avoid creating upgrade paths manually with scripts such as
<currentVersion>to<newerVersion>, because that way we have to cover every
single upgrade path by hand. We can work that out by using a tool to “build
and execute” the upgrade path, together with the standard for creating and
naming upgrade routines that we have been discussing earlier in this thread.
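
To make the idea of “building” the upgrade path concrete, here is a minimal
Java sketch. It assumes the V_<numberOfVersion>_<sequencial>.<fileExtension>
naming proposed earlier in this thread, and it assumes that every version is
encoded as three digits (4.1.0 becomes 410). The class UpgradePathSketch and
its plan method are hypothetical names used only for illustration; they are
not part of ACS or Flywaydb, and a real tool would do this bookkeeping itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not ACS or Flywaydb code): selects and orders upgrade
// routines named V_<version>_<sequence>.<ext> for a given upgrade.
class UpgradePathSketch {

    // Assumption: versions are written as plain digits, e.g. 4.1.0 -> 410.
    private static final Pattern NAME =
            Pattern.compile("V_(\\d+)_([A-Za-z0-9]+)\\.(sql|java)");

    // Returns the routines with currentVersion < version <= targetVersion,
    // ordered by version and then by sequence marker (case-insensitive).
    static List<String> plan(List<String> routines, int currentVersion, int targetVersion) {
        List<String[]> selected = new ArrayList<>();
        for (String routine : routines) {
            Matcher m = NAME.matcher(routine);
            if (!m.matches()) {
                continue; // ignore files that do not follow the convention
            }
            int version = Integer.parseInt(m.group(1));
            if (version > currentVersion && version <= targetVersion) {
                String sequence = m.group(2).toUpperCase(Locale.ROOT);
                selected.add(new String[] {routine, m.group(1), sequence});
            }
        }
        selected.sort((a, b) -> {
            int byVersion = Integer.compare(Integer.parseInt(a[1]), Integer.parseInt(b[1]));
            return byVersion != 0 ? byVersion : a[2].compareTo(b[2]);
        });
        List<String> ordered = new ArrayList<>();
        for (String[] entry : selected) {
            ordered.add(entry[0]);
        }
        return ordered;
    }
}
```

With this approach nobody has to maintain a per-version mapping by hand: the
path from 4.0.0 to 4.3.0 is simply every routine whose version is greater than
400 and at most 430, in order.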

Erik, there is a tool to do that. As I mentioned in my previous emails
there is a tool called Flywaydb that does exactly what you mentioned.
However, that tool will require an improvement in the way we create and
name upgrade routines; those changes have been cited and discussed earlier.

Paul, about your inquiries:
When you say rollback, do you mean a downgrade after an upgrade? If so, we
discussed that earlier in this thread and agreed that we would not cover
downgrades, at least for now. Before the upgrade, the admin should make a
proper copy of his/her database, to be restored if a problem happens.

About the downtime you mentioned, do you mean the need to stop all of the
MS while executing the upgrade?
As an administrator of a cloud that is built on top of ACS, my experience
is quite the opposite of yours. If I do not look at the source code, I find
the upgrade procedure pretty easy to follow and execute, given that we just
need to stop all MS and update them with apt-get.
Even if we build a tool as Rohit suggested, the downtime would still exist:
while the database is being upgraded, the old release of the MS would have
to be stopped, otherwise we could get errors caused by the DB schema
changes. As I said in an earlier email, I do not see the need to create a
new tool that is just a wrapper. I prefer to define a standard to create and
name upgrade routines and then use a tool such as Flywaydb directly, which
would allow us to manage configuration only, instead of wrapper code. IMO
the less code the better.

Paul and Remi, now with Remi’s explanation I understand what you meant by
“downtime”. As Remi said, the other stack is far worse to upgrade.
OpenStack has a tool similar to the suggested “Chimp” that seems to cover
rollbacks. However, I found their upgrade process worse than ours.

We are discussing DB upgrade routines here. I understand that the upgrade
problem as a whole needs to cover aspects such as the SystemVMs upgrade.
However, I think that point can and should be discussed in a separate
thread, since it concerns a different part of the ACS source code.

About reverting an upgrade, I do not find it hard at all; it is basically
restoring the DBs “cloud” and “cloud_usage” to their state prior to the
upgrade (given that the ACS upgrade page states that you should back up
your databases). Maybe because I am a developer, I do not see much of a
problem with that.

Bottom line:

There is a tool that can help us with upgrade routines for the DB. What we
need is a consensus on how to create and name upgrade routines and on the
tool that we can use to build and execute the upgrade path. I think we all
agreed on the standards we discussed earlier.

Can I create a page in the ACS wiki formalizing the points we discussed
here regarding ACS DB upgrade routines?
I tried to create a child page in
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Developers, but it
seems that I do not have permission. After that, I can start working on a
PR to add Flywaydb to ACS.


On Wed, Dec 30, 2015 at 2:41 PM, Ron Wheeler <rwheeler@artifact-software.com
> wrote:

> On 30/12/2015 4:58 AM, Remi Bergsma wrote:
>
>> Hoi Paul,
>>
>> Agree that the user perspective is important, thanks for bringing that up.
>>
> It is also worth pointing out that once you get into the SMB space, the
> system admin may wear a few hats and is not dedicated full time to
> maintaining Cloudstack.
> If it works most of the time the way it is supposed to, the admin is not
> spending any time working with the guts of Cloudstack.
> Once it is up and running, the skills and knowledge will decay pretty
> quickly.
> There is a need for an upgrade that works reliably and has good tests that
> can be quickly tried to see that the upgrade has worked or needs to be
> reverted.
>
>
>> Remember that the other “Stack” is far worse in upgrades, so it’s all
>> about perspective.
>>
>
> I guess being the second worst stack is comforting in some way. :-)
>
>>   Having said that, I also want it to be smooth and we absolutely need it
>> to be outside of the main repo and able to rollback if stuff goes wrong (so
>> users can retry).
>>
>> The biggest other issue I see in upgrading is the systemvm replacement
>> and having to reboot (100s or 1000s of routers). That’s where your real
>> downtime is most of the time.
>>
> If you have done all that and have to revert, it is not very comforting to
> know that most of the time you wasted was spent in a fairly stable process
> and that the downtime can be chalked up to the size of the server
> population. The users will be happy with that, I suppose.
>
>> Although upgrading from 4.6 to 4.7 takes under 5 minutes (stop ACS,
>> replace RPM and start it again) and no systemvm template needed to be
>> replaced. That’s more like it already ;-)
>>
>
> That sounds more like what I need!
>
>
>
>
>> Regards,
>> Remi
>>
>>
>> From: Paul Angus <paul.angus@shapeblue.com<mailto:
>> paul.angus@shapeblue.com>>
>> Reply-To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <
>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>> Date: Wednesday 30 December 2015 10:10
>> To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <
>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>> Subject: RE: Let’s discuss database upgrades
>>
>> Hi Guys, from the user's perspective, there are two points which come up
>> again and again -
>>
>> 1. lack a prescribed roll back if an upgrade goes badly
>> 2. The downtime involved in doing upgrades.
>>
>> - Upgrades are seen as CloudStack's biggest 'issue'.
>>
>> I've had to rescue enough upgrades to understand how complicated it is;
>> however with the increased release velocity, the admin's experience of
>> doing these upgrades needs to be taken into account or we will lose users
>> because of the increased admin overhead and downtime.
>>
>> The purpose of Rohit's CloudChimp was to find a suitable tool/method to
>> carry out schema changes *without downtime*. You guys are far better placed
>> to argue the merits of any one solution than me.
>>
>> I would just ask that you keep in mind what the users are looking for -
>> relatively clean and recoverable upgrade process.
>>
>>
>>
>>
>> Paul Angus
>> VP Technology, ShapeBlue
>>
>> -----Original Message-----
>> From: Erik Weber [mailto:terbolous@gmail.com]
>> Sent: 29 December 2015 21:45
>> To: dev <de...@cloudstack.apache.org>>
>> Subject: Re: Let’s discuss database upgrades
>>
>> On Mon, Dec 28, 2015 at 2:16 PM, Rafael Weingärtner <
>> rafaelweingartner@gmail.com<ma...@gmail.com>> wrote:
>>
>> Hi all devs,
>>> First of all, sorry the long text, but I hope we can start a
>>> discussion here and improve that part of ACS.
>>>
>>> A while ago I have faced the code that Apache CloudStack (ACS) uses to
>>> upgrade from a version to newer one and that did not seem to be a good
>>> way to execute our upgrades. Therefore, I decided to use some time to
>>> search for alternatives.
>>>
>>> I have read some material about versioning of scripts used to upgrade
>>> a database (DB) of a system and went through some frameworks that
>>> could help us.
>>>
>>> In the literature of software engineering, it is firmly stated that we
>>> have to version DB scripts as we do with the source code of the
>>> application, using the baseline approach. Gladly, we were not that bad
>>> at this point, we already versioned our routines for DB upgrade (.sql
>>> and .java). Therefore, it seemed that we just did not have used a
>>> practical approach to help us during DB upgrades.
>>>
>>>  From my readings and looking at the ACS source code I raised the
>>> following
>>> requirement:
>>> • We should be able to write more than one routine to upgrade to a
>>> version; those routines can be written in Java and SQL. We might have
>>> more than a routine to be executed for each version and we should be
>>> able to define an order of execution. Additionally, to go to an upper
>>> version, we have to run all of the routines from smaller versions
>>> first, until we achieve the desired version.
>>>
>>> We could also add another requirement that is the downgrade from a
>>> version, which we currently do not support. With that comes my first
>>> question for
>>> discussion:
>>> • Do we want/need a method to downgrade from a version to a previous
>>> one?
>>>
>>> I found an explanation for not supporting downgrades, and I liked it:
>>> http://flywaydb.org/documentation/faq.html#downgrade
>>>
>>> So, what I devised for us:
>>> First the bureaucracy part - our migrations occur basically in three
>>> (3) steps, first we have a "prepare script", then a cleanup script and
>>> finally the migration per se that is written in Java, at least, that
>>> is what we can expect when reading the interface
>>> “com.cloud.upgrade.dao.DbUpgrade”.
>>>
>>> Additionally, our scripts have the following naming convention:
>>> schema-<currentVersion>to<desiredVersion>, which IMHO may cause
>>> some confusion because at first sight we may think that from the same
>>> version we could have different paths to an upper version, which in
>>> practice is not happening. Instead of <currentVersion>to<version> we
>>> could simply use V_<numberOfVersion>_<sequencial>.<fileExtension>,
>>> given that we have to execute all of the V_<version> scripts up to and
>>> including the version we want to upgrade to.
>>>
>>> To clarify what I am saying, I will use an example. Let’s say we have
>>> just installed ACS and ran the cloudstack-setup-database. That command
>>> will create a database schema in version 4.0.0. To upgrade that schema
>>> to version 4.3.0 (it is just an example, it could be any other
>>> version), ACS will use the following mapping:
>>>
>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
>>>
>>> After loading the mapping, ACS will execute the scripts defined in
>>> each one of the Upgrade path classes and the migration code per se.
>>>
>>> Now, let’s say we change the “.sql” scripts name to the pattern I
>>> mentioned, we would have the following scripts; those are the scripts
>>> found that aim to upgrade to versions between the interval 4.0.0 –
>>> 4.3.0 (considering 4.3.0, since that is the goal version):
>>>
>>>
>>> - schema-40to410, can be named to: V_410_A.sql
>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
>>> - schema-410to420, can be named to: V_420_A.sql
>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
>>> - schema-420to421, can be named to: V_421_A.sql
>>> - schema-421to430, can be named to: V_430_A.sql
>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
>>>
>>>
>>> Additionally, all of the java code would have to follow the same
>>> convention. For instance, we have
>>> “com.cloud.upgrade.dao.Upgrade40to41”,
>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea is
>>> to extract that migration code to a Java class named: V_410_C.java,
>>> giving that it has to execute the SQL scripts before the java code.
>>>
>>> In order to go from a smaller version (4.0.0) to an upper one (4.3.0),
>>> we have to run all of the migration routines from intermediate
>>> versions. That is what we are already doing, but we do all of that
>>> manually.
>>>
>>> Bottom line, I think we could simply use the convention
>>> V_<numberOfVersion>_<sequencial>.<fileExtension> to name upgrade
>>> routines.
>>> That would make it easier for us to use a framework to help with that
>>> process.
>>> Additionally, I believe that we should always assume that to go from a
>>> smaller version to a higher one, we should run all of the scripts that
>>> exist between them. What do you guys think of that?
>>>
>>> After the bureaucracy, we can discuss tools. If we use that convention
>>> to name migration (upgrade) routines, we can start thinking on tools
>>> to support our migration process. I found two (2) promising ones:
>>> Liquibase and Flywaydb (both seem to be under Apache license, but the
>>> first one has an enterprise version?!). After reading the
>>> documentation and some usage examples I found the flywaydb easier and
>>> simpler to use.
>>>
>>> Which tools do you know of that could help us manage the database
>>> upgrade without needing to code the upgrade path ourselves?
>>>
>>> After that, I think we should decide if we should create another
>>> project/component to take care of migrations, or we can just add the
>>> dependency of the tool to a project such as “cloud-framework-db” and
>>> start using it.
>>>
>>> The “cloud-framework-db” project seems to have a focus on other things
>>> such as managing transactions and generating SQLs from annotations
>>> (?!? That should be a topic for another discussion). Therefore, I
>>> would rather create a new project that has the specific goal of
>>> managing ACS DB upgrades. I would also move all of the routines (SQL and
>>> Java) to this new project.
>>> This project would be a module of the CloudStack project and it would
>>> execute the upgrade routines at the startup of ACS.
>>>
>>> I believe that going from a homemade solution to one that is more
>>> consolidated and used by other communities would be the way to go.
>>>
>>> I can volunteer myself to create a PR with the aforementioned changes
>>> and using flywaydb to manage our upgrades. However, I prefer to have a
>>> good discussion with other devs first, before starting coding.
>>>
>>> Do you have suggestions or points that should be raised before we
>>> start working on that?
>>>
>>>
>> This isn't my field of work, so forgive me if this is self explanatory or
>> something, but is there no tool like terraform/puppet or similar for
>> database work?
>> I mean, where you state you desired state and the tool handles it.
>>
>> To me it sounds like a good way would be if you could specify what you
>> want to exist (or not), and how it should look like.
>>
>> "I want table XYZ to exist with THESE columns having THIS type(s) and
>> THIS default value bla bla bla"
>>
>> Rather than handling a bunch of sql scripts that has to handle different
>> mysql versions (come to think about an issue with a mariadb version
>> crashing recently), a variety of cloudstack versions and a whole lot more.
>>
>> Disclaimer: i have no idea if this is what flywaydb does, if it is, then
>> just ignore this.
>>
>> --
>> Erik
>>
>
>
> --
> Ron Wheeler
> President
> Artifact Software Inc
> email: rwheeler@artifact-software.com
> skype: ronaldmwheeler
> phone: 866-970-2435, ext 102
>
>


-- 
Rafael Weingärtner

Re: Let’s discuss database upgrades

Posted by Rafael Weingärtner <ra...@gmail.com>.
Hi John,

Thanks for your contributions, and sorry for the late reply.

> I completely agree with Wido that the notion of the ACS version (e.g.
> 4.6.0, 4.6.1, 4.7.0, etc) should be a purely logical concept.  It points to
> a particular git hash, a version of the database schema, etc.  I also agree
> that supporting downgrade is a fool’s errand as many database schema changes
> are destructive.

I totally agree with you here.

> My largest issue with our upgrade process is that it requires management
> server downtime (leaving out the system VM for the moment).  On a 4-6
> month release cycle, a few maintenance windows a year is livable.  However,
> monthly maintenance windows become more and more onerous.

I understand your point regarding system VMs. However, that is not what we
are trying to reach a consensus on here. Additionally, I think that more
frequent releases do not necessarily mean more releases of system VMs.

> Consider the number of maintenance windows AWS opens where all or part of
> its services are completely unavailable.  Therefore, our tooling needs to
> favor additive, non-destructive schema changes that allow database upgrades
> to be applied while an older version of the management server is still
> running.  Combined with a clustered management server configuration, a
> CloudStack user could upgrade their database schema then perform a rolling
> upgrade of their management servers without taking any downtime.  By policy,
> we should only perform changes that require downtime (e.g. changes to the
> clustering protocol, destructive/offline schema changes, etc) in major
> versions (and potentially critical security/stability bug fixes).

That is a pretty good point. We could define a policy under which only
X.0.0 versions drop database tables and columns that are no longer
required. With that, we could have upgrades that do not require MS downtime
(using redundant nodes and a load balancer to receive users’ requests).
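
As a rough illustration of such a policy, the Java sketch below rejects
destructive statements unless the target version is an X.0.0 release. The
class SchemaChangePolicySketch, its methods, and the table name in the usage
note are hypothetical, and the pattern used to spot “destructive” SQL is
deliberately naive (it only looks for DROP TABLE, DROP COLUMN and TRUNCATE
TABLE); a real check would need a proper understanding of the SQL dialect.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical policy check (not ACS code): destructive schema changes are
// only accepted when the release being built is an X.0.0 version.
class SchemaChangePolicySketch {

    // Assumption: a statement is "destructive" if it drops a table or column
    // or truncates a table. This is a naive textual check, not a SQL parser.
    private static final Pattern DESTRUCTIVE =
            Pattern.compile("\\b(DROP\\s+(TABLE|COLUMN)|TRUNCATE\\s+TABLE)\\b");

    static boolean isMajorRelease(String version) {
        String[] parts = version.split("\\.");
        return parts.length == 3 && parts[1].equals("0") && parts[2].equals("0");
    }

    static boolean isAllowed(String version, String sql) {
        boolean destructive = DESTRUCTIVE.matcher(sql.toUpperCase(Locale.ROOT)).find();
        return !destructive || isMajorRelease(version);
    }
}
```

Under these assumptions, an ADD COLUMN on a table called example would be
accepted for 4.6.1, while a DROP COLUMN on the same table would only be
accepted for a release such as 5.0.0.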

> While it would be extra effort, developing a DSL for describing migrations
> would allow the tool to relieve developers from worrying about
> destructive/additive changes, as well as understanding if/when database
> changes must be made offline.  Finally, in order to execute upgrades
> separately from the management server upgrade, the upgrade tooling and
> execution must be pulled out of the management server and into a standalone
> utility.

I believe that is the point where we diverge. I do not think that we should
move the upgrade “code” out of the MS. I believe we should move from
homemade code to a better solution (not a silver bullet, but better). I like
the way ACS works during upgrades. Why use another tool just to run upgrade
routines? If we change the version of the ACS application, why shouldn’t the
system itself execute everything that is needed, such as the upgrade
routines? I would say the same about system VMs. If they are tied to a
specific version, why do we have to install them manually? ACS could do that
during upgrades.

> Another issue is that we don’t use our database tooling in development in
> the same way as users in production.  Anecdotally, developers vary in the
> way they upgrade their development databases based on their personal
> workflow.  We are committing one of the biggest operational sins: having a
> special process that is rarely executed.  Therefore, I agree with Daan that
> whichever tool we select, it should support a workflow that is efficient
> for both development and operations.  We should be eating our own dog food
> every day.

That happens because we use a homemade solution. If we use a tool like
FlywayDB, we can also integrate it with Maven for developers.

> I have two issues with FlywayDB.  First, it assumes that wall time will be
> synchronized and monotonically increasing.

I do not see a problem with that. For every release we have a tag, and every
release line has its own branch; within that branch, upgrade routines are
synchronized and monotonically increasing. Those upgrade routines are also
forwarded to master, which follows the same ordering.

> When things get out of order, they tend to fail quietly, causing subtle
> corruption.  Second, and most importantly, it assumes a linearly increasing
> set of releases.  This model works wonderfully for the deployment of
> internally developed web applications, but it does not work for software
> such as CloudStack.  As has been pointed out, minor revisions are released
> after minor releases (e.g. 4.5.3 after 4.6.0).  As Daan has also pointed
> out, in a perfect world, database changes would only occur in major or
> minor releases.  However, in reality, some critical bug fixes require
> database schema changes.  It is not acceptable to deny users critical
> defect fixes because we either cannot or will not make database changes in
> minor revisions.  In its current form, FlywayDB is unable to handle this
> situation because it does not know how to interleave the 4.5.3 schema
> changes into the 4.6.0 stream (or 4.7.0 or 4.8.0) since it branched off.

I understand your point here. You are talking about an upgrade routine that
is executed in a minor version (say, 4.5.3) that is released after a more
recent version such as 4.6.0 has been closed, and that may break some
things. I believe we can avoid such cases with policies on how to create
upgrade routines that aim to enable a simpler upgrade path.

> While I think the Chimp proposal could use some refinement, the basic idea
> of using a directed acyclic graph (DAG) to establish a chain of database
> mutations addresses the non-linearizable nature of our release process.
> Essentially, it is borrowing the model established by git to create a log
> of schema transformations.  Coupling a DAG with content hashing to identify
> each change (e.g. the SHA1 of the author, change date, and migration
> content), the set of changes required to transform a schema to another
> version can be determined at runtime.  Most importantly, in the same way
> git can determine that two revisions cannot be automatically merged,
> such a tool can deterministically fail if/when an upgrade from one version
> to another is not possible.  To me, a database schema management tool that
> leveraged the git tree to manage history and calculate the set of changes
> required to upgrade from one version to another would represent the gold
> standard.  I find this approach so powerful because it would leverage the
> standard git revision tracking semantics to identify database changes, the
> rebase/merge workflow to identify and resolve schema upgrade conflicts, and
> release tagging.
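
The DAG idea in the paragraph above can be sketched in a few lines of Java.
This is only an illustration under stated assumptions, not the Chimp design
and not an existing tool: the class MigrationDagSketch, its Migration type and
the plan method are hypothetical, each migration is identified by the SHA-1 of
its author, date and content, and an upgrade is considered possible only when
the current migration is an ancestor of the target one.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a content-addressed migration DAG (git-like model).
class MigrationDagSketch {

    static final class Migration {
        final String id;            // SHA-1 of author, date and content
        final List<String> parents; // ids of the migrations this one builds on

        Migration(String author, String date, String content, List<String> parents) {
            this.id = sha1(author + "\n" + date + "\n" + content);
            this.parents = new ArrayList<>(parents);
        }
    }

    private final Map<String, Migration> byId = new HashMap<>();

    Migration add(String author, String date, String content, List<String> parents) {
        Migration migration = new Migration(author, date, content, parents);
        byId.put(migration.id, migration);
        return migration;
    }

    // Returns the ids to apply, parents before children, to go from 'current'
    // to 'target'. Fails when 'current' is not an ancestor of 'target'.
    List<String> plan(String current, String target) {
        Set<String> alreadyApplied = new LinkedHashSet<>();
        collect(current, alreadyApplied);
        Set<String> needed = new LinkedHashSet<>();
        collect(target, needed);
        if (!needed.contains(current)) {
            throw new IllegalStateException("no upgrade path from " + current + " to " + target);
        }
        List<String> steps = new ArrayList<>(needed); // insertion order: parents first
        steps.removeAll(alreadyApplied);
        return steps;
    }

    // Post-order walk: a migration is added only after all of its ancestors.
    private void collect(String id, Set<String> out) {
        if (out.contains(id)) {
            return;
        }
        Migration migration = byId.get(id);
        if (migration == null) {
            throw new IllegalArgumentException("unknown migration " + id);
        }
        for (String parent : migration.parents) {
            collect(parent, out);
        }
        out.add(id);
    }

    static String sha1(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this model a 4.5.3 fix and the 4.6.0 changes are two children of the same
parent, and a later migration that lists both as parents plays the role of a
merge. Asking for a path between the two siblings fails explicitly instead of
silently applying routines in the wrong order.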

I find the tool you just described amazing. Such a tool would not need ACS
to exist. I see only one problem here: git manages the source code per se
and how changes are merged into the same files, whereas upgrade routines
change the structure or data of a database. The only way I see to detect
conflicts would be to run them against a DB.

> In summary, I do not believe that an off-the-shelf tool supports the
> combination of non-linear upgrade paths and online database migration we
> require.  Therefore, we will need to develop tooling of our own.  To me, the
> question is whether building a new tool from scratch or contributing to an
> existing project represents the shortest path to meeting these requirements.

I disagree with you here; I believe that we can make an off-the-shelf
solution work (of course, we may need to tune it a little bit to meet our
needs). Instead of starting with a complicated solution, we should try the
simpler way first. With versions being released more frequently, each
release will not carry many feature and database changes, and with a formal
protocol to create upgrade routines we can avoid problems when migrating
between minor and major releases.




On Mon, Jan 4, 2016 at 4:04 AM, John Burwell <jo...@shapeblue.com>
wrote:

> All,
>
> I completely agree with Wido that the notion of the ACS version (e.g.
> 4.6.0, 4.6.1, 4.7.0, etc) should be a purely logical concept.  It points to
> a particular git hash, a version of the database schema, etc.  I also agree
> that supporting downgrade is a fool’s errand as many database schema changes
> are destructive.
>
> My largest issue with our upgrade process is that it requires management
> server downtime (leaving out the system VM for the moment).  On a 4-6
> month release cycle, a few maintenance windows a year is livable.  However,
> monthly maintenance windows become more and more onerous.  Consider the
> number of maintenance windows AWS opens where all or part of its services
> are completely unavailable.  Therefore, our tooling needs to favor
> additive, non-destructive schema changes that allow database upgrades to be
> applied while an older version of the management server is still running.
> Combined with a clustered management server configuration, a CloudStack
> user could upgrade their database schema then perform a rolling upgrade of
> their management servers without taking any downtime.  By policy, we should
> only perform changes that require downtime (e.g. changes to the clustering
> protocol, destructive/offline schema changes, etc) in major versions (and
> potentially critical security/stability bug fixes).  While it would be
> extra effort, developing a DSL for describing migrations would allow the
> tool to relieve developers from worrying about destructive/additive changes, as
> well as, understanding if/when database changes must be made offline.
> Finally, in order to execute upgrades separately from the management server
> upgrade, the upgrade tooling and execution must be pulled out of the
> management server and into a standalone utility.
>
> Another issue is that we don’t use our database tooling in development in
> the same way as users in production.  Anecdotally, developers vary in the
> way they upgrade their development databases based on their personal
> workflow.  We are committing one of the biggest operational sins — having a
> special process that is rarely executed.  Therefore, I agree with Daan that
> which ever tool we select, it should support a workflow that is efficient
> for both development and operations.  We should be eating our own dog food
> everyday.
>
> I have two issues with FlywayDB.  First, it assumes that wall time will
> be synchronized and monotonically increasing.  When things get out of order,
> they tend to fail quietly — causing subtle corruption.  Second, and most
> importantly, it assumes a linearly increasing set of releases.  This model
> works wonderfully for the deployment of internally developed web
> applications, but it does not work for software such as CloudStack.  As has
> been pointed out, minor revisions are released after minor releases (e.g.
> 4.5.3 after 4.6.0).  As Daan has also pointed out, in a perfect world,
> database changes would only occur in major or minor releases.  However, in
> reality, some critical bug fixes require database schema changes.  It is
> not acceptable to deny users critical defect fixes because we either cannot
> or will not make database changes in minor revisions.  In its current form,
> FlywayDB is unable to handle this situation because it does not know how
> to interleave the 4.5.3 schema changes into the 4.6.0 stream (or 4.7.0 or
> 4.8.0) since it branched off.
>
> While I think the Chimp proposal could use some refinement, the basic idea
> of using a directed acyclic graph (DAG) to establish a chain of database
> mutations addresses the non-linearizable nature of our release process.
> Essentially, it is borrowing the model established by git to create a log
> of schema transformations.  Coupling a DAG with content hashing to identify
> each change (e.g. the SHA1 of the author, change date, and migration
> content),  the set of changes required to transform a schema to another
> version can be determined at runtime.   Most importantly, in the same way
> git can determine that two revisions cannot be automatically merged,
> such a tool can deterministically fail if/when an upgrade from one version to
> another is not possible.  To me, a database schema management tool that
> leveraged the git tree to manage history and calculate the set of changes
> required to upgrade from one version to another would represent the gold
> standard.  I find this approach so powerful because it would leverage the
> standard git revision tracking semantics to identify database changes, the
> rebase/merge workflow to identify and resolve schema upgrade conflicts, and
> release tagging.
>
> In summary, I do not believe that an off-the-shelf tool supports the
> combination of non-linear upgrade paths and online database migration we
> require.  Therefore, we will need to develop tooling of our own.  To me, the
> question is whether building a new tool from scratch or contributing to an
> existing project represents the shortest path to meeting these requirements.
>
> Thanks,
> -John
>
>
> > On Jan 3, 2016, at 6:07 PM, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
> >
> > That is it Ron ;)
> >
> > Initially, my intentions were only to change the technology, from a
> > homemade approach to an improved one to manage/run upgrade routines to the
> > DB. However, after giving some thought to the point you brought up, I think
> > that we can use this thread to discuss it too.
> >
> > To use Flywaydb as we have been discussing so far, we have to use some
> > naming standard as “YYYYMMDDHHmm” and the rules we have stated before. We
> > would have to link an ACS version to a marker (timestamp) of the release;
> > that could be used to control the upgrade with Flywaydb, since to go from a
> > version to another we have to run all of the script in between them; that
> > is controlled by the timestamp that would work as an incremental version
> > for upgrade routines.
> >
> > Additionally, we can have a maven profile to use Flywaydb for devs and a
> > Spring bean to manage upgrades in production environments.
> > If we have consensus, I am good on adding restrictions regarding the use of
> > upgrade routines only on X and X.Y; and not in X.Y.Z to a document that can
> > be used to guide devs and committers.
> >
> > On Sun, Jan 3, 2016 at 8:16 PM, Ron Wheeler <
> rwheeler@artifact-software.com>
> > wrote:
> >
> >> On 03/01/2016 7:19 AM, Rafael Weingärtner wrote:
> >>
> >>> Sorry the delay on answering your inquiries, during this period of New
> >>> Year’s Eve I was AFK.
> >>>
> >>> Thanks for the contributions of all.
> >>> I will comment your questions and suggestions as follows:
> >>>
> >>> Ron, I understand your point that there are some projects that do not
> >>> allow
> >>> database change in minor version releases (schema changes). We could
> >>> define
> >>> that as a standard, I do not see a problem on that, as long as we have
> >>> consensus. What we have to keep in mind is that we could still have
> >>> scripts
> >>> that do not change DB’s schema, but add some table into a table in a
> minor
> >>> version.
> >>>
> >>
> >> The main point for me is to make sure that there is a discussion before
> >> this happens and that a clear understanding of the technology debt that
> >> this creates is taken into account before it happens.
> >>
> >>
> >>
> >>> Having said that, we are looking for a way to make the upgrade process
> >>> smoother,  looking for a way to avoid creating upgrade path manually
> with
> >>> scripts such as <currentVersion>to<newerVersion>, because that way we
> have
> >>> to cover every single upgrade path manually. We can work that out
> using a
> >>> tool to “build and execute” the upgrade path, using a standard to
> create
> >>> and name upgrade routines we have been discussing earlier in this
> thread.
> >>>
> >>> Erik, there is a tool to do that. As I mentioned in my previous emails
> >>> there is a tool called Flywaydb that does exactly what you mentioned.
> >>> However, that tool will require an improvement in the way we create and
> >>> name upgrade routines; those changes have been cited and discussed
> >>> earlier.
> >>>
> >>> Paul, about your inquiries:
> >>> When you say rollback, do you mean downgrade after an upgrade? If so,
> we
> >>> have discussed that earlier in this thread and we agreed that we would
> not
> >>> cover downgrades, at least for now. The Admin during the upgrade should
> >>> properly make a copy of his/her database to be restored if a problem
> >>> happens.
> >>>
> >>> About the downtime you mentioned, do you mean the need to stop all of
> the
> >>> MS while executing the upgrade?
> >>> As a cloud administrator that is built on top of ACS, I find quite the
> >>> opposite of you. If I do not look at the source code, I find the
> upgrade
> >>> procedure pretty easy to follow and execute, giving that we just need
> to
> >>> stop all MS and update it with apt-get.
> >>> Even if we build a tool as Rohit suggested, the downtime would exist,
> >>> while
> >>> upgrading the database old release of MS would have to be stopped,
> >>> otherwise we could receive errors with DB’s schemas change. As I said
> in
> >>> some email earlier, I do not find the need to create a new tool that is
> >>> just a wrapper. I prefer to define a standard to create and name
> upgrade
> >>> routines and then use a tool such as Flywaydb directly, which would
> allow
> >>> us to manage solely configurations, instead of wrapper code. IMO the
> less
> >>> code the better.
> >>>
> >>> Paul and Remi, now with Remi’s explanation I understand what you meant
> >>> with
> >>> “downtime”. As Remi’s said the others stack are far worse to upgrade.
> >>> OpenStack has a tool such as the suggested “Chimp” that seems to cover
> >>> rollbacks. However, I found their upgrade process worse than ours.
> >>>
> >>> We are discussing DB upgrade routines here, I understand the problem of
> >>> upgrade as a whole that needs to cover aspect such as SystemVMs
> upgrade.
> >>> However, I think that point should and can be discussed in a separated
> >>> thread; as a consequence of that it is a different part of ACS source
> >>> code.
> >>>
> >>> About reverting an upgrade, I do not find it hard at all; it is
> basically
> >>> restoring the DBs “cloud” and “cloud_usage” to their state prior the
> >>> upgrade (giving that in ACS upgrade page, it is stated that you should
> >>> backup your databases). Maybe because I am a developer, I do not see
> much
> >>> problem with that.
> >>>
> >>> Bottom line:
> >>>
> >>> There is a tool that can help us with upgrade routines for DB, what we
> >>> need
> >>> is a consensus on how to create and name upgrade routines and the tool
> >>> that
> >>> we can use to build and execute the upgrade path. I think we all agreed
> >>> with the standards we had discussed earlier.
> >>>
> >>> Can I create a page in the ACS wiki formalizing the points we discussed
> >>> here in regards to ACS DB’s upgrade routines?
> >>> I tried to create a child page in
> >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Developers,
> but it
> >>> seems that I do not have permission. After that, I can start working
> in a
> >>> PR to change add flywaydb to ACs.
> >>>
> >>>
> >>> On Wed, Dec 30, 2015 at 2:41 PM, Ron Wheeler <
> >>> rwheeler@artifact-software.com
> >>>
> >>>> wrote:
> >>>> On 30/12/2015 4:58 AM, Remi Bergsma wrote:
> >>>>
> >>>> Hoi Paul,
> >>>>>
> >>>>> Agree that the user perspective is important, thanks for bringing
> that
> >>>>> up.
> >>>>>
> >>>>> It is also worth pointing out that once you get into the SMB space,
> the
> >>>> system admin may wear a few hats and is not dedicated full time to
> >>>> maintaining Cloudstack.
> >>>> If it works most of the time the way it is supposed to, the admin is
> not
> >>>> spending any time working with the guts of Cloudstack.
> >>>> Once it is up and running, the skills and knowledge will decay pretty
> >>>> quickly.
> >>>> There is a need for an upgrade that works reliably and has good tests
> >>>> that
> >>>> can be quickly tried to see that the upgrade has worked or needs to be
> >>>> reverted.
> >>>>
> >>>>
> >>>> Remember that the other “Stack” is far worse in upgrades, so it’s all
> >>>>> about perspective.
> >>>>>
> >>>>> I guess being the second worst stack is comforting in some way. :-)
> >>>>
> >>>>   Having said that, I also want it to be smooth and we absolutely need
> >>>>> it
> >>>>> to be outside of the main repo and able to rollback if stuff goes
> wrong
> >>>>> (so
> >>>>> users can retry).
> >>>>>
> >>>>> The biggest other issue I see in upgrading is the systemvm
> replacement
> >>>>> and having to reboot (100s or 1000s of routers). That’s where your
> real
> >>>>> downtime is most of the time.
> >>>>>
> >>>>> If you have done all that and have to revert, it is not very
> comforting
> >>>> to
> >>>> know that most of the time you wasted was spent in a fairly stable
> >>>> process
> >>>> and that the downtime can be chalked up to the size of the server
> >>>> population. The users will be happy with that, I suppose.
> >>>>
> >>>> Although upgrading from 4.6 to 4.7 takes under 5 minutes (stop ACS,
> >>>>> replace RPM and start it again) and no systemvm template needed to be
> >>>>> replaced. That’s more like it already ;-)
> >>>>>
> >>>>> That sounds more like what I need!
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Regards,
> >>>>> Remi
> >>>>>
> >>>>>
> >>>>> From: Paul Angus <paul.angus@shapeblue.com<mailto:
> >>>>> paul.angus@shapeblue.com>>
> >>>>> Reply-To: "dev@cloudstack.apache.org<mailto:
> dev@cloudstack.apache.org>"
> >>>>> <
> >>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
> >>>>> Date: Wednesday 30 December 2015 10:10
> >>>>> To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <
> >>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
> >>>>> Subject: RE: Let’s discuss database upgrades
> >>>>>
> >>>>> Hi Guys, from the user's perspective, there are two points which
> come up
> >>>>> again and again -
> >>>>>
> >>>>> 1. lack a prescribed roll back if an upgrade goes badly
> >>>>> 2. The downtime involved in doing upgrades.
> >>>>>
> >>>>> - Upgrades are seen as CloudStack's biggest 'issue'.
> >>>>>
> >>>>> I've had to rescue enough upgrades to understand how complicated it
> is;
> >>>>> however with the increased release velocity, the admin's experience
> of
> >>>>> doing these upgrades needs to be taken into account or we will lose
> >>>>> users
> >>>>> because of the increased admin overhead and downtime.
> >>>>>
> >>>>> The purpose of Rohit's CloudChimp was to find a suitable tool/method
> to
> >>>>> carry out schema changes *without downtime*. You guys are far better
> >>>>> placed
> >>>>> to argue the merits of any one solution than me.
> >>>>>
> >>>>> I would just ask that you keep in mind what the users are looking
> for -
> >>>>> relatively clean and recoverable upgrade process.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> [ShapeBlue]<http://www.shapeblue.com>
> >>>>> Paul Angus
> >>>>> VP Technology   ,       ShapeBlue
> >>>>>
> >>>>>
> >>>>> d:      +44 203 617 0528 | s: +44 203 603 0540
> >>>>> <tel:+44%20203%20617%200528%20|%20s:%20+44%20203%20603%200540>
> >>>>>    |      m:      +44 7711 418784<tel:+44%207711%20418784>
> >>>>>
> >>>>> e:      paul.angus@shapeblue.com | t: @cloudyangus<mailto:
> >>>>> paul.angus@shapeblue.com%20|%20t:%20@cloudyangus>      |      w:
> >>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>>
> >>>>> a:      53 Chandos Place, Covent Garden London WC2N 4HS UK
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Erik Weber [mailto:terbolous@gmail.com]
> >>>>> Sent: 29 December 2015 21:45
> >>>>> To: dev <dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org
> >>
> >>>>> Subject: Re: Let’s discuss database upgrades
> >>>>>
> >>>>> On Mon, Dec 28, 2015 at 2:16 PM, Rafael Weingärtner <
> >>>>> rafaelweingartner@gmail.com<ma...@gmail.com>>
> wrote:
> >>>>>
> >>>>> Hi all devs,
> >>>>>
> >>>>>> First of all, sorry the long text, but I hope we can start a
> >>>>>> discussion here and improve that part of ACS.
> >>>>>>
> >>>>>> A while ago I have faced the code that Apache CloudStack (ACS) uses
> to
> >>>>>> upgrade from a version to newer one and that did not seem to be a
> good
> >>>>>> way to execute our upgrades. Therefore, I decided to use some time
> to
> >>>>>> search for alternatives.
> >>>>>>
> >>>>>> I have read some material about versioning of scripts used to
> upgrade
> >>>>>> a database (DB) of a system and went through some frameworks that
> >>>>>> could help us.
> >>>>>>
> >>>>>> In the literature of software engineering, it is firmly stated that
> we
> >>>>>> have to version DB scripts as we do with the source code of the
> >>>>>> application, using the baseline approach. Gladly, we were not that
> bad
> >>>>>> at this point, we already versioned our routines for DB upgrade
> (.sql
> >>>>>> and .java). Therefore, it seemed that we just did not have used a
> >>>>>> practical approach to help us during DB upgrades.
> >>>>>>
> >>>>>>  From my readings and looking at the ACS source code I raised the
> >>>>>> following
> >>>>>> requirement:
> >>>>>> • We should be able to write more than one routine to upgrade to a
> >>>>>> version; those routines can be written in Java and SQL. We might
> have
> >>>>>> more than a routine to be executed for each version and we should be
> >>>>>> able to define an order of execution. Additionally, to go to an
> upper
> >>>>>> version, we have to run all of the routines from smaller versions
> >>>>>> first, until we achieve the desired version.
> >>>>>>
> >>>>>> We could also add another requirement that is the downgrade from a
> >>>>>> version, which we currently do not support. With that comes my first
> >>>>>> question for
> >>>>>> discussion:
> >>>>>> • Do we want/need a method to downgrade from a version to a previous
> >>>>>> one?
> >>>>>>
> >>>>>> I found an explanation for not supporting downgrades, and I liked
> it:
> >>>>>> http://flywaydb.org/documentation/faq.html#downgrade
> >>>>>>
> >>>>>> So, what I devised for us:
> >>>>>> First the bureaucracy part - our migrations occur basically in three
> >>>>>> (3) steps, first we have a "prepare script", then a cleanup script
> and
> >>>>>> finally the migration per se that is written in Java, at least, that
> >>>>>> is what we can expect when reading the interface
> >>>>>> “com.cloud.upgrade.dao.DbUpgrade”.
> >>>>>>
> >>>>>> Additionally, our scripts have the following naming convention:
> >>>>>> schema-<currentVersion>to<desiredVersion>, which in IMHO may cause
> >>>>>> some confusion because at first sight we may think that from the
> same
> >>>>>> version we could have different paths to an upper version, which in
> >>>>>> practice is not happening. Instead of a <currentVersion>to<version>
> we
> >>>>>> could simply use V_<numberOfVersion>_<sequencial>.<fileExtension>,
> >>>>>> giving that, we have to execute all of the V_<version> scripts that
> >>>>>> are smaller than the version we want to upgrade.
> >>>>>>
> >>>>>> To clarify what I am saying, I will use an example. Let’s say we
> have
> >>>>>> just installed ACS and ran the cloudstack-setup-database. That
> command
> >>>>>> will create a database schema in version 4.0.0. To upgrade that
> schema
> >>>>>> to version 4.3.0 (it is just an example, it could be any other
> >>>>>> version), ACS will use the following mapping:
> >>>>>>
> >>>>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
> >>>>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
> >>>>>>
> >>>>>> After loading the mapping, ACS will execute the scripts defined in
> >>>>>> each one of the Upgrade path classes and the migration code per se.
> >>>>>>
> >>>>>> Now, let’s say we change the “.sql” scripts name to the pattern I
> >>>>>> mentioned, we would have the following scripts; those are the
> scripts
> >>>>>> found that aim to upgrade to versions between the interval 4.0.0 –
> >>>>>> 4.3.0 (considering 4.3.0, since that is the goal version):
> >>>>>>
> >>>>>>
> >>>>>> - schema-40to410, can be named to: V_410_A.sql
> >>>>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
> >>>>>> - schema-410to420, can be named to: V_420_A.sql
> >>>>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
> >>>>>> - schema-420to421, can be named to: V_421_A.sql
> >>>>>> - schema-421to430, can be named to: V_430_A.sql
> >>>>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
> >>>>>>
> >>>>>>
> >>>>>> Additionally, all of the java code would have to follow the same
> >>>>>> convention. For instance, we have
> >>>>>> “com.cloud.upgrade.dao.Upgrade40to41”,
> >>>>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea is
> >>>>>> to extract that migration code to a Java class named: V_410_C.java,
> >>>>>> giving that it has to execute the SQL scripts before the java code.
> >>>>>>
> >>>>>> In order to go from a smaller version (4.0.0) to an upper one
> (4.3.0),
> >>>>>> we have to run all of the migration routines from intermediate
> >>>>>> versions. That is what we are already doing, but we do all of that
> >>>>>> manually.
> >>>>>>
> >>>>>> Bottom line, I think we could simple use the convention
> >>>>>> V_<numberOfVersion>_<sequencial>.<fileExtension> to name upgrade
> >>>>>> routines.
> >>>>>> That would facilitate us to use a framework to help us with that
> >>>>>> process.
> >>>>>> Additionally, I believe that we should always assume that to go
> from a
> >>>>>> smaller version to a higher one, we should run all of the scripts
> that
> >>>>>> exist between them. What do you guys think of that?
> >>>>>>
> >>>>>> After the bureaucracy, we can discuss tools. If we use that
> convention
> >>>>>> to name migration (upgrade) routines, we can start thinking on tools
> >>>>>> to support our migration process. I found two (2) promising ones:
> >>>>>> Liquibase and Flywaydb (both seem to be under Apache license, but
> the
> >>>>>> first one has an enterprise version?!). After reading the
> >>>>>> documentation and some usage examples I found the flywaydb easier
> and
> >>>>>> simpler to use.
> >>>>>>
> >>>>>> What are the options of tools that we can use to help us manage the
> >>>>>> database upgrade, without needing to code the upgrade path that you
> >>>>>> know?
> >>>>>>
> >>>>>> After that, I think we should decide if we should create another
> >>>>>> project/component to take care of migrations, or we can just add the
> >>>>>> dependency of the tool to a project such as “cloud-framework-db” and
> >>>>>> start using it.
> >>>>>>
> >>>>>> The “cloud-framework-db” project seems to have a focus on other
> things
> >>>>>> such as managing transactions and generating SQLs from annotations
> >>>>>> (?!? That should be a topic for another discussion). Therefore, I
> >>>>>> would rather create a new project that has the specific goal of
> >>>>>> managing ACS DB upgrades. I would also move all of the routines (SQL
> >>>>>> and
> >>>>>> Java) to this new project.
> >>>>>> This project would be a module of the CloudStack project and it
> would
> >>>>>> execute the upgrade routines at the startup of ACS.
> >>>>>>
> >>>>>> I believe that going from a homemade solution to one that is more
> >>>>>> consolidated and used by other communities would be the way to go.
> >>>>>>
> >>>>>> I can volunteer myself to create a PR with the aforementioned
> changes
> >>>>>> and using flywaydb to manage our upgrades. However, I prefer to
> have a
> >>>>>> good discussion with other devs first, before starting coding.
> >>>>>>
> >>>>>> Do you have suggestions or points that should be raised before we
> >>>>>> start working on that?
> >>>>>>
> >>>>>>
> >>>>>> This isn't my field of work, so forgive me if this is self
> explanatory
> >>>>> or
> >>>>> something, but is there no tool like terraform/puppet or similar for
> >>>>> database work?
> >>>>> I mean, where you state you desired state and the tool handles it.
> >>>>>
> >>>>> To me it sounds like a good way would be if you could specify what
> you
> >>>>> want to exist (or not), and how it should look like.
> >>>>>
> >>>>> "I want table XYZ to exist with THESE columns having THIS type(s) and
> >>>>> THIS default value bla bla bla"
> >>>>>
> >>>>> Rather than handling a bunch of sql scripts that has to handle
> different
> >>>>> mysql versions (come to think about an issue with a mariadb version
> >>>>> crashing recently), a variety of cloudstack versions and a whole lot
> >>>>> more.
> >>>>>
> >>>>> Disclaimer: i have no idea if this is what flywaydb does, if it is,
> then
> >>>>> just ignore this.
> >>>>>
> >>>>> --
> >>>>> Erik
> >>>>>
> >>>> --
> >>>> Ron Wheeler
> >>>> President
> >>>> Artifact Software Inc
> >>>> email: rwheeler@artifact-software.com
> >>>> skype: ronaldmwheeler
> >>>> phone: 866-970-2435, ext 102
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >> --
> >> Ron Wheeler
> >> President
> >> Artifact Software Inc
> >> email: rwheeler@artifact-software.com
> >> skype: ronaldmwheeler
> >> phone: 866-970-2435, ext 102
> >>
> >>
> >
> >
> > --
> > Rafael Weingärtner
>
>



-- 
Rafael Weingärtner

Re: Let’s discuss database upgrades

Posted by John Burwell <jo...@shapeblue.com>.
All,

I completely agree with Wido that the notion of the ACS version (e.g. 4.6.0, 4.6.1, 4.7.0, etc.) should be a purely logical concept. It points to a particular git hash, a version of the database schema, etc. I also agree that supporting downgrade is a fool's errand, as many database schema changes are destructive.

My largest issue with our upgrade process is that it requires management server downtime (leaving out the system VMs for the moment). On a 4-6 month release cycle, a few maintenance windows a year is livable. However, monthly maintenance windows become more and more onerous. Consider the number of maintenance windows AWS opens where all or part of its services are completely unavailable. Therefore, our tooling needs to favor additive, non-destructive schema changes that allow database upgrades to be applied while an older version of the management server is still running. Combined with a clustered management server configuration, a CloudStack user could upgrade their database schema and then perform a rolling upgrade of their management servers without taking any downtime. By policy, we should only perform changes that require downtime (e.g. changes to the clustering protocol, destructive/offline schema changes, etc.) in major versions (and potentially critical security/stability bug fixes). While it would be extra effort, developing a DSL for describing migrations would allow the tool to relieve developers from worrying about destructive versus additive changes, as well as from working out if/when database changes must be made offline. Finally, in order to execute upgrades separately from the management server upgrade, the upgrade tooling and execution must be pulled out of the management server and into a standalone utility.

Another issue is that we don’t use our database tooling in development in the same way as users do in production. Anecdotally, developers vary in the way they upgrade their development databases based on their personal workflow. We are committing one of the biggest operational sins: having a special process that is rarely executed. Therefore, I agree with Daan that whichever tool we select, it should support a workflow that is efficient for both development and operations. We should be eating our own dog food every day.

I have two issues with FlywayDB. First, it assumes that wall time will be synchronized and monotonically increasing. When things get out of order, they tend to fail quietly, causing subtle corruption. Second, and most importantly, it assumes a linearly increasing set of releases. This model works wonderfully for the deployment of internally developed web applications, but it does not work for software such as CloudStack. As has been pointed out, minor revisions are released after minor releases (e.g. 4.5.3 after 4.6.0). As Daan has also pointed out, in a perfect world, database changes would only occur in major or minor releases. However, in reality, some critical bug fixes require database schema changes. It is not acceptable to deny users critical defect fixes because we either cannot or will not make database changes in minor revisions. In its current form, FlywayDB is unable to handle this situation because it does not know how to interleave the 4.5.3 schema changes into the 4.6.0 stream (or 4.7.0 or 4.8.0) since it branched off.
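
To make the second issue concrete, here is a minimal sketch of strictly linear version ordering. It is a simplified model written for this mail, not FlywayDB's code or API; the class and method names (LinearVersions, pending) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a purely linear migration scheme (not FlywayDB code). */
public class LinearVersions {

    /** Compares dotted versions numerically, e.g. 4.5.3 < 4.6.0. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Linear rule: run every available migration greater than the current version. */
    static List<String> pending(String current, List<String> available) {
        List<String> result = new ArrayList<>();
        for (String v : available) {
            if (compare(v, current) > 0) {
                result.add(v);
            }
        }
        result.sort(LinearVersions::compare);
        return result;
    }
}
```

With migrations 4.5.2, 4.5.3, 4.6.0 and 4.7.0 available, a database already at 4.6.0 only gets 4.7.0 under this rule, so a 4.5.3 fix released after 4.6.0 is never applied to it. A database at 4.5.2 gets 4.5.3 and then 4.6.0, even though the 4.6.0 script was written without knowledge of the 4.5.3 change.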

While I think the Chimp proposal could use some refinement, the basic idea of using a directed acyclic graph (DAG) to establish a chain of database mutations addresses the non-linearizable nature of our release process. Essentially, it borrows the model established by git to create a log of schema transformations. By coupling a DAG with content hashing to identify each change (e.g. the SHA1 of the author, change date, and migration content), the set of changes required to transform a schema to another version can be determined at runtime. Most importantly, in the same way that git can determine that two revisions cannot be automatically merged, such a tool can deterministically fail if/when an upgrade from one version to another is not possible. To me, a database schema management tool that leveraged the git tree to manage history and calculate the set of changes required to upgrade from one version to another would represent the gold standard. I find this approach so powerful because it would leverage the standard git revision tracking semantics to identify database changes, the rebase/merge workflow to identify and resolve schema upgrade conflicts, and release tagging.
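
A rough sketch of the DAG idea, to show how an upgrade plan could be computed at runtime. This is not Chimp and not an existing tool; MigrationDag, Migration and plan are hypothetical names, and folding the parent ids into the hash is my own assumption, borrowed from how git commits work.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: schema changes as nodes of a DAG, identified by content hash. */
public class MigrationDag {

    /** One schema change; parents are the ids of the changes it builds on. */
    static final class Migration {
        final String id;
        final List<String> parents;
        final String sql;

        Migration(String author, String date, String sql, String... parents) {
            this.parents = Arrays.asList(parents);
            this.sql = sql;
            // Identity derives from author, date, content and (assumption) parent ids.
            this.id = sha1(author + "\n" + date + "\n" + sql + "\n" + String.join(",", parents));
        }
    }

    private final Map<String, Migration> nodes = new HashMap<>();

    /** Registers a migration; parents must already exist, so no cycle can form. */
    Migration add(Migration m) {
        for (String p : m.parents) {
            if (!nodes.containsKey(p)) {
                throw new IllegalArgumentException("unknown parent " + p);
            }
        }
        nodes.put(m.id, m);
        return m;
    }

    /** Returns the not-yet-applied ancestors of target (target included), parents first. */
    List<Migration> plan(Set<String> applied, String target) {
        List<Migration> ancestors = new ArrayList<>();
        collect(target, new HashSet<>(), ancestors);
        Set<String> ancestorIds = new HashSet<>();
        for (Migration m : ancestors) {
            ancestorIds.add(m.id);
        }
        // Fail deterministically if the database holds a change the target never merged.
        for (String a : applied) {
            if (!ancestorIds.contains(a)) {
                throw new IllegalStateException("applied change " + a + " is not part of target");
            }
        }
        List<Migration> pending = new ArrayList<>();
        for (Migration m : ancestors) {
            if (!applied.contains(m.id)) {
                pending.add(m);
            }
        }
        return pending;
    }

    /** Depth-first walk that appends each node after all of its parents. */
    private void collect(String id, Set<String> seen, List<Migration> out) {
        if (!seen.add(id)) {
            return;
        }
        Migration m = nodes.get(id);
        if (m == null) {
            throw new IllegalStateException("unknown migration " + id);
        }
        for (String p : m.parents) {
            collect(p, seen, out);
        }
        out.add(m);
    }

    /** Hex-encoded SHA-1 of a string. */
    static String sha1(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(s.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this model a 4.5.3 fix and the 4.6.0 changes would be two children of the same parent. A database that has applied the 4.5.3 fix cannot be planned onto a target that does not contain it (plan throws), but it can be planned onto a later node that lists both as parents, which plays the role of a merge.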

In summary, I do not believe that an off-the-shelf tool supports the combination of non-linear upgrade paths and online database migration we require. Therefore, we will need to develop tooling of our own. To me, the question is whether building a new tool from scratch or contributing to an existing project represents the shortest path to meeting these requirements.

Thanks,
-John


> On Jan 3, 2016, at 6:07 PM, Rafael Weingärtner <ra...@gmail.com> wrote:
>
> That is it Ron ;)
>
> Initially, my intentions were only to change the technology, from a
> homemade approach to an improved one to manage/run upgrade routines to the
> DB. However, after giving some thought to the point you brought up, I think
> that we can use this thread to discuss it too.
>
> To use Flywaydb as we have been discussing so far, we have to use some
> naming standard as “YYYYMMDDHHmm” and the rules we have stated before. We
> would have to link an ACS version to a marker (timestamp) of the release;
> that could be used to control the upgrade with Flywaydb, since to go from a
> version to another we have to run all of the script in between them; that
> is controlled by the timestamp that would work as an incremental version
> for upgrade routines.
>
> Additionally, we can have a maven profile to use Flywaydb for devs and a
> Spring bean to manage upgrades in production environments.
> If we have consensus, I am good on adding restrictions regarding the use of
> upgrade routines only on X and X.Y; and not in X.Y.Z to a document that can
> be used to guide devs and committers.
>
> On Sun, Jan 3, 2016 at 8:16 PM, Ron Wheeler <rw...@artifact-software.com>
> wrote:
>
>> On 03/01/2016 7:19 AM, Rafael Weingärtner wrote:
>>
>>> Sorry the delay on answering your inquiries, during this period of New
>>> Year’s Eve I was AFK.
>>>
>>> Thanks for the contributions of all.
>>> I will comment your questions and suggestions as follows:
>>>
>>> Ron, I understand your point that there are some projects that do not
>>> allow
>>> database change in minor version releases (schema changes). We could
>>> define
>>> that as a standard, I do not see a problem on that, as long as we have
>>> consensus. What we have to keep in mind is that we could still have
>>> scripts
>>> that do not change DB’s schema, but add some table into a table in a minor
>>> version.
>>>
>>
>> The main point for me is to make sure that there is a discussion before
>> this happens and that a clear understanding of the technology debt that
>> this creates is taken into account before it happens.
>>
>>
>>
>>> Having said that, we are looking for a way to make the upgrade process
>>> smoother,  looking for a way to avoid creating upgrade path manually with
>>> scripts such as <currentVersion>to<newerVersion>, because that way we have
>>> to cover every single upgrade path manually. We can work that out using a
>>> tool to “build and execute” the upgrade path, using a standard to create
>>> and name upgrade routines we have been discussing earlier in this thread.
>>>
>>> Erik, there is a tool to do that. As I mentioned in my previous emails
>>> there is a tool called Flywaydb that does exactly what you mentioned.
>>> However, that tool will require an improvement in the way we create and
>>> name upgrade routines; those changes have been cited and discussed
>>> earlier.
>>>
>>> Paul, about your inquiries:
>>> When you say rollback, do you mean downgrade after an upgrade? If so, we
>>> have discussed that earlier in this thread and we agreed that we would not
>>> cover downgrades, at least for now. The Admin during the upgrade should
>>> properly make a copy of his/her database to be restored if a problem
>>> happens.
>>>
>>> About the downtime you mentioned, do you mean the need to stop all of the
>>> MS while executing the upgrade?
>>> As a cloud administrator that is built on top of ACS, I find quite the
>>> opposite of you. If I do not look at the source code, I find the upgrade
>>> procedure pretty easy to follow and execute, giving that we just need to
>>> stop all MS and update it with apt-get.
>>> Even if we build a tool as Rohit suggested, the downtime would exist,
>>> while
>>> upgrading the database old release of MS would have to be stopped,
>>> otherwise we could receive errors with DB’s schemas change. As I said in
>>> some email earlier, I do not find the need to create a new tool that is
>>> just a wrapper. I prefer to define a standard to create and name upgrade
>>> routines and then use a tool such as Flywaydb directly, which would allow
>>> us to manage solely configurations, instead of wrapper code. IMO the less
>>> code the better.
>>>
>>> Paul and Remi, now with Remi’s explanation I understand what you meant
>>> with
>>> “downtime”. As Remi’s said the others stack are far worse to upgrade.
>>> OpenStack has a tool such as the suggested “Chimp” that seems to cover
>>> rollbacks. However, I found their upgrade process worse than ours.
>>>
>>> We are discussing DB upgrade routines here, I understand the problem of
>>> upgrade as a whole that needs to cover aspect such as SystemVMs upgrade.
>>> However, I think that point should and can be discussed in a separated
>>> thread; as a consequence of that it is a different part of ACS source
>>> code.
>>>
>>> About reverting an upgrade, I do not find it hard at all; it is basically
>>> restoring the DBs “cloud” and “cloud_usage” to their state prior the
>>> upgrade (giving that in ACS upgrade page, it is stated that you should
>>> backup your databases). Maybe because I am a developer, I do not see much
>>> problem with that.
>>>
>>> Bottom line:
>>>
>>> There is a tool that can help us with upgrade routines for DB, what we
>>> need
>>> is a consensus on how to create and name upgrade routines and the tool
>>> that
>>> we can use to build and execute the upgrade path. I think we all agreed
>>> with the standards we had discussed earlier.
>>>
>>> Can I create a page in the ACS wiki formalizing the points we discussed
>>> here regarding ACS DB upgrade routines?
>>> I tried to create a child page in
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Developers, but it
>>> seems that I do not have permission. After that, I can start working on a
>>> PR to add Flywaydb to ACS.
>>>
>>>
>>> On Wed, Dec 30, 2015 at 2:41 PM, Ron Wheeler <
>>> rwheeler@artifact-software.com
>>>
>>>> wrote:
>>>> On 30/12/2015 4:58 AM, Remi Bergsma wrote:
>>>>
>>>> Hoi Paul,
>>>>>
>>>>> Agree that the user perspective is important, thanks for bringing that
>>>>> up.
>>>>>
>>>>> It is also worth pointing out that once you get into the SMB space, the
>>>> system admin may wear a few hats and is not dedicated full time to
>>>> maintaining Cloudstack.
>>>> If it works most of the time the way it is supposed to, the admin is not
>>>> spending any time working with the guts of Cloudstack.
>>>> Once it is up and running, the skills and knowledge will decay pretty
>>>> quickly.
>>>> There is a need for an upgrade that works reliably and has good tests
>>>> that
>>>> can be quickly tried to see that the upgrade has worked or needs to be
>>>> reverted.
>>>>
>>>>
>>>> Remember that the other “Stack” is far worse in upgrades, so it’s all
>>>>> about perspective.
>>>>>
>>>>> I guess being the second worst stack is comforting in some way. :-)
>>>>
>>>>   Having said that, I also want it to be smooth and we absolutely need
>>>>> it
>>>>> to be outside of the main repo and able to rollback if stuff goes wrong
>>>>> (so
>>>>> users can retry).
>>>>>
>>>>> The biggest other issue I see in upgrading is the systemvm replacement
>>>>> and having to reboot (100s or 1000s of routers). That’s where your real
>>>>> downtime is most of the time.
>>>>>
>>>>> If you have done all that and have to revert, it is not very comforting
>>>> to
>>>> know that most of the time you wasted was spent in a fairly stable
>>>> process
>>>> and that the downtime can be chalked up to the size of the server
>>>> population. The users will be happy with that, I suppose.
>>>>
>>>> Although upgrading from 4.6 to 4.7 takes under 5 minutes (stop ACS,
>>>>> replace RPM and start it again) and no systemvm template needed to be
>>>>> replaced. That’s more like it already ;-)
>>>>>
>>>>> That sounds more like what I need!
>>>>
>>>>
>>>>
>>>>
>>>> Regards,
>>>>> Remi
>>>>>
>>>>>
>>>>> From: Paul Angus <paul.angus@shapeblue.com<mailto:
>>>>> paul.angus@shapeblue.com>>
>>>>> Reply-To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>"
>>>>> <
>>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>>>>> Date: Wednesday 30 December 2015 10:10
>>>>> To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <
>>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>>>>> Subject: RE: Let’s discuss database upgrades
>>>>>
>>>>> Hi Guys, from the user's perspective, there are two points which come up
>>>>> again and again -
>>>>>
>>>>> 1. lack a prescribed roll back if an upgrade goes badly
>>>>> 2. The downtime involved in doing upgrades.
>>>>>
>>>>> - Upgrades are seen as CloudStack's biggest 'issue'.
>>>>>
>>>>> I've had to rescue enough upgrades to understand how complicated it is;
>>>>> however with the increased release velocity, the admin's experience of
>>>>> doing these upgrades needs to be taken into account or we will lose
>>>>> users
>>>>> because of the increased admin overhead and downtime.
>>>>>
>>>>> The purpose of Rohit's CloudChimp was to find a suitable tool/method to
>>>>> carry out schema changes *without downtime*. You guys are far better
>>>>> placed
>>>>> to argue the merits of any one solution than me.
>>>>>
>>>>> I would just ask that you keep in mind what the users are looking for -
>>>>> relatively clean and recoverable upgrade process.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> [ShapeBlue]<http://www.shapeblue.com>
>>>>> Paul Angus
>>>>> VP Technology   ,       ShapeBlue
>>>>>
>>>>>
>>>>> d:      +44 203 617 0528 | s: +44 203 603 0540
>>>>> <tel:+44%20203%20617%200528%20|%20s:%20+44%20203%20603%200540>
>>>>>    |      m:      +44 7711 418784<tel:+44%207711%20418784>
>>>>>
>>>>> e:      paul.angus@shapeblue.com | t: @cloudyangus<mailto:
>>>>> paul.angus@shapeblue.com%20|%20t:%20@cloudyangus>      |      w:
>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>
>>>>> a:      53 Chandos Place, Covent Garden London WC2N 4HS UK
>>>>>
>>>>>
>>>>> [cid:image182380.png@8ca21c21.40847519]
>>>>>
>>>>>
>>>>> Shape Blue Ltd is a company incorporated in England & Wales. ShapeBlue
>>>>> Services India LLP is a company incorporated in India and is operated
>>>>> under
>>>>> license from Shape Blue Ltd. Shape Blue Brasil Consultoria Ltda is a
>>>>> company incorporated in Brasil and is operated under license from Shape
>>>>> Blue Ltd. ShapeBlue SA Pty Ltd is a company registered by The Republic
>>>>> of
>>>>> South Africa and is traded under license from Shape Blue Ltd. ShapeBlue
>>>>> is
>>>>> a registered trademark.
>>>>> This email and any attachments to it may be confidential and are
>>>>> intended
>>>>> solely for the use of the individual to whom it is addressed. Any views
>>>>> or
>>>>> opinions expressed are solely those of the author and do not necessarily
>>>>> represent those of Shape Blue Ltd or related companies. If you are not
>>>>> the
>>>>> intended recipient of this email, you must neither take any action based
>>>>> upon its contents, nor copy or show it to anyone. Please contact the
>>>>> sender
>>>>> if you believe you have received this email in error.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Erik Weber [mailto:terbolous@gmail.com]
>>>>> Sent: 29 December 2015 21:45
>>>>> To: dev <de...@cloudstack.apache.org>>
>>>>> Subject: Re: Let’s discuss database upgrades
>>>>>
>>>>> On Mon, Dec 28, 2015 at 2:16 PM, Rafael Weingärtner <
>>>>> rafaelweingartner@gmail.com<ma...@gmail.com>> wrote:
>>>>>
>>>>> Hi all devs,
>>>>>
>>>>>> First of all, sorry the long text, but I hope we can start a
>>>>>> discussion here and improve that part of ACS.
>>>>>>
>>>>>> A while ago I have faced the code that Apache CloudStack (ACS) uses to
>>>>>> upgrade from a version to newer one and that did not seem to be a good
>>>>>> way to execute our upgrades. Therefore, I decided to use some time to
>>>>>> search for alternatives.
>>>>>>
>>>>>> I have read some material about versioning of scripts used to upgrade
>>>>>> a database (DB) of a system and went through some frameworks that
>>>>>> could help us.
>>>>>>
>>>>>> In the literature of software engineering, it is firmly stated that we
>>>>>> have to version DB scripts as we do with the source code of the
>>>>>> application, using the baseline approach. Gladly, we were not that bad
>>>>>> at this point, we already versioned our routines for DB upgrade (.sql
>>>>>> and .java). Therefore, it seemed that we just did not have used a
>>>>>> practical approach to help us during DB upgrades.
>>>>>>
>>>>>>  From my readings and looking at the ACS source code I raised the
>>>>>> following
>>>>>> requirement:
>>>>>> • We should be able to write more than one routine to upgrade to a
>>>>>> version; those routines can be written in Java and SQL. We might have
>>>>>> more than a routine to be executed for each version and we should be
>>>>>> able to define an order of execution. Additionally, to go to an upper
>>>>>> version, we have to run all of the routines from smaller versions
>>>>>> first, until we achieve the desired version.
>>>>>>
>>>>>> We could also add another requirement that is the downgrade from a
>>>>>> version, which we currently do not support. With that comes my first
>>>>>> question for
>>>>>> discussion:
>>>>>> • Do we want/need a method to downgrade from a version to a previous
>>>>>> one?
>>>>>>
>>>>>> I found an explanation for not supporting downgrades, and I liked it:
>>>>>> http://flywaydb.org/documentation/faq.html#downgrade
>>>>>>
>>>>>> So, what I devised for us:
>>>>>> First the bureaucracy part - our migrations occur basically in three
>>>>>> (3) steps, first we have a "prepare script", then a cleanup script and
>>>>>> finally the migration per se that is written in Java, at least, that
>>>>>> is what we can expect when reading the interface
>>>>>> “com.cloud.upgrade.dao.DbUpgrade”.
>>>>>>
>>>>>> Additionally, our scripts have the following naming convention:
>>>>>> schema-<currentVersion>to<desiredVersion>, which IMHO may cause
>>>>>> some confusion because at first sight we may think that from the same
>>>>>> version we could have different paths to an upper version, which in
>>>>>> practice is not happening. Instead of <currentVersion>to<version> we
>>>>>> could simply use V_<numberOfVersion>_<sequential>.<fileExtension>,
>>>>>> given that we have to execute all of the V_<version> scripts up to
>>>>>> and including the version we want to upgrade to.
>>>>>>
>>>>>> To clarify what I am saying, I will use an example. Let’s say we have
>>>>>> just installed ACS and ran the cloudstack-setup-database. That command
>>>>>> will create a database schema in version 4.0.0. To upgrade that schema
>>>>>> to version 4.3.0 (it is just an example, it could be any other
>>>>>> version), ACS will use the following mapping:
>>>>>>
>>>>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
>>>>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
>>>>>>
>>>>>> After loading the mapping, ACS will execute the scripts defined in
>>>>>> each one of the Upgrade path classes and the migration code per se.
>>>>>>
>>>>>> Now, let’s say we change the “.sql” scripts name to the pattern I
>>>>>> mentioned, we would have the following scripts; those are the scripts
>>>>>> found that aim to upgrade to versions between the interval 4.0.0 –
>>>>>> 4.3.0 (considering 4.3.0, since that is the goal version):
>>>>>>
>>>>>>
>>>>>> - schema-40to410, can be named to: V_410_A.sql
>>>>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
>>>>>> - schema-410to420, can be named to: V_420_A.sql
>>>>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
>>>>>> - schema-420to421, can be named to: V_421_A.sql
>>>>>> - schema-421to430, can be named to: V_430_A.sql
>>>>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
>>>>>>
>>>>>>
>>>>>> Additionally, all of the java code would have to follow the same
>>>>>> convention. For instance, we have
>>>>>> “com.cloud.upgrade.dao.Upgrade40to41”,
>>>>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea is
>>>>>> to extract that migration code to a Java class named: V_410_C.java,
>>>>>> giving that it has to execute the SQL scripts before the java code.
>>>>>>
>>>>>> In order to go from a smaller version (4.0.0) to an upper one (4.3.0),
>>>>>> we have to run all of the migration routines from intermediate
>>>>>> versions. That is what we are already doing, but we do all of that
>>>>>> manually.
>>>>>>
>>>>>> Bottom line, I think we could simply use the convention
>>>>>> V_<numberOfVersion>_<sequential>.<fileExtension> to name upgrade
>>>>>> routines.
>>>>>> That would facilitate us to use a framework to help us with that
>>>>>> process.
>>>>>> Additionally, I believe that we should always assume that to go from a
>>>>>> smaller version to a higher one, we should run all of the scripts that
>>>>>> exist between them. What do you guys think of that?
>>>>>>
>>>>>> After the bureaucracy, we can discuss tools. If we use that convention
>>>>>> to name migration (upgrade) routines, we can start thinking on tools
>>>>>> to support our migration process. I found two (2) promising ones:
>>>>>> Liquibase and Flywaydb (both seem to be under Apache license, but the
>>>>>> first one has an enterprise version?!). After reading the
>>>>>> documentation and some usage examples I found the flywaydb easier and
>>>>>> simpler to use.
>>>>>>
>>>>>> What are the options of tools that we can use to help us manage the
>>>>>> database upgrade, without needing to code the upgrade path that you
>>>>>> know?
>>>>>>
>>>>>> After that, I think we should decide if we should create another
>>>>>> project/component to take care of migrations, or we can just add the
>>>>>> dependency of the tool to a project such as “cloud-framework-db” and
>>>>>> start using it.
>>>>>>
>>>>>> The “cloud-framework-db” project seems to have a focus on other things
>>>>>> such as managing transactions and generating SQLs from annotations
>>>>>> (?!? That should be a topic for another discussion). Therefore, I
>>>>>> would rather create a new project that has the specific goal of
>>>>>> managing ACS DB upgrades. I would also move all of the routines (SQL
>>>>>> and
>>>>>> Java) to this new project.
>>>>>> This project would be a module of the CloudStack project and it would
>>>>>> execute the upgrade routines at the startup of ACS.
>>>>>>
>>>>>> I believe that going from a homemade solution to one that is more
>>>>>> consolidated and used by other communities would be the way to go.
>>>>>>
>>>>>> I can volunteer myself to create a PR with the aforementioned changes
>>>>>> and using flywaydb to manage our upgrades. However, I prefer to have a
>>>>>> good discussion with other devs first, before starting coding.
>>>>>>
>>>>>> Do you have suggestions or points that should be raised before we
>>>>>> start working on that?
>>>>>>
>>>>>>
>>>>>> This isn't my field of work, so forgive me if this is self explanatory
>>>>> or
>>>>> something, but is there no tool like terraform/puppet or similar for
>>>>> database work?
>>>>> I mean, where you state you desired state and the tool handles it.
>>>>>
>>>>> To me it sounds like a good way would be if you could specify what you
>>>>> want to exist (or not), and how it should look like.
>>>>>
>>>>> "I want table XYZ to exist with THESE columns having THIS type(s) and
>>>>> THIS default value bla bla bla"
>>>>>
>>>>> Rather than handling a bunch of sql scripts that has to handle different
>>>>> mysql versions (come to think about an issue with a mariadb version
>>>>> crashing recently), a variety of cloudstack versions and a whole lot
>>>>> more.
>>>>>
>>>>> Disclaimer: i have no idea if this is what flywaydb does, if it is, then
>>>>> just ignore this.
>>>>>
>>>>> --
>>>>> Erik
>>>>> Find out more about ShapeBlue and our range of CloudStack related
>>>>> services:
>>>>> IaaS Cloud Design & Build<
>>>>> http://shapeblue.com/iaas-cloud-design-and-build//> | CSForge – rapid
>>>>> IaaS deployment framework<http://shapeblue.com/csforge/>
>>>>> CloudStack Consulting<http://shapeblue.com/cloudstack-consultancy/> |
>>>>> CloudStack Software Engineering<
>>>>> http://shapeblue.com/cloudstack-software-engineering/>
>>>>> CloudStack Infrastructure Support<
>>>>> http://shapeblue.com/cloudstack-infrastructure-support/> | CloudStack
>>>>> Bootcamp Training Courses<http://shapeblue.com/cloudstack-training/>
>>>>>
>>>>>
>>>> --
>>>> Ron Wheeler
>>>> President
>>>> Artifact Software Inc
>>>> email: rwheeler@artifact-software.com
>>>> skype: ronaldmwheeler
>>>> phone: 866-970-2435, ext 102
>>>>
>>>>
>>>>
>>>
>>
>> --
>> Ron Wheeler
>> President
>> Artifact Software Inc
>> email: rwheeler@artifact-software.com
>> skype: ronaldmwheeler
>> phone: 866-970-2435, ext 102
>>
>>
>
>
> --
> Rafael Weingärtner

Re: Let’s discuss database upgrades

Posted by Rafael Weingärtner <ra...@gmail.com>.
That is it Ron ;)

Initially, my intention was only to change the technology, moving from a
homemade approach to a better one for managing and running DB upgrade
routines. However, after giving some thought to the point you brought up, I
think that we can use this thread to discuss it too.

To use Flywaydb as we have been discussing so far, we have to adopt a
naming standard such as “YYYYMMDDHHmm”, together with the rules we stated
before. We would also have to link each ACS version to a marker (the
timestamp of the release). That marker could be used to control the upgrade
with Flywaydb: to go from one version to another we have to run all of the
scripts in between them, and the timestamp works as an incremental version
for the upgrade routines.
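
To make the marker idea concrete, here is a minimal sketch in plain Java. It
does not use Flywaydb's API; the class and method names (UpgradePathSketch,
Migration, pendingBetween) and every timestamp in it are invented for
illustration only. It just shows the selection rule: run every routine whose
timestamp is newer than the installed release marker and not newer than the
target release marker, in timestamp order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class UpgradePathSketch {

    /** Hypothetical upgrade routine, identified by a YYYYMMDDHHmm timestamp. */
    public static final class Migration {
        public final long timestamp;
        public final String name;

        public Migration(long timestamp, String name) {
            this.timestamp = timestamp;
            this.name = name;
        }
    }

    /**
     * Selects the routines with installedMarker < timestamp <= targetMarker
     * and returns them ordered by timestamp (oldest first).
     */
    public static List<Migration> pendingBetween(List<Migration> all,
                                                 long installedMarker,
                                                 long targetMarker) {
        List<Migration> pending = new ArrayList<>();
        for (Migration m : all) {
            if (m.timestamp > installedMarker && m.timestamp <= targetMarker) {
                pending.add(m);
            }
        }
        pending.sort(Comparator.comparingLong((Migration m) -> m.timestamp));
        return pending;
    }

    public static void main(String[] args) {
        // Invented routines; the timestamps are not real ACS release dates.
        List<Migration> all = new ArrayList<>();
        all.add(new Migration(201601020900L, "add_column.sql"));
        all.add(new Migration(201511150800L, "create_table.sql"));
        all.add(new Migration(201512201200L, "DataFixup.java"));

        // Invented release markers for the installed and the target version.
        long installedMarker = 201512010000L;
        long targetMarker = 201601310000L;

        for (Migration m : pendingBetween(all, installedMarker, targetMarker)) {
            System.out.println(m.timestamp + " " + m.name);
        }
    }
}
```

With this rule, nobody has to write an upgrade path by hand; the path is
whatever falls between the two markers.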

Additionally, we can have a Maven profile that runs Flywaydb for devs and a
Spring bean that manages upgrades in production environments.
If we have consensus, I am fine with adding a restriction that allows
upgrade routines only in X and X.Y releases, and not in X.Y.Z releases, to
a document that can be used to guide devs and committers.
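
As an illustration of that restriction, a small check could look like the
sketch below. The class and method names are hypothetical, and it assumes
that a version such as "4.7.0" counts as the X.Y release while "4.7.1" is
an X.Y.Z release; that interpretation would have to be agreed on.

```java
public class ReleasePolicySketch {

    /**
     * Returns true for X, X.Y and X.Y.0 style versions, and false as soon
     * as the third (or any later) component is non-zero, e.g. "4.7.1".
     */
    public static boolean mayShipUpgradeRoutines(String version) {
        String[] parts = version.trim().split("\\.");
        for (int i = 2; i < parts.length; i++) {
            if (Integer.parseInt(parts[i]) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"4", "4.7", "4.7.0", "4.7.1"}) {
            System.out.println(v + " -> " + mayShipUpgradeRoutines(v));
        }
    }
}
```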

On Sun, Jan 3, 2016 at 8:16 PM, Ron Wheeler <rw...@artifact-software.com>
wrote:

> On 03/01/2016 7:19 AM, Rafael Weingärtner wrote:
>
>> Sorry the delay on answering your inquiries, during this period of New
>> Year’s Eve I was AFK.
>>
>> Thanks for the contributions of all.
>> I will comment your questions and suggestions as follows:
>>
>> Ron, I understand your point that there are some projects that do not
>> allow
>> database change in minor version releases (schema changes). We could
>> define
>> that as a standard, I do not see a problem on that, as long as we have
>> consensus. What we have to keep in mind is that we could still have
>> scripts
>> that do not change the DB’s schema, but add some data into a table in a
>> minor version.
>>
>
> The main point for me is to make sure that there is a discussion before
> this happens and that a clear understanding of the technology debt that
> this creates is taken into account before it happens.
>
>
>
>> Having said that, we are looking for a way to make the upgrade process
>> smoother,  looking for a way to avoid creating upgrade path manually with
>> scripts such as <currentVersion>to<newerVersion>, because that way we have
>> to cover every single upgrade path manually. We can work that out using a
>> tool to “build and execute” the upgrade path, using a standard to create
>> and name upgrade routines we have been discussing earlier in this thread.
>>
>> Erik, there is a tool to do that. As I mentioned in my previous emails
>> there is a tool called Flywaydb that does exactly what you mentioned.
>> However, that tool will require an improvement in the way we create and
>> name upgrade routines; those changes have been cited and discussed
>> earlier.
>>
>> Paul, about your inquiries:
>> When you say rollback, do you mean downgrade after an upgrade? If so, we
>> have discussed that earlier in this thread and we agreed that we would not
>> cover downgrades, at least for now. The Admin during the upgrade should
>> properly make a copy of his/her database to be restored if a problem
>> happens.
>>
>> About the downtime you mentioned, do you mean the need to stop all of the
>> MS while executing the upgrade?
>> As an administrator of a cloud that is built on top of ACS, I find quite
>> the opposite. If I do not look at the source code, I find the upgrade
>> procedure pretty easy to follow and execute, given that we just need to
>> stop all MS and update them with apt-get.
>> Even if we build a tool as Rohit suggested, the downtime would still
>> exist: while upgrading the database, the old release of the MS would have
>> to be stopped; otherwise we could receive errors caused by the DB schema
>> changes. As I said in
>> some email earlier, I do not find the need to create a new tool that is
>> just a wrapper. I prefer to define a standard to create and name upgrade
>> routines and then use a tool such as Flywaydb directly, which would allow
>> us to manage solely configurations, instead of wrapper code. IMO the less
>> code the better.
>>
>> Paul and Remi, now with Remi’s explanation I understand what you meant
>> by “downtime”. As Remi said, the other stack is far worse to upgrade.
>> OpenStack has a tool similar to the suggested “Chimp” that seems to cover
>> rollbacks. However, I found their upgrade process worse than ours.
>>
>> We are discussing DB upgrade routines here. I understand that the
>> upgrade as a whole also needs to cover aspects such as the SystemVMs
>> upgrade. However, I think that point should and can be discussed in a
>> separate thread, since it concerns a different part of the ACS source
>> code.
>>
>> About reverting an upgrade, I do not find it hard at all; it is basically
>> restoring the DBs “cloud” and “cloud_usage” to their state prior to the
>> upgrade (given that the ACS upgrade page states that you should back up
>> your databases). Maybe because I am a developer, I do not see much of a
>> problem with that.
>>
>> Bottom line:
>>
>> There is a tool that can help us with upgrade routines for DB, what we
>> need
>> is a consensus on how to create and name upgrade routines and the tool
>> that
>> we can use to build and execute the upgrade path. I think we all agreed
>> with the standards we had discussed earlier.
>>
>> Can I create a page in the ACS wiki formalizing the points we discussed
>> here regarding ACS DB upgrade routines?
>> I tried to create a child page in
>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Developers, but it
>> seems that I do not have permission. After that, I can start working on a
>> PR to add Flywaydb to ACS.
>>
>>
>> On Wed, Dec 30, 2015 at 2:41 PM, Ron Wheeler <
>> rwheeler@artifact-software.com
>>
>>> wrote:
>>> On 30/12/2015 4:58 AM, Remi Bergsma wrote:
>>>
>>> Hoi Paul,
>>>>
>>>> Agree that the user perspective is important, thanks for bringing that
>>>> up.
>>>>
>>>> It is also worth pointing out that once you get into the SMB space, the
>>> system admin may wear a few hats and is not dedicated full time to
>>> maintaining Cloudstack.
>>> If it works most of the time the way it is supposed to, the admin is not
>>> spending any time working with the guts of Cloudstack.
>>> Once it is up and running, the skills and knowledge will decay pretty
>>> quickly.
>>> There is a need for an upgrade that works reliably and has good tests
>>> that
>>> can be quickly tried to see that the upgrade has worked or needs to be
>>> reverted.
>>>
>>>
>>> Remember that the other “Stack” is far worse in upgrades, so it’s all
>>>> about perspective.
>>>>
>>>> I guess being the second worst stack is comforting in some way. :-)
>>>
>>>    Having said that, I also want it to be smooth and we absolutely need
>>>> it
>>>> to be outside of the main repo and able to rollback if stuff goes wrong
>>>> (so
>>>> users can retry).
>>>>
>>>> The biggest other issue I see in upgrading is the systemvm replacement
>>>> and having to reboot (100s or 1000s of routers). That’s where your real
>>>> downtime is most of the time.
>>>>
>>>> If you have done all that and have to revert, it is not very comforting
>>> to
>>> know that most of the time you wasted was spent in a fairly stable
>>> process
>>> and that the downtime can be chalked up to the size of the server
>>> population. The users will be happy with that, I suppose.
>>>
>>> Although upgrading from 4.6 to 4.7 takes under 5 minutes (stop ACS,
>>>> replace RPM and start it again) and no systemvm template needed to be
>>>> replaced. That’s more like it already ;-)
>>>>
>>>> That sounds more like what I need!
>>>
>>>
>>>
>>>
>>> Regards,
>>>> Remi
>>>>
>>>>
>>>> From: Paul Angus <paul.angus@shapeblue.com<mailto:
>>>> paul.angus@shapeblue.com>>
>>>> Reply-To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>"
>>>> <
>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>>>> Date: Wednesday 30 December 2015 10:10
>>>> To: "dev@cloudstack.apache.org<ma...@cloudstack.apache.org>" <
>>>> dev@cloudstack.apache.org<ma...@cloudstack.apache.org>>
>>>> Subject: RE: Let’s discuss database upgrades
>>>>
>>>> Hi Guys, from the user's perspective, there are two points which come up
>>>> again and again -
>>>>
>>>> 1. lack a prescribed roll back if an upgrade goes badly
>>>> 2. The downtime involved in doing upgrades.
>>>>
>>>> - Upgrades are seen as CloudStack's biggest 'issue'.
>>>>
>>>> I've had to rescue enough upgrades to understand how complicated it is;
>>>> however with the increased release velocity, the admin's experience of
>>>> doing these upgrades needs to be taken into account or we will lose
>>>> users
>>>> because of the increased admin overhead and downtime.
>>>>
>>>> The purpose of Rohit's CloudChimp was to find a suitable tool/method to
>>>> carry out schema changes *without downtime*. You guys are far better
>>>> placed
>>>> to argue the merits of any one solution than me.
>>>>
>>>> I would just ask that you keep in mind what the users are looking for -
>>>> relatively clean and recoverable upgrade process.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Erik Weber [mailto:terbolous@gmail.com]
>>>> Sent: 29 December 2015 21:45
>>>> To: dev <de...@cloudstack.apache.org>>
>>>> Subject: Re: Let’s discuss database upgrades
>>>>
>>>> On Mon, Dec 28, 2015 at 2:16 PM, Rafael Weingärtner <
>>>> rafaelweingartner@gmail.com<ma...@gmail.com>> wrote:
>>>>
>>>> Hi all devs,
>>>>
>>>>> First of all, sorry the long text, but I hope we can start a
>>>>> discussion here and improve that part of ACS.
>>>>>
>>>>> A while ago I have faced the code that Apache CloudStack (ACS) uses to
>>>>> upgrade from a version to newer one and that did not seem to be a good
>>>>> way to execute our upgrades. Therefore, I decided to use some time to
>>>>> search for alternatives.
>>>>>
>>>>> I have read some material about versioning of scripts used to upgrade
>>>>> a database (DB) of a system and went through some frameworks that
>>>>> could help us.
>>>>>
>>>>> In the literature of software engineering, it is firmly stated that we
>>>>> have to version DB scripts as we do with the source code of the
>>>>> application, using the baseline approach. Gladly, we were not that bad
>>>>> at this point, we already versioned our routines for DB upgrade (.sql
>>>>> and .java). Therefore, it seemed that we just did not have used a
>>>>> practical approach to help us during DB upgrades.
>>>>>
>>>>>   From my readings and looking at the ACS source code I raised the
>>>>> following
>>>>> requirement:
>>>>> • We should be able to write more than one routine to upgrade to a
>>>>> version; those routines can be written in Java and SQL. We might have
>>>>> more than a routine to be executed for each version and we should be
>>>>> able to define an order of execution. Additionally, to go to an upper
>>>>> version, we have to run all of the routines from smaller versions
>>>>> first, until we achieve the desired version.
>>>>>
>>>>> We could also add another requirement that is the downgrade from a
>>>>> version, which we currently do not support. With that comes my first
>>>>> question for
>>>>> discussion:
>>>>> • Do we want/need a method to downgrade from a version to a previous
>>>>> one?
>>>>>
>>>>> I found an explanation for not supporting downgrades, and I liked it:
>>>>> http://flywaydb.org/documentation/faq.html#downgrade
>>>>>
>>>>> So, what I devised for us:
>>>>> First the bureaucracy part - our migrations occur basically in three
>>>>> (3) steps, first we have a "prepare script", then a cleanup script and
>>>>> finally the migration per se that is written in Java, at least, that
>>>>> is what we can expect when reading the interface
>>>>> “com.cloud.upgrade.dao.DbUpgrade”.
>>>>>
>>>>> Additionally, our scripts have the following naming convention:
>>>>> schema-<currentVersion>to<desiredVersion>, which IMHO may cause
>>>>> some confusion because at first sight we may think that from the same
>>>>> version we could have different paths to an upper version, which in
>>>>> practice is not happening. Instead of <currentVersion>to<version> we
>>>>> could simply use V_<numberOfVersion>_<sequential>.<fileExtension>,
>>>>> given that we have to execute all of the V_<version> scripts up to
>>>>> and including the version we want to upgrade to.
>>>>>
>>>>> To clarify what I am saying, I will use an example. Let’s say we have
>>>>> just installed ACS and ran the cloudstack-setup-database. That command
>>>>> will create a database schema in version 4.0.0. To upgrade that schema
>>>>> to version 4.3.0 (it is just an example, it could be any other
>>>>> version), ACS will use the following mapping:
>>>>>
>>>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
>>>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
>>>>>
>>>>> After loading the mapping, ACS will execute the scripts defined in
>>>>> each one of the Upgrade path classes and the migration code per se.
>>>>>
>>>>> Now, let’s say we change the “.sql” scripts name to the pattern I
>>>>> mentioned, we would have the following scripts; those are the scripts
>>>>> found that aim to upgrade to versions between the interval 4.0.0 –
>>>>> 4.3.0 (considering 4.3.0, since that is the goal version):
>>>>>
>>>>>
>>>>> - schema-40to410, can be named to: V_410_A.sql
>>>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
>>>>> - schema-410to420, can be named to: V_420_A.sql
>>>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
>>>>> - schema-420to421, can be named to: V_421_A.sql
>>>>> - schema-421to430, can be named to: V_430_A.sql
>>>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
>>>>>
>>>>>
>>>>> Additionally, all of the java code would have to follow the same
>>>>> convention. For instance, we have
>>>>> “com.cloud.upgrade.dao.Upgrade40to41”,
>>>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea is
>>>>> to extract that migration code to a Java class named: V_410_C.java,
>>>>> giving that it has to execute the SQL scripts before the java code.
>>>>>
>>>>> In order to go from a smaller version (4.0.0) to an upper one (4.3.0),
>>>>> we have to run all of the migration routines from intermediate
>>>>> versions. That is what we are already doing, but we do all of that
>>>>> manually.
>>>>>
>>>>> Bottom line, I think we could simply use the convention
>>>>> V_<numberOfVersion>_<sequencial>.<fileExtension> to name upgrade
>>>>> routines.
>>>>> That would make it easier for us to use a framework to help with
>>>>> that process.
>>>>> Additionally, I believe that we should always assume that to go from a
>>>>> smaller version to a higher one, we should run all of the scripts that
>>>>> exist between them. What do you guys think of that?
>>>>>
>>>>> After the bureaucracy, we can discuss tools. If we use that convention
>>>>> to name migration (upgrade) routines, we can start thinking about
>>>>> tools to support our migration process. I found two (2) promising
>>>>> ones: Liquibase and Flywaydb (both seem to be under the Apache license,
>>>>> but the first one has an enterprise version?!). After reading the
>>>>> documentation and some usage examples, I found Flywaydb easier and
>>>>> simpler to use.
>>>>>
>>>>> Which other tools do you know of that could help us manage the
>>>>> database upgrade without needing to code the upgrade path?
>>>>>
>>>>> After that, I think we should decide if we should create another
>>>>> project/component to take care of migrations, or we can just add the
>>>>> dependency of the tool to a project such as “cloud-framework-db” and
>>>>> start using it.
>>>>>
>>>>> The “cloud-framework-db” project seems to have a focus on other things
>>>>> such as managing transactions and generating SQLs from annotations
>>>>> (?!? That should be a topic for another discussion). Therefore, I
>>>>> would rather create a new project that has the specific goal of
>>>>> managing ACS DB upgrades. I would also move all of the routines (SQL
>>>>> and
>>>>> Java) to this new project.
>>>>> This project would be a module of the CloudStack project and it would
>>>>> execute the upgrade routines at the startup of ACS.
>>>>>
>>>>> I believe that going from a homemade solution to one that is more
>>>>> consolidated and used by other communities would be the way to go.
>>>>>
>>>>> I can volunteer myself to create a PR with the aforementioned changes
>>>>> and using flywaydb to manage our upgrades. However, I prefer to have a
>>>>> good discussion with other devs first, before starting coding.
>>>>>
>>>>> Do you have suggestions or points that should be raised before we
>>>>> start working on that?
>>>>>
>>>>>
>>>> This isn't my field of work, so forgive me if this is self explanatory or
>>>> something, but is there no tool like terraform/puppet or similar for
>>>> database work?
>>>> I mean, where you state your desired state and the tool handles it.
>>>>
>>>> To me it sounds like a good way would be if you could specify what you
>>>> want to exist (or not), and what it should look like.
>>>>
>>>> "I want table XYZ to exist with THESE columns having THIS type(s) and
>>>> THIS default value bla bla bla"
>>>>
>>>> Rather than handling a bunch of sql scripts that have to handle different
>>>> mysql versions (come to think of it, there was an issue with a mariadb
>>>> version crashing recently), a variety of cloudstack versions and a whole
>>>> lot more.
>>>>
>>>> Disclaimer: I have no idea if this is what flywaydb does; if it is, then
>>>> just ignore this.
>>>>
>>>> --
>>>> Erik
>>>>
>>>>
>>> --
>>> Ron Wheeler
>>> President
>>> Artifact Software Inc
>>> email: rwheeler@artifact-software.com
>>> skype: ronaldmwheeler
>>> phone: 866-970-2435, ext 102
>>>
>>>
>>>
>>
>


-- 
Rafael Weingärtner

Re: Let’s discuss database upgrades

Posted by Ron Wheeler <rw...@artifact-software.com>.
On 03/01/2016 7:19 AM, Rafael Weingärtner wrote:
> Sorry for the delay in answering your inquiries; I was AFK over the New
> Year period.
>
> Thanks to everyone for the contributions.
> I will comment on your questions and suggestions as follows:
>
> Ron, I understand your point that there are some projects that do not allow
> database changes (schema changes) in minor version releases. We could define
> that as a standard; I do not see a problem with that, as long as we have
> consensus. What we have to keep in mind is that we could still have scripts
> that do not change the DB’s schema, but add some data into a table in a minor
> version.

The main point for me is to make sure that there is a discussion before
this happens and that the technical debt it creates is clearly understood
and taken into account.

>
> Having said that, we are looking for a way to make the upgrade process
> smoother and to avoid creating upgrade paths manually with scripts such as
> <currentVersion>to<newerVersion>, because that way we have to cover every
> single upgrade path by hand. We can work that out by using a tool to “build
> and execute” the upgrade path, together with the standard to create and name
> upgrade routines that we have been discussing earlier in this thread.
>
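
A minimal sketch of what "building" the upgrade path could look like,
assuming each routine only declares the single step it covers. The names
UpgradePathSketch, Step and buildPath are hypothetical and are not existing
ACS or Flywaydb code; the versions are the ones from the 4.0.0 to 4.3.0
example quoted further down.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: derive the upgrade path from single-step upgrades
// instead of listing the whole path by hand for every starting version.
final class UpgradePathSketch {

    /** One upgrade step; "from" and "to" are plain version strings. */
    static final class Step {
        final String from;
        final String to;

        Step(String from, String to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return from + "->" + to;
        }
    }

    /** Walks the chain of steps from the current version to the target. */
    static List<Step> buildPath(List<Step> steps, String current, String target) {
        Map<String, Step> byFrom = new HashMap<>();
        for (Step step : steps) {
            byFrom.put(step.from, step);
        }
        List<Step> path = new ArrayList<>();
        String version = current;
        while (!version.equals(target)) {
            Step next = byFrom.get(version);
            // The size check guards against cycles in the declared steps.
            if (next == null || path.size() >= steps.size()) {
                throw new IllegalStateException(
                        "No upgrade path from " + current + " to " + target);
            }
            path.add(next);
            version = next.to;
        }
        return path;
    }

    public static void main(String[] args) {
        List<Step> steps = new ArrayList<>();
        steps.add(new Step("4.0.0", "4.1.0"));
        steps.add(new Step("4.1.0", "4.2.0"));
        steps.add(new Step("4.2.0", "4.2.1"));
        steps.add(new Step("4.2.1", "4.3.0"));
        System.out.println(buildPath(steps, "4.0.0", "4.3.0"));
        System.out.println(buildPath(steps, "4.2.0", "4.3.0"));
    }
}
```

With this shape, a new release only adds one step, and the path from any
older version is derived rather than written out per starting version.
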
> Erik, there is a tool to do that. As I mentioned in my previous emails
> there is a tool called Flywaydb that does exactly what you mentioned.
> However, that tool will require an improvement in the way we create and
> name upgrade routines; those changes have been cited and discussed earlier.
>
> Paul, about your inquiries:
> When you say rollback, do you mean downgrade after an upgrade? If so, we
> have discussed that earlier in this thread and we agreed that we would not
> cover downgrades, at least for now. The Admin during the upgrade should
> properly make a copy of his/her database to be restored if a problem
> happens.
>
> About the downtime you mentioned, do you mean the need to stop all of the
> MS while executing the upgrade?
> As the administrator of a cloud built on top of ACS, my experience is quite
> the opposite of yours. If I do not look at the source code, I find the upgrade
> procedure pretty easy to follow and execute, given that we just need to
> stop all MS and update them with apt-get.
> Even if we build a tool as Rohit suggested, the downtime would still exist:
> while the database is being upgraded, the old release of the MS would have to
> be stopped, otherwise we could get errors caused by the DB schema changes. As
> I said in an earlier email, I do not see the need to create a new tool that is
> just a wrapper. I prefer to define a standard to create and name upgrade
> routines and then use a tool such as Flywaydb directly, which would allow
> us to manage solely configuration, instead of wrapper code. IMO the less
> code the better.
>
> Paul and Remi, now with Remi’s explanation I understand what you meant by
> “downtime”. As Remi said, the other stack is far worse to upgrade.
> OpenStack has a tool, similar to the suggested “Chimp”, that seems to cover
> rollbacks. However, I found their upgrade process worse than ours.
>
> We are discussing DB upgrade routines here. I understand that the upgrade
> problem as a whole needs to cover aspects such as the SystemVM upgrade.
> However, I think that point can and should be discussed in a separate
> thread, since it concerns a different part of the ACS source code.
>
> About reverting an upgrade, I do not find it hard at all; it is basically
> restoring the DBs “cloud” and “cloud_usage” to their state prior to the
> upgrade (given that the ACS upgrade page states that you should
> back up your databases). Maybe because I am a developer, I do not see much
> of a problem with that.
>
> Bottom line:
>
> There is a tool that can help us with DB upgrade routines. What we need
> is a consensus on how to create and name upgrade routines and on the tool
> that we can use to build and execute the upgrade path. I think we all agreed
> with the standards we had discussed earlier.
>
> Can I create a page in the ACS wiki formalizing the points we discussed
> here in regards to ACS DB’s upgrade routines?
> I tried to create a child page in
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Developers, but it
> seems that I do not have permission. After that, I can start working on a
> PR to add flywaydb to ACS.
>
>
> On Wed, Dec 30, 2015 at 2:41 PM, Ron Wheeler <rwheeler@artifact-software.com> wrote:
>> On 30/12/2015 4:58 AM, Remi Bergsma wrote:
>>
>>> Hoi Paul,
>>>
>>> Agree that the user perspective is important, thanks for bringing that up.
>>>
>> It is also worth pointing out that once you get into the SMB space, the
>> system admin may wear a few hats and is not dedicated full time to
>> maintaining Cloudstack.
>> If it works most of the time the way it is supposed to, the admin is not
>> spending any time working with the guts of Cloudstack.
>> Once it is up and running, the skills and knowledge will decay pretty
>> quickly.
>> There is a need for an upgrade that works reliably and has good tests that
>> can be quickly tried to see that the upgrade has worked or needs to be
>> reverted.
>>
>>
>>> Remember that the other “Stack” is far worse in upgrades, so it’s all
>>> about perspective.
>>>
>> I guess being the second worst stack is comforting in some way. :-)
>>
>>>    Having said that, I also want it to be smooth and we absolutely need it
>>> to be outside of the main repo and able to rollback if stuff goes wrong (so
>>> users can retry).
>>>
>>> The biggest other issue I see in upgrading is the systemvm replacement
>>> and having to reboot (100s or 1000s of routers). That’s where your real
>>> downtime is most of the time.
>>>
>> If you have done all that and have to revert, it is not very comforting to
>> know that most of the time you wasted was spent in a fairly stable process
>> and that the downtime can be chalked up to the size of the server
>> population. The users will be happy with that, I suppose.
>>
>>> Although upgrading from 4.6 to 4.7 takes under 5 minutes (stop ACS,
>>> replace RPM and start it again) and no systemvm template needed to be
>>> replaced. That’s more like it already ;-)
>>>
>> That sounds more like what I need!
>>
>>
>>
>>
>>> Regards,
>>> Remi
>>>
>>>
>>> From: Paul Angus <paul.angus@shapeblue.com>
>>> Reply-To: "dev@cloudstack.apache.org" <dev@cloudstack.apache.org>
>>> Date: Wednesday 30 December 2015 10:10
>>> To: "dev@cloudstack.apache.org" <dev@cloudstack.apache.org>
>>> Subject: RE: Let’s discuss database upgrades
>>>
>>> Hi Guys, from the user's perspective, there are two points which come up
>>> again and again -
>>>
>>> 1. The lack of a prescribed rollback if an upgrade goes badly
>>> 2. The downtime involved in doing upgrades.
>>>
>>> - Upgrades are seen as CloudStack's biggest 'issue'.
>>>
>>> I've had to rescue enough upgrades to understand how complicated it is;
>>> however with the increased release velocity, the admin's experience of
>>> doing these upgrades needs to be taken into account or we will lose users
>>> because of the increased admin overhead and downtime.
>>>
>>> The purpose of Rohit's CloudChimp was to find a suitable tool/method to
>>> carry out schema changes *without downtime*. You guys are far better placed
>>> to argue the merits of any one solution than me.
>>>
>>> I would just ask that you keep in mind what the users are looking for -
>>> a relatively clean and recoverable upgrade process.
>>>
>>>
>>>
>>>
>>> Paul Angus
>>> VP Technology, ShapeBlue
>>>
>>> -----Original Message-----
>>> From: Erik Weber [mailto:terbolous@gmail.com]
>>> Sent: 29 December 2015 21:45
>>> To: dev <de...@cloudstack.apache.org>>
>>> Subject: Re: Let’s discuss database upgrades
>>>
>>> On Mon, Dec 28, 2015 at 2:16 PM, Rafael Weingärtner
>>> <rafaelweingartner@gmail.com> wrote:
>>>
>>>> Hi all devs,
>>>> First of all, sorry for the long text, but I hope we can start a
>>>> discussion here and improve that part of ACS.
>>>>
>>>> A while ago I came across the code that Apache CloudStack (ACS) uses to
>>>> upgrade from one version to a newer one, and it did not seem to be a good
>>>> way to execute our upgrades. Therefore, I decided to spend some time
>>>> searching for alternatives.
>>>>
>>>> I have read some material about versioning of scripts used to upgrade
>>>> a database (DB) of a system and went through some frameworks that
>>>> could help us.
>>>>
>>>> In the software engineering literature, it is firmly stated that we
>>>> have to version DB scripts as we do with the source code of the
>>>> application, using the baseline approach. Fortunately, we are not doing
>>>> badly on this point: we already version our DB upgrade routines (.sql
>>>> and .java). Therefore, it seems that we are just missing a
>>>> practical approach to help us during DB upgrades.
>>>>
>>>> From my readings and from looking at the ACS source code, I derived the
>>>> following requirement:
>>>> • We should be able to write more than one routine to upgrade to a
>>>> version; those routines can be written in Java and SQL. We might have
>>>> more than one routine to be executed for each version and we should be
>>>> able to define an order of execution. Additionally, to go to an upper
>>>> version, we have to run all of the routines from smaller versions
>>>> first, until we achieve the desired version.
>>>>
>>>> We could also add another requirement, the downgrade from a
>>>> version, which we currently do not support. With that comes my first
>>>> question for discussion:
>>>> • Do we want/need a method to downgrade from a version to a previous
>>>> one?
>>>>
>>>> I found an explanation for not supporting downgrades, and I liked it:
>>>> http://flywaydb.org/documentation/faq.html#downgrade
>>>>
>>>> So, what I devised for us:
>>>> First the bureaucracy part - our migrations occur basically in three
>>>> (3) steps, first we have a "prepare script", then a cleanup script and
>>>> finally the migration per se that is written in Java, at least, that
>>>> is what we can expect when reading the interface
>>>> “com.cloud.upgrade.dao.DbUpgrade”.
>>>>
>>>> Additionally, our scripts have the following naming convention:
>>>> schema-<currentVersion>to<desiredVersion>, which IMHO may cause
>>>> some confusion because at first sight we may think that from the same
>>>> version we could have different paths to an upper version, which in
>>>> practice is not happening. Instead of <currentVersion>to<version>, we
>>>> could simply use V_<numberOfVersion>_<sequencial>.<fileExtension>,
>>>> given that we have to execute all of the V_<version> scripts that
>>>> are smaller than the version we want to upgrade.
>>>>
>>>> To clarify what I am saying, I will use an example. Let’s say we have
>>>> just installed ACS and ran the cloudstack-setup-database. That command
>>>> will create a database schema in version 4.0.0. To upgrade that schema
>>>> to version 4.3.0 (it is just an example, it could be any other
>>>> version), ACS will use the following mapping:
>>>>
>>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
>>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
>>>>
>>>> After loading the mapping, ACS will execute the scripts defined in
>>>> each one of the Upgrade path classes and the migration code per se.
>>>>
>>>> Now, let’s say we rename the “.sql” scripts to the pattern I
>>>> mentioned. We would then have the following scripts, which are the
>>>> ones that upgrade to versions in the interval 4.0.0 – 4.3.0
>>>> (including 4.3.0, since that is the goal version):
>>>>
>>>>
>>>> - schema-40to410, can be named to: V_410_A.sql
>>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
>>>> - schema-410to420, can be named to: V_420_A.sql
>>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
>>>> - schema-420to421, can be named to: V_421_A.sql
>>>> - schema-421to430, can be named to: V_430_A.sql
>>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
>>>>
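
The proposed naming could be handled roughly as follows. This is only a
sketch under assumptions: UpgradeRoutineOrdering and its methods are
hypothetical names (not ACS or Flywaydb code), versions are compared as plain
integers (so 410 < 421 < 430, which would need revisiting for something like
4.10.0), and the sequence is a single letter compared case-insensitively.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of ACS or Flywaydb.
final class UpgradeRoutineOrdering {

    // V_<numberOfVersion>_<sequencial>.<fileExtension>, e.g. V_410_A.sql
    private static final Pattern NAME =
            Pattern.compile("V_(\\d+)_([A-Za-z])\\.(sql|java)");

    static final class Routine {
        private final int version;
        private final String sequence;
        private final String fileName;

        Routine(int version, String sequence, String fileName) {
            this.version = version;
            this.sequence = sequence;
            this.fileName = fileName;
        }

        int version() { return version; }
        String sequence() { return sequence; }
        String fileName() { return fileName; }
    }

    /** Splits a routine name into version and sequence; rejects other names. */
    static Routine parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an upgrade routine: " + fileName);
        }
        return new Routine(Integer.parseInt(m.group(1)),
                m.group(2).toUpperCase(Locale.ROOT), fileName);
    }

    /** Routines with current < version <= target, by version then sequence. */
    static List<String> upgradePath(List<String> fileNames,
                                    int currentVersion, int targetVersion) {
        return fileNames.stream()
                .map(UpgradeRoutineOrdering::parse)
                .filter(r -> r.version() > currentVersion
                        && r.version() <= targetVersion)
                .sorted(Comparator.comparingInt(Routine::version)
                        .thenComparing(Routine::sequence))
                .map(Routine::fileName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
                "V_430_B.sql", "V_410_C.java", "V_420_A.sql", "V_410_A.sql",
                "V_440_A.sql", "V_421_A.sql", "V_410_B.sql", "V_430_A.sql",
                "V_420_B.sql");
        // Upgrade a 4.0.0 schema to 4.3.0; V_440_A.sql is left out.
        System.out.println(upgradePath(found, 400, 430));
    }
}
```

The point of the sketch is that nothing per starting version has to be
maintained: the execution order falls out of the file names.
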
>>>>
>>>> Additionally, all of the Java code would have to follow the same
>>>> convention. For instance, we have
>>>> “com.cloud.upgrade.dao.Upgrade40to41”,
>>>> which has some Java code to migrate from 4.0.0 to 4.1.0. The idea is
>>>> to extract that migration code to a Java class named V_410_C.java,
>>>> given that the SQL scripts have to be executed before the Java code.
>>>>
>>>> In order to go from a smaller version (4.0.0) to an upper one (4.3.0),
>>>> we have to run all of the migration routines from intermediate
>>>> versions. That is what we are already doing, but we do all of that
>>>> manually.
>>>>
>>>> Bottom line, I think we could simply use the convention
>>>> V_<numberOfVersion>_<sequencial>.<fileExtension> to name upgrade
>>>> routines.
>>>> That would make it easier for us to use a framework to help with that
>>>> process.
>>>> Additionally, I believe that we should always assume that to go from a
>>>> smaller version to a higher one, we should run all of the scripts that
>>>> exist between them. What do you guys think of that?
>>>>
>>>> After the bureaucracy, we can discuss tools. If we use that convention
>>>> to name migration (upgrade) routines, we can start thinking about
>>>> tools to support our migration process. I found two (2) promising
>>>> ones: Liquibase and Flywaydb (both seem to be under the Apache license,
>>>> but the first one has an enterprise version?!). After reading the
>>>> documentation and some usage examples, I found Flywaydb easier and
>>>> simpler to use.
>>>>
>>>> Which other tools do you know of that could help us manage the
>>>> database upgrade without needing to code the upgrade path?
>>>>
>>>> After that, I think we should decide if we should create another
>>>> project/component to take care of migrations, or we can just add the
>>>> dependency of the tool to a project such as “cloud-framework-db” and
>>>> start using it.
>>>>
>>>> The “cloud-framework-db” project seems to have a focus on other things
>>>> such as managing transactions and generating SQLs from annotations
>>>> (?!? That should be a topic for another discussion). Therefore, I
>>>> would rather create a new project that has the specific goal of
>>>> managing ACS DB upgrades. I would also move all of the routines (SQL and
>>>> Java) to this new project.
>>>> This project would be a module of the CloudStack project and it would
>>>> execute the upgrade routines at the startup of ACS.
>>>>
>>>> I believe that going from a homemade solution to one that is more
>>>> consolidated and used by other communities would be the way to go.
>>>>
>>>> I can volunteer myself to create a PR with the aforementioned changes
>>>> and using flywaydb to manage our upgrades. However, I prefer to have a
>>>> good discussion with other devs first, before starting coding.
>>>>
>>>> Do you have suggestions or points that should be raised before we
>>>> start working on that?
>>>>
>>>>
>>> This isn't my field of work, so forgive me if this is self explanatory or
>>> something, but is there no tool like terraform/puppet or similar for
>>> database work?
>>> I mean, where you state your desired state and the tool handles it.
>>>
>>> To me it sounds like a good way would be if you could specify what you
>>> want to exist (or not), and what it should look like.
>>>
>>> "I want table XYZ to exist with THESE columns having THIS type(s) and
>>> THIS default value bla bla bla"
>>>
>>> Rather than handling a bunch of sql scripts that have to handle different
>>> mysql versions (come to think of it, there was an issue with a mariadb
>>> version crashing recently), a variety of cloudstack versions and a whole
>>> lot more.
>>>
>>> Disclaimer: I have no idea if this is what flywaydb does; if it is, then
>>> just ignore this.
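
To make the "desired state" idea concrete, here is a minimal sketch of what
such a tool would have to do for one table: compare the declared columns with
the existing ones and derive the statements. Everything here is hypothetical
(DesiredStateSketch, plan, the table "xyz" and its columns); it only covers
column names and types, whereas a real tool would also handle defaults,
indexes, constraints and data. Note that Flyway works differently: it applies
versioned migration scripts in order rather than diffing a declared schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a declarative (desired-state) schema comparison.
final class DesiredStateSketch {

    /** Compares desired and actual columns (name -> type) of one table. */
    static List<String> plan(String table,
                             Map<String, String> desired,
                             Map<String, String> actual) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, String> column : desired.entrySet()) {
            String actualType = actual.get(column.getKey());
            if (actualType == null) {
                // Column is declared but does not exist yet.
                statements.add("ALTER TABLE " + table + " ADD COLUMN "
                        + column.getKey() + " " + column.getValue());
            } else if (!actualType.equalsIgnoreCase(column.getValue())) {
                // Column exists with a different type (MySQL syntax).
                statements.add("ALTER TABLE " + table + " MODIFY COLUMN "
                        + column.getKey() + " " + column.getValue());
            }
        }
        for (String name : actual.keySet()) {
            if (!desired.containsKey(name)) {
                // Column exists but is no longer declared.
                statements.add("ALTER TABLE " + table + " DROP COLUMN " + name);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        Map<String, String> desired = new LinkedHashMap<>();
        desired.put("id", "BIGINT");
        desired.put("name", "VARCHAR(255)");
        desired.put("state", "VARCHAR(32)");

        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("id", "BIGINT");
        actual.put("name", "VARCHAR(64)");
        actual.put("legacy_flag", "TINYINT");

        for (String statement : plan("xyz", desired, actual)) {
            System.out.println(statement);
        }
    }
}
```
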
>>>
>>> --
>>> Erik
>>>
>>
>


-- 
Ron Wheeler
President
Artifact Software Inc
email: rwheeler@artifact-software.com
skype: ronaldmwheeler
phone: 866-970-2435, ext 102