Posted to dev@openoffice.apache.org by Terry Ellison <te...@apache.org> on 2011/08/25 17:58:40 UTC
[migration] Making the forums and wiki cut-over
Sorry in advance. I've tried to put this type of content on the cwiki
and it has largely been ignored by the DL, hence this email. It is a
long one, so I have tried to structure it and keep to simple plain text
(as the DL forwarder seems to screw up text/html markup).
Since this is so long, any reply threads could kill us. So can I ask
respondents to open up a new [migration] thread to discuss specific
points in depth rather than replying directly to this.
*BACKGROUND*
As you all know, I've been doing work on the forums and wiki, though the
wiki has been my main focus recently. I am acting in two roles here (i)
the lead SysAdmin for the ooo-wiki and ooo-forum VMs; (ii) the lead (and
currently only) application maintainer on both systems. I also have
ssh access to the current prod systems running in Oracle infrastructure
and do equivalent roles there. So in practice, I am doing all of this
related work, including liaising with the project, the infrastructure
team and with Andrew Rist who represents Oracle here.
I understandably have to work within the practices of the Apache
infrastructure team to retain my permit to act as SysAdmin on these two
VMs. I am also trying to meet the expectations of the project, the
infrastructure teams and the needs of our user population across all of
this work.
We seem to have a Catch-22 here, and this email is about how we break
this and move these aspects of the project forward. My interpretation
of this Catch-22 is that whilst our current interactions on the DL are a
good basis for individuals articulating views on a particular thread
(and some seem to generate hundreds of viewpoints) we have no
functioning mechanism to move to, and adopt some form of, a consensus
policy or decision. The exact project requirements for this wiki, these
forums and the xxxx@openoffice.org mail forwarders are cases in point.
However, the infrastructure team believe that we, the project, have an
urgency about making this cut over from Oracle to Apache infrastructure,
and are pressing me to make progress.
I can't execute any plan without a baseline requirement and set of
assumptions, so what this note attempts is to lay down such a set, and
the decisions that need to be made to go forward. So PLEASE, I don't
want any flames about my use of DECISION below. What I simply mean is
that if the PPMC as a body accepts these, then I will try my best to move
this work forward. Of course you are free to challenge / change any of
this if that is a PPMC voted decision, but in this case I need to move
into a different mode; to suspend work and stop the clock until we have
a PPMC-endorsed baseline to replan on. I am NOT going to press on
without broad endorsement and then be criticised in retrospect for doing
so.
*INFRASTRUCTURE DRIVERS*
The infrastructure team has a policy of bringing in new services at
current S/W versions whenever possible -- simply because it makes it
easier to support them, and doing this before the service is on-line
involves less work and risk than when it is in production. I understand
and agree with this goal even though it can front-load work.
* The infrastructure stack is based on a standard Ubuntu server LAMP
stack as at current LTS (Ubuntu 10.04-3 LTS), which includes PHP 5.3.2
* The forums are stable, but at an N-1 release level. (phpBB 3.0.8
vs. 3.0.9).
* *DECISION*: Upgrade the ooo-forums phpBB app + customisations to 3.0.9
before go-live. (Based on my last 5 upgrades, this is 1-2 days' work, the
main part being the regression of a 1K line customisation patch when we
rebaseline the package from 3.0.8 -> 3.0.9)
* The prod wiki is at v1.15.1, which is an N-3 major release level (that's
30 months old: two major and 10 minor revisions behind the current
supported release). This also runs on PHP 5.2.0.
* We need a reverse-proxy HTTP cache for performance reasons on the
wiki. One of the four market leaders in this niche is another Apache
project: Apache Traffic Server (ATS). It makes sense to stay "in-house"
here for both support and referenceability reasons.
* *DECISION*: Adopt ATS v3.0.1 as the HTTP cache for the wiki. (BTW,
this work has been done and the product is excellent).
PHP 5.3 introduced extra checking to remove an area of tolerance that
PHP 5.x (x<3) allowed. This was to do with when and how parameters can be
passed by reference under certain circumstances. So moving a code base
from 5.2 to 5.3 involves a lot of work identifying and eliminating these
mis-codings. This was done by the MediaWiki team in MW v1.16. I had
planned to move to MW v1.15.5 (the last stable 1.15.x) as our baseline
and I've done this work integrating it with Apache Traffic Server (ATS)
and our LAMP stack. This is stable and performant enough to show that
we are good. However, I have only identified and bug-fixed the main
path 5.2->5.3 coding issues. During my testing I have subsequently
discovered others and there are undoubtedly more to find. I've also
discussed this with the MW devs on the MW IRC channel. Given this, the
consensus in the @infra team (me included) is that we should bite the
bullet now and move to current MW 1.17.0 even given the extra work.
There are some performance risks associated with MW 1.17.0 which we need
to mitigate. However, given that we've already got a complete
LAMP+ATS+MW in an ESX-hosted VM performing like a dingbat, we really
only face the 1.15.5 -> 1.17.0 issues in this step.
* *DECISION*: Upgrade the ooo-wiki MediaWiki (MW) + all extensions to MW
v1.17.0.
* *DECISION*: I have agreed with infrastructure that we will keep 1 core
on "standby" so we can up the VM to a 2-core VM if we are seeing
unacceptable performance problems with one-core.
*BRANDING AND OTHER CONTENT/ACCESS CHANGES*
I've asked for feedback and "doer" support on the content aspect of the
wiki and the forum. There have been hundreds of associated emails and
unstructured discussion but no hard decisions taken. Drew and Dave F
have offered to get involved here, but we need to set up accounts etc.,
and move into execution.
* *DECISION*: We will cut over the wiki and the forums with the content
as-is and implement branding and access control changes within the a.o
infrastructure when volunteers come on-stream to resource this. This is
the standard "transfer then clean-up" approach adopted when a migration
is time critical.
*PRIORITISATION*
On the one hand the forums involve a lot less work and technical risk.
On the other they are arguably also used more than the wiki at the
moment, and the post rates are a LOT higher. Because I am a single
resource constraint, I can't do both at the same time. This one is
toss-a-coin, but my instinct is to get the one that we can do quickly
done. But if there is a strong consensus to swap the order then I can do it.
* *DECISION*: The priority is to work the forums over the wiki. We cut
the forums over first.
*CUT-OVER*
There are two facets to cut-over: content move and DNS-based IP
reassignment. Clearly we need to freeze update access to the services
prior to start of content move and continue update-freeze on the legacy
service. Bringing the content across involves a backup, copy and restore
which can be rehearsed and scripted, but in the case of the wiki, this
will be a few hours even if fully automated.
The main issues are:
* Migration coordination is more of a Programme Management /
Coordination challenge rather than a purely technical one. In the old
(paid/corporate) days, I would have a Programme Manager (PM)-type
working alongside me covering this aspect.
* *REQUEST* Would anyone who has previous experience of doing this like
to volunteer to take this role, so I can focus on the technical stuff?
* We have to transfer DNS control for oo.o to Apache even if the A
and MX records point to the @Oracle IP addresses.
* *ACTION*: Our PM needs to identify who the authoritative controller
for the DNS entry in A.o is and how we interface with him or her during
this change process.
* The DNS IP reassignment can take 24 hrs or more to ripple around
the worldwide hierarchy of DNS servers. During this period who goes to
which service is undefined.
* There are many ways "to skin the cat" of the migration process. All
will involve some service loss, but the complexity of the rehearsal and
planning can explode as we reduce this outage to zero. Complex plans
can also go wrong so my instinct is to keep it simple: halt the service
at a pre-notified time, transfer and start new service at a
pre-notified time.
* *DECISION*: Halt the forum service for a notified (24hr) window during
cutover. The migration uses fixed IPs, so DNS IP reassignment is
co-incident with service stop.
* *GOAL*: Cut over forums within 7 days from today. Date TBD by PM. I
can do the content move.
* *DECISION*: Halt the wiki service for a notified (24hr) window during
cutover. The migration uses fixed IPs, so DNS IP reassignment is
co-incident with service stop.
* *GOAL*: Cut over forums within 14 days from today. Date TBD by PM. I
can do the content move.
* We have some further caching tweaks on the interaction of the
MediaWiki application with the ATS HTTP reverse proxy cache, but these
are probably more nice-to-have than essential. More to the point these need
to be done on a system with production load patterns.
* *DECISION*: We will defer such tuning until post go-live.
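To illustrate the point above that, while a DNS change ripples through
the resolver hierarchy, "who goes to which service is undefined", here is
a minimal Python sketch of a check a coordinator might run during the
cut-over window. It is an illustration only, not part of the plan: the
function names, the state labels and both IP addresses (taken from the
reserved documentation ranges) are assumptions, and the real legacy and
new addresses are not given in this thread.

```python
import socket
from typing import Iterable, Set

# Hypothetical placeholder addresses from the reserved documentation ranges.
OLD_IP = "192.0.2.10"      # stands in for the legacy (Oracle-hosted) service
NEW_IP = "198.51.100.20"   # stands in for the new (Apache-hosted) service


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def cutover_state(addresses: Iterable[str], old_ip: str, new_ip: str) -> str:
    """Classify one resolver's answer as 'old', 'new', 'mixed' or 'unknown'."""
    seen = set(addresses)
    has_old = old_ip in seen
    has_new = new_ip in seen
    if has_old and has_new:
        return "mixed"
    if has_new:
        return "new"
    if has_old:
        return "old"
    # Empty answer, or an address that is neither the old nor the new service.
    return "unknown"
```

During the window one would call cutover_state(resolve_ipv4(name), OLD_IP,
NEW_IP) from several networks for whichever hostname is being moved; a
spread of "old" and "new" answers is the undefined period described above.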
*OTHER ISSUES*
* I am pretty much maxed out on "high-priority" tasks at the moment,
so I can't accept any more tasks until I've made material progress on my
current committed list.
* A number of us have serious concerns about the continuity issues
around the XXX@openoffice.org forwarding service. I feel that there is
a project consensus that this service needs to be relocated to Apache.org
infrastructure until we can sentence its content. The fact that we
don't have an owner doing as I am with the wiki and forum services is a
GRAVE CONCERN. I was hoping to step up to do this, but with the
decisions to upgrade phpBB and MW, the previous point now applies.
If you have got to the bottom of this, then thanks for your patience and
time.
Regards
Terry
PS. Please remember to break off specific discussion points onto
separate threads.
RE: [migration] Making the forums and wiki cut-over
Posted by "Dennis E. Hamilton" <de...@acm.org>.
Just one tiny clean-up.
Based on what you say, I believe the goal is to cut over forums in 7 days and the wiki in 14 days.
(I think there is a small typo in the second CUT-OVER GOAL.)
- Dennis
Thanks for all of this Terry. You've managed to juggle a complex number of considerations and you have my endless admiration for it.
-----Original Message-----
From: Terry Ellison [mailto:terrye@apache.org]
Sent: Thursday, August 25, 2011 08:59
To: ooo-dev@incubator.apache.org
Subject: [migration] Making the forums and wiki cut-over
Sorry in advance. I've tried to put this type of content on the cwiki
and it has largely been ignored by the DL, hence this email. It is a
long one, so I have tried to structure it and keep to simple plain text
(as the DL forwarder seems to screw up text/html markup).
Since this is so long, any reply threads could kill us. So can I ask
respondents to open up a new [migration] thread to discuss specific
points in depth rather than replying directly to this.
[ ... ]
*CUT-OVER*
[ ... ]
* *GOAL*: Cut over forums within 7 days from today. Date TBD by PM. I
can do the content move.
* *DECISION*: Halt the wiki service for a notified (24hr) window during
cutover. The migration uses fixed IPs, so DNS IP reassignment is
co-incident with service stop.
* *GOAL*: Cut over forums within 14 days from today. Date TBD by PM. I
can do the content move.
[ ... ]