You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@druid.apache.org by Jonathan Wei <jo...@apache.org> on 2019/03/05 22:04:32 UTC

Proposed website migration plan

We still need to complete the website migration to Apache infrastructure.

I'll propose the following plan:

Proposed Apache Druid website migration plan
========================================

These links have some previous discussion on the website migration:

https://lists.apache.org/thread.html/7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80@%3Cdev.druid.apache.org%3E
https://issues.apache.org/jira/browse/INFRA-17340

From the discussions above, the recommendation is to have 2 separate repos
for the website: one for source and another for built content that will be
served.

Generating site files
=======================

The Apache site update process will be similar to our current process.

Current process:
1. Push changes to https://github.com/druid-io/druid-io.github.io/tree/src
2. metamx bot picks up changes, builds, and commits to
https://github.com/druid-io/druid-io.github.io/tree/master
3. https://github.com/druid-io/druid-io.github.io/tree/master is served by
github pages

Apache process:
1. Push changes to https://github.com/apache/incubator-druid-website-src
2. Jenkins bot from Apache will build the website from source repo, commit
to https://github.com/apache/incubator-druid-website
3. Apache Druid website will be served from the content in
https://github.com/apache/incubator-druid-website (asf-site branch)


Hosting and SEO
================

The Apache site will be hosted at druid.apache.org on Apache
infrastructure: http://www.apache.org/dev/project-site.html

To preserve our search rankings, we can setup 301 redirects from the old
druid.io site to the corresponding pages on the druid.apache.org site. (
https://moz.com/learn/seo/redirection)

However, Github pages (which currently hosts the druid.io site) does not
support 301 redirects, so we propose the following:
- Setup a new Nginx server that will perform 301 redirects to
druid.apache.org for the druid.io. Imply can host this if needed.
- Update the druid.io DNS entry to point to this new Nginx server
- Shut down Github pages hosting for druid.io

In addition, we can also set canonical tags on our pages:
https://moz.com/learn/seo/canonicalization


Action items
===============
- Setup a Jenkins bot that builds the Apache website content from source
- Get the Apache website up
- Setup Nginx redirect server for druid.io
- Shutdown github pages and redirect DNS for druid.io to Nginx redirect
server
- Add canonical tags to pages

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
FYI -- I've also submitted an address change request from druid.io ->
druid.apache.org through Google's automated system.

On Wed, Jun 12, 2019 at 1:56 PM Gian Merlino <gi...@apache.org> wrote:

> Sorry, I mean references to https://github.com/druid-io/druid-io.github.io
> should be changed to https://github.com/apache/incubator-druid-website-src.
> That is the change that actually happened, and the one that makes sense.
>
> On Wed, Jun 12, 2019 at 4:25 PM Gian Merlino <gi...@apache.org> wrote:
>
>> Yep, any references to https://github.com/druid-io/druid-io.github.io
>> should be changed to https://github.com/apache/incubator-druid. Those
>> have all been updated now. I didn't see any references to
>> https://github.com/druid-io/druid -- I think we got them all in a
>> previous pass.
>>
>> There are still some lingering references to separate, but affiliated
>> projects like https://github.com/druid-io/pydruid. IMO, it makes sense
>> to leave them there for now, and incorporate them as subprojects of Druid
>> once Druid is a top level project.
>>
>> On Wed, Jun 12, 2019 at 12:18 PM Julian Hyde <jh...@gmail.com>
>> wrote:
>>
>>> Looks marvelous! Thanks for making it happen.
>>>
>>> I noticed at least one reference to https://github.com/druid-io on the
>>> site. Should be changed to https://github.com/apache/incubator-druid?
>>>
>>> > On Jun 11, 2019, at 9:44 PM, Gian Merlino <gi...@apache.org> wrote:
>>> >
>>> > This is now done: druid.io is redirecting to druid.apache.org!!
>>> >
>>> > Next, we'll add the stuff required by
>>> > https://whimsy.apache.org/pods/project/druid. Then, we should be good
>>> to go
>>> > on the website migration. (Behind the scenes, Vadim Ogievetsky has been
>>> > helping tons with this -- thanks a lot!)
>>> >
>>> >> On Mon, Jun 10, 2019 at 9:00 AM David Lim <da...@apache.org>
>>> wrote:
>>> >>
>>> >> No objections from me - thank you for testing this out.
>>> >>
>>> >>>> On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> >>>
>>> >>> It looks like Google has picked up the 301 and [druid use cases] #1
>>> >> result
>>> >>> is https://druid.apache.org/use-cases now. For [what is druid used
>>> for]
>>> >>> it's not #4 instead of #2. I think this is the best we are likely to
>>> >> get. I
>>> >>> am ready to flip the switch if there aren't any objections.
>>> >>>
>>> >>> On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
>>> >>>
>>> >>>> Another update: as of
>>> >>>> https://github.com/apache/incubator-druid-website-src/pull/1 and
>>> >>>> https://github.com/apache/incubator-druid-website/pull/7, the
>>> >>>> https://druid.apache.org/ site is now serving almost all pages from
>>> >>>> druid.io, except:
>>> >>>>
>>> >>>> - the index page (it still has a placeholder until we flip the
>>> switch)
>>> >>>> - the download page (it has a differently-designed download page:
>>> >> compare
>>> >>>> http://druid.io/downloads.html with
>>> >>> http://druid.apache.org/downloads.html
>>> >>>> - any docs older than 0.13.0 (they aren't Apache releases)
>>> >>>>
>>> >>>> If you navigate to https://druid.apache.org/ + any other path from
>>> >>>> druid.io, you should see the page.
>>> >>>>
>>> >>>> I'm hoping to confirm that search engines pick up the 301 for
>>> >>>> http://druid.io/use-cases before flipping the switch. Hopefully
>>> that
>>> >>>> doesn't take much longer. If it does we should talk about how we
>>> want
>>> >> to
>>> >>>> proceed.
>>> >>>>
>>> >>>>> On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> >>>>>
>>> >>>>> An update: we do have a redirect server set up on druid.io now:
>>> note
>>> >>>>> that http://druid.io/community/ and http://druid.io/use-cases both
>>> >>>>> redirect to https://druid.apache.org. I just set up the latter
>>> >> redirect
>>> >>>>> (on /use-cases) as part of 'test this first on a single page'. All
>>> >> other
>>> >>>>> druid.io URLs are still being hosted using the content from GitHub
>>> >>> pages
>>> >>>>> at https://github.com/druid-io/druid-io.github.io.
>>> >>>>>
>>> >>>>> Search engine watch: currently, http://druid.io is the #1 link for
>>> >>>>> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
>>> >>> looking
>>> >>>>> infobox on Google & Bing). For [what is druid used for], it's #2 on
>>> >>> Google,
>>> >>>>> and not ranked on the first page on Bing & DDG. Will monitor this
>>> over
>>> >>> the
>>> >>>>> next few days.
>>> >>>>>
>>> >>>>>> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> >>>>>>
>>> >>>>>> Hi all,
>>> >>>>>>
>>> >>>>>> It sounds like we will need a redirect server that issues 301s
>>> from
>>> >>> each
>>> >>>>>> druid.io page to the corresponding druid.apache.org page. Charles
>>> >> and
>>> >>> I
>>> >>>>>> spoke offline and thought that something like Jon's original
>>> proposal
>>> >>> is
>>> >>>>>> the best way to go. I am going to suggest we get started on this,
>>> as
>>> >>> it's
>>> >>>>>> the last major piece of infra to move to ASF.
>>> >>>>>>
>>> >>>>>> 1) Set up a redirect server to perform 301 redirects to
>>> >>> druid.apache.org
>>> >>>>>> 2) Post all druid.io content on druid.apache.org
>>> >>>>>> 3) Update druid.io DNS to point to the redirect server
>>> >>>>>> 4) Shut down GitHub pages hosting for druid.io
>>> >>>>>>
>>> >>>>>> Steps (2) and (3) should be done as close in time as possible so
>>> >> there
>>> >>>>>> is no confusion as to which version of the pages is canonical.
>>> >>>>>>
>>> >>>>>> For the redirect server, two viable options are an nginx server
>>> or an
>>> >>> S3
>>> >>>>>> webpage redirect (
>>> >>
>>> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
>>> >>> ).
>>> >>>>>> Just like we did with the HTML-level redirect, I suggest we test
>>> this
>>> >>> first
>>> >>>>>> on a single page. We can do that by having the redirect server
>>> >>> initially
>>> >>>>>> start off by hosting all druid.io content (so it's
>>> indistinguishable
>>> >>>>>> from the GitHub-pages-based site) except for a single page, which
>>> it
>>> >>>>>> redirects using HTTP 301 to druid.apache.org.
>>> >>>>>>
>>> >>>>>> I'm planning to start looking into this, so anyone around please
>>> >> speak
>>> >>>>>> up if you have any advice or alternative approaches to suggest.
>>> >>>>>>
>>> >>>>>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
>>> >>> wrote:
>>> >>>>>>
>>> >>>>>>> Thanks for checking the SEO state, that's somewhat disappointing.
>>> >>>>>>>
>>> >>>>>>> For Bing, it sounds like they really want you to use 301s (
>>> >>>>>>>
>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>>> >>>>>>>
>>> >>>>>>>> Bing prefers you use a 301 permanent redirect when moving
>>> content,
>>> >>>>>>> should
>>> >>>>>>> the move be permanent.  If the move is temporary, then a 302
>>> >> temporary
>>> >>>>>>> redirect will work fine.  Do not use the rel=canonical tag in
>>> place
>>> >>> of a
>>> >>>>>>> proper redirect.
>>> >>>>>>>
>>> >>>>>>> I wasn't able to find similar guidance re: this issue for
>>> >> DuckDuckGo.
>>> >>>>>>>
>>> >>>>>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
>>> >>> wrote:
>>> >>>>>>>
>>> >>>>>>>> Another update: SEO is not looking great after another day
>>> passed.
>>> >>>>>>> For a
>>> >>>>>>>> search for "druid community", both http://druid.io/community
>>> and
>>> >>>>>>>> https://druid.apache.org/community/ have dropped off the front
>>> >> page
>>> >>>>>>> of
>>> >>>>>>>> Bing
>>> >>>>>>>> completely. On Google, the legacy version is gone (as expected)
>>> >> but
>>> >>>>>>> the
>>> >>>>>>>> Apache version has dropped to the #3 spot (down from #2
>>> yesterday;
>>> >>>>>>> and down
>>> >>>>>>>> from where the legacy page was pre-migration, which was #1).
>>> >>>>>>>>
>>> >>>>>>>> I think this means we do need to try to get 301s figured out.
>>> >>>>>>>>
>>> >>>>>>>> On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
>>> >>> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Google has picked up the new URL as of today but Bing hasn't.
>>> >>>>>>> Neither has
>>> >>>>>>>>> DuckDuckGo for that matter.
>>> >>>>>>>>>
>>> >>>>>>>>> Currently, Google is showing
>>> >> https://druid.apache.org/community/
>>> >>>>>>> in the
>>> >>>>>>>>> #2 spot and Bing/DDG are showing http://druid.io/community in
>>> >> the
>>> >>>>>>> top
>>> >>>>>>>>> spot. Ominously, the latter two _have_ picked up a page title
>>> >>>>>>> change to
>>> >>>>>>>>> "Redirecting..."
>>> >>>>>>>>>
>>> >>>>>>>>> On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gian@apache.org
>>> >
>>> >>>>>>> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> An update: this is done now since a couple of days ago, but
>>> >>> Google
>>> >>>>>>> and
>>> >>>>>>>>>> Bing are still showing http://druid.io/community for a search
>>> >>> for
>>> >>>>>>>> "druid
>>> >>>>>>>>>> community" or even "apache druid community":
>>> >>>>>>>>>>
>>> >>>>>>>>>> - https://www.google.com/search?q=druid+community
>>> >>>>>>>>>> - https://www.bing.com/search?q=druid+community
>>> >>>>>>>>>>
>>> >>>>>>>>>> I suggest we keep an eye on the search engines and make sure
>>> >> they
>>> >>>>>>> can
>>> >>>>>>>>>> figure out that the site has changed (I'm not sure how often
>>> >> they
>>> >>>>>>>> crawl).
>>> >>>>>>>>>> If they can then it would make sense to me to move forward
>>> with
>>> >>>>>>>> migrating
>>> >>>>>>>>>> the entire web site.
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <
>>> >> jonwei@apache.org>
>>> >>>>>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Correction: Xavier was suggesting we use
>>> >>
>>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>>> >>>>>>>>>>> ,
>>> >>>>>>>>>>> the existing redirect system used by the Druid website.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> I've opened PRs to do the community page migration test:
>>> >>>>>>>>>>> https://github.com/apache/incubator-druid-website/pull/3
>>> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/pull/591
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <
>>> gian@apache.org
>>> >>>
>>> >>>>>>> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> That sounds good to me. I would also consider adding
>>> >> canonical
>>> >>>>>>> tags
>>> >>>>>>>> to
>>> >>>>>>>>>>> all
>>> >>>>>>>>>>>> druid.apache.org pages so we don't have
>>> >>>>>>> druid.incubator.apache.org
>>> >>>>>>>> and
>>> >>>>>>>>>>>> druid.apache.org both floating around (not to mention
>>> >>>>>>> http/https
>>> >>>>>>>>>>> version
>>> >>>>>>>>>>>> of
>>> >>>>>>>>>>>> both).
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
>>> >>> jonwei@apache.org
>>> >>>>>>>>
>>> >>>>>>>>>>> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>> For redirects, Xavier has suggested using
>>> >>> https://help.github.com/en/articles/redirects-on-github-pages
>>> >>>>>>> to
>>> >>>>>>>>>>>> redirect
>>> >>>>>>>>>>>>> to druid.apache.org as a way to transition before the
>>> >>> domain
>>> >>>>>>>>>>> migration
>>> >>>>>>>>>>>>> occurs, and believes that it would have the same SEO
>>> >> effects
>>> >>>>>>> as a
>>> >>>>>>>> 301
>>> >>>>>>>>>>>>> redirect after the new pages are indexed.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> I think we could try migrating the current Community page
>>> >> to
>>> >>>>>>>>>>>>> druid.apache.org with Github redirects and canonical
>>> >> links
>>> >>>>>>>> pointing
>>> >>>>>>>>>>> to
>>> >>>>>>>>>>>> the
>>> >>>>>>>>>>>>> https://druid.apache.org version. If that goes well, we
>>> >>> could
>>> >>>>>>>>>>> continue
>>> >>>>>>>>>>>>> migrating more pages.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> What are the community's thoughts on that?
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>>> Jon
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
>>> >>> gian@apache.org
>>> >>>>>>>>
>>> >>>>>>>>>>> wrote:
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> OpenOffice and Groovy both chose to sort of "meld" their
>>> >>>>>>> classic
>>> >>>>>>>>>>> and
>>> >>>>>>>>>>>>> Apache
>>> >>>>>>>>>>>>>> sites together: https://www.openoffice.org/,
>>> >>>>>>>>>>> http://groovy-lang.org/.
>>> >>>>>>>>>>>>> Note
>>> >>>>>>>>>>>>>> how when you click around, you get shuttled between the
>>> >>>>>>> classic
>>> >>>>>>>>>>> domain
>>> >>>>>>>>>>>>> and
>>> >>>>>>>>>>>>>> the Apache domain. Some pages are available on both
>>> >> sites,
>>> >>>>>>> like
>>> >>>>>>>>>>>>>> http://groovy-lang.org/download.html and
>>> >>>>>>>>>>>>>> https://groovy.apache.org/download.html (which don't
>>> >> use
>>> >>>>>>>> canonical
>>> >>>>>>>>>>>> link
>>> >>>>>>>>>>>>>> tags -- does not seem like a good example to follow!).
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> NetBeans (still incubating) also has a "melded" site at
>>> >>>>>>>>>>>>>> https://netbeans.org/ but doesn't seem to consider
>>> >> itself
>>> >>>>>>> done
>>> >>>>>>>>>>> yet.
>>> >>>>>>>>>>>> They
>>> >>>>>>>>>>>>>> are discussing plans on their lists & wiki to do
>>> >> redirects
>>> >>>>>>> from
>>> >>>>>>>>>>>>>> netbeans.org
>>> >>>>>>>>>>>>>> to netbeans.apache.org:
>>> >>
>>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>>> >>>>>>>>>>>>>> ,
>>> >>
>>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>>> >>>>>>>>>>>>>> .
>>> >>>>>>>>>>>>>> As of today the domain has been donated to ASF, but the
>>> >>>>>>> server is
>>> >>>>>>>>>>> still
>>> >>>>>>>>>>>>> run
>>> >>>>>>>>>>>>>> by Oracle, so the plan doesn't seem to be finished yet.
>>> >>>>>>> (WHOIS
>>> >>>>>>>> for
>>> >>>>>>>>>>>>>> netbeans.org shows ASF as the registrant; netbeans.org
>>> >>>>>>> resolves
>>> >>>>>>>> to
>>> >>>>>>>>>>>>>> lb-netbeans-cms-adc.oracle.com.)
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> The melded sites don't really seem better to me than
>>> >>>>>>> redirecting
>>> >>>>>>>>>>> all
>>> >>>>>>>>>>>> urls
>>> >>>>>>>>>>>>>> on the domain. I guess it depends on if we want to keep
>>> >>>>>>> druid.io
>>> >>>>>>>>>>> as
>>> >>>>>>>>>>>> the
>>> >>>>>>>>>>>>>> official domain forever, or if we think
>>> >> druid.apache.org
>>> >>> is
>>> >>>>>>>>>>> cooler. I
>>> >>>>>>>>>>>>>> definitely think druid.apache.org is cooler so my vote
>>> >> is
>>> >>>>>>> there
>>> >>>>>>>>>>> :).
>>> >>>>>>>>>>>> It's
>>> >>>>>>>>>>>>>> also nice that it supports https. (druid.io does not
>>> >>> today,
>>> >>>>>>>> since
>>> >>>>>>>>>>> it's
>>> >>>>>>>>>>>>> on
>>> >>>>>>>>>>>>>> GitHub pages, which doesn't support https for custom
>>> >>>>>>> domains.)
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>>> >>>>>>>>>>>>>> <ch...@snap.com.invalid> wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Are there other projects who have transitioned an
>>> >>>>>>> independently
>>> >>>>>>>>>>>>>> successful
>>> >>>>>>>>>>>>>>> domain name to an apache one?
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>>> >>>>>>> davidlim@apache.org>
>>> >>>>>>>>>>>> wrote:
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Who has control over the druid.io domain? Charles
>>> >>>>>>> would that
>>> >>>>>>>>>>> be
>>> >>>>>>>>>>>> you?
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> We'd need support from them for the DNS redirect.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>>> >>>>>>>> jonwei@apache.org
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>> wrote:
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> We still need to complete the website migration to
>>> >>>>>>> Apache
>>> >>>>>>>>>>>>>>> infrastructure.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> I'll propose the following plan:
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Proposed Apache Druid website migration plan
>>> >>>>>>>>>>>>>>>>> ========================================
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> These links have some previous discussion on the
>>> >>>>>>> website
>>> >>>>>>>>>>>> migration:
>>> >>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>>> >>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> From the discussions above, the recommendation is
>>> >> to
>>> >>>>>>> have 2
>>> >>>>>>>>>>>>> separate
>>> >>>>>>>>>>>>>>>> repos
>>> >>>>>>>>>>>>>>>>> for the website: one for source and another for
>>> >>> built
>>> >>>>>>>> content
>>> >>>>>>>>>>>> that
>>> >>>>>>>>>>>>>> will
>>> >>>>>>>>>>>>>>>> be
>>> >>>>>>>>>>>>>>>>> served.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Generating site files
>>> >>>>>>>>>>>>>>>>> =======================
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> The Apache site update process will be similar to
>>> >>> our
>>> >>>>>>>> current
>>> >>>>>>>>>>>>>> process.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Current process:
>>> >>>>>>>>>>>>>>>>> 1. Push changes to
>>> >>> https://github.com/druid-io/druid-io.github.io/tree/src
>>> >>>>>>>>>>>>>>>>> 2. metamx bot picks up changes, builds, and
>>> >> commits
>>> >>> to
>>> >>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master
>>> >>>>>>>>>>>>>>>>> 3.
>>> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master
>>> is
>>> >>>>>>>>>>>>>>> served
>>> >>>>>>>>>>>>>>>> by
>>> >>>>>>>>>>>>>>>>> github pages
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Apache process:
>>> >>>>>>>>>>>>>>>>> 1. Push changes to
>>> >>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website-src
>>> >>>>>>>>>>>>>>>>> 2. Jenkins bot from Apache will build the website
>>> >>> from
>>> >>>>>>>> source
>>> >>>>>>>>>>>> repo,
>>> >>>>>>>>>>>>>>>> commit
>>> >>>>>>>>>>>>>>>>> to
>>> >>> https://github.com/apache/incubator-druid-website
>>> >>>>>>>>>>>>>>>>> 3. Apache Druid website will be served from the
>>> >>>>>>> content in
>>> >>>>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website
>>> >>>>>>>> (asf-site
>>> >>>>>>>>>>>>> branch)
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Hosting and SEO
>>> >>>>>>>>>>>>>>>>> ================
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> The Apache site will be hosted at
>>> >> druid.apache.org
>>> >>> on
>>> >>>>>>>> Apache
>>> >>>>>>>>>>>>>>>>> infrastructure:
>>> >>
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> To preserve our search rankings, we can setup 301
>>> >>>>>>> redirects
>>> >>>>>>>>>>> from
>>> >>>>>>>>>>>>> the
>>> >>>>>>>>>>>>>>> old
>>> >>>>>>>>>>>>>>>>> druid.io site to the corresponding pages on the
>>> >>>>>>>>>>> druid.apache.org
>>> >>>>>>>>>>>>>>> site. (
>>> >>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>>> >>>>>>>>>>>>>>>> )
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> However, Github pages (which currently hosts the
>>> >>>>>>> druid.io
>>> >>>>>>>>>>> site)
>>> >>>>>>>>>>>>> does
>>> >>>>>>>>>>>>>>> not
>>> >>>>>>>>>>>>>>>>> support 301 redirects, so we propose the
>>> >> following:
>>> >>>>>>>>>>>>>>>>> - Setup a new Nginx server that will perform 301
>>> >>>>>>> redirects
>>> >>>>>>>> to
>>> >>>>>>>>>>>>>>>>> druid.apache.org for the druid.io. Imply can host
>>> >>>>>>> this if
>>> >>>>>>>>>>>> needed.
>>> >>>>>>>>>>>>>>>>> - Update the druid.io DNS entry to point to this
>>> >>> new
>>> >>>>>>> Nginx
>>> >>>>>>>>>>>> server
>>> >>>>>>>>>>>>>>>>> - Shut down Github pages hosting for druid.io
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> In addition, we can also set canonical tags on our
>>> >>>>>>> pages:
>>> >>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Action items
>>> >>>>>>>>>>>>>>>>> ===============
>>> >>>>>>>>>>>>>>>>> - Setup a Jenkins bot that builds the Apache
>>> >> website
>>> >>>>>>>> content
>>> >>>>>>>>>>> from
>>> >>>>>>>>>>>>>>> source
>>> >>>>>>>>>>>>>>>>> - Get the Apache website up
>>> >>>>>>>>>>>>>>>>> - Setup Nginx redirect server for druid.io
>>> >>>>>>>>>>>>>>>>> - Shutdown github pages and redirect DNS for
>>> >>> druid.io
>>> >>>>>>> to
>>> >>>>>>>>>>> Nginx
>>> >>>>>>>>>>>>>>> redirect
>>> >>>>>>>>>>>>>>>>> server
>>> >>>>>>>>>>>>>>>>> - Add canonical tags to pages
>>> >>
>>>
>>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Sorry, I mean references to https://github.com/druid-io/druid-io.github.io
should be changed to https://github.com/apache/incubator-druid-website-src.
That is the change that actually happened, and the one that makes sense.

On Wed, Jun 12, 2019 at 4:25 PM Gian Merlino <gi...@apache.org> wrote:

> Yep, any references to https://github.com/druid-io/druid-io.github.io
> should be changed to https://github.com/apache/incubator-druid. Those
> have all been updated now. I didn't see any references to
> https://github.com/druid-io/druid -- I think we got them all in a
> previous pass.
>
> There are still some lingering references to separate, but affiliated
> projects like https://github.com/druid-io/pydruid. IMO, it makes sense to
> leave them there for now, and incorporate them as subprojects of Druid once
> Druid is a top level project.
>
> On Wed, Jun 12, 2019 at 12:18 PM Julian Hyde <jh...@gmail.com>
> wrote:
>
>> Looks marvelous! Thanks for making it happen.
>>
>> I noticed at least one reference to https://github.com/druid-io on the
>> site. Should be changed to https://github.com/apache/incubator-druid?
>>
>> > On Jun 11, 2019, at 9:44 PM, Gian Merlino <gi...@apache.org> wrote:
>> >
>> > This is now done: druid.io is redirecting to druid.apache.org!!
>> >
>> > Next, we'll add the stuff required by
>> > https://whimsy.apache.org/pods/project/druid. Then, we should be good
>> to go
>> > on the website migration. (Behind the scenes, Vadim Ogievetsky has been
>> > helping tons with this -- thanks a lot!)
>> >
>> >> On Mon, Jun 10, 2019 at 9:00 AM David Lim <da...@apache.org> wrote:
>> >>
>> >> No objections from me - thank you for testing this out.
>> >>
>> >>>> On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org>
>> wrote:
>> >>>
>> >>> It looks like Google has picked up the 301 and [druid use cases] #1
>> >> result
>> >>> is https://druid.apache.org/use-cases now. For [what is druid used
>> for]
>> >>> it's not #4 instead of #2. I think this is the best we are likely to
>> >> get. I
>> >>> am ready to flip the switch if there aren't any objections.
>> >>>
>> >>> On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
>> >>>
>> >>>> Another update: as of
>> >>>> https://github.com/apache/incubator-druid-website-src/pull/1 and
>> >>>> https://github.com/apache/incubator-druid-website/pull/7, the
>> >>>> https://druid.apache.org/ site is now serving almost all pages from
>> >>>> druid.io, except:
>> >>>>
>> >>>> - the index page (it still has a placeholder until we flip the
>> switch)
>> >>>> - the download page (it has a differently-designed download page:
>> >> compare
>> >>>> http://druid.io/downloads.html with
>> >>> http://druid.apache.org/downloads.html
>> >>>> - any docs older than 0.13.0 (they aren't Apache releases)
>> >>>>
>> >>>> If you navigate to https://druid.apache.org/ + any other path from
>> >>>> druid.io, you should see the page.
>> >>>>
>> >>>> I'm hoping to confirm that search engines pick up the 301 for
>> >>>> http://druid.io/use-cases before flipping the switch. Hopefully that
>> >>>> doesn't take much longer. If it does we should talk about how we want
>> >> to
>> >>>> proceed.
>> >>>>
>> >>>>> On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org>
>> wrote:
>> >>>>>
>> >>>>> An update: we do have a redirect server set up on druid.io now:
>> note
>> >>>>> that http://druid.io/community/ and http://druid.io/use-cases both
>> >>>>> redirect to https://druid.apache.org. I just set up the latter
>> >> redirect
>> >>>>> (on /use-cases) as part of 'test this first on a single page'. All
>> >> other
>> >>>>> druid.io URLs are still being hosted using the content from GitHub
>> >>> pages
>> >>>>> at https://github.com/druid-io/druid-io.github.io.
>> >>>>>
>> >>>>> Search engine watch: currently, http://druid.io is the #1 link for
>> >>>>> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
>> >>> looking
>> >>>>> infobox on Google & Bing). For [what is druid used for], it's #2 on
>> >>> Google,
>> >>>>> and not ranked on the first page on Bing & DDG. Will monitor this
>> over
>> >>> the
>> >>>>> next few days.
>> >>>>>
>> >>>>>> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org>
>> wrote:
>> >>>>>>
>> >>>>>> Hi all,
>> >>>>>>
>> >>>>>> It sounds like we will need a redirect server that issues 301s from
>> >>> each
>> >>>>>> druid.io page to the corresponding druid.apache.org page. Charles
>> >> and
>> >>> I
>> >>>>>> spoke offline and thought that something like Jon's original
>> proposal
>> >>> is
>> >>>>>> the best way to go. I am going to suggest we get started on this,
>> as
>> >>> it's
>> >>>>>> the last major piece of infra to move to ASF.
>> >>>>>>
>> >>>>>> 1) Set up a redirect server to perform 301 redirects to
>> >>> druid.apache.org
>> >>>>>> 2) Post all druid.io content on druid.apache.org
>> >>>>>> 3) Update druid.io DNS to point to the redirect server
>> >>>>>> 4) Shut down GitHub pages hosting for druid.io
>> >>>>>>
>> >>>>>> Steps (2) and (3) should be done as close in time as possible so
>> >> there
>> >>>>>> is no confusion as to which version of the pages is canonical.
>> >>>>>>
>> >>>>>> For the redirect server, two viable options are an nginx server or
>> an
>> >>> S3
>> >>>>>> webpage redirect (
>> >>
>> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
>> >>> ).
>> >>>>>> Just like we did with the HTML-level redirect, I suggest we test
>> this
>> >>> first
>> >>>>>> on a single page. We can do that by having the redirect server
>> >>> initially
>> >>>>>> start off by hosting all druid.io content (so it's
>> indistinguishable
>> >>>>>> from the GitHub-pages-based site) except for a single page, which
>> it
>> >>>>>> redirects using HTTP 301 to druid.apache.org.
>> >>>>>>
>> >>>>>> I'm planning to start looking into this, so anyone around please
>> >> speak
>> >>>>>> up if you have any advice or alternative approaches to suggest.
>> >>>>>>
>> >>>>>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
>> >>> wrote:
>> >>>>>>
>> >>>>>>> Thanks for checking the SEO state, that's somewhat disappointing.
>> >>>>>>>
>> >>>>>>> For Bing, it sounds like they really want you to use 301s (
>> >>>>>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a
>> ):
>> >>>>>>>
>> >>>>>>>> Bing prefers you use a 301 permanent redirect when moving
>> content,
>> >>>>>>> should
>> >>>>>>> the move be permanent.  If the move is temporary, then a 302
>> >> temporary
>> >>>>>>> redirect will work fine.  Do not use the rel=canonical tag in
>> place
>> >>> of a
>> >>>>>>> proper redirect.
>> >>>>>>>
>> >>>>>>> I wasn't able to find similar guidance re: this issue for
>> >> DuckDuckGo.
>> >>>>>>>
>> >>>>>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
>> >>> wrote:
>> >>>>>>>
>> >>>>>>>> Another update: SEO is not looking great after another day
>> passed.
>> >>>>>>> For a
>> >>>>>>>> search for "druid community", both http://druid.io/community and
>> >>>>>>>> https://druid.apache.org/community/ have dropped off the front
>> >> page
>> >>>>>>> of
>> >>>>>>>> Bing
>> >>>>>>>> completely. On Google, the legacy version is gone (as expected)
>> >> but
>> >>>>>>> the
>> >>>>>>>> Apache version has dropped to the #3 spot (down from #2
>> yesterday;
>> >>>>>>> and down
>> >>>>>>>> from where the legacy page was pre-migration, which was #1).
>> >>>>>>>>
>> >>>>>>>> I think this means we do need to try to get 301s figured out.
>> >>>>>>>>
>> >>>>>>>> On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
>> >>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Google has picked up the new URL as of today but Bing hasn't.
>> >>>>>>> Neither has
>> >>>>>>>>> DuckDuckGo for that matter.
>> >>>>>>>>>
>> >>>>>>>>> Currently, Google is showing
>> >> https://druid.apache.org/community/
>> >>>>>>> in the
>> >>>>>>>>> #2 spot and Bing/DDG are showing http://druid.io/community in
>> >> the
>> >>>>>>> top
>> >>>>>>>>> spot. Ominously, the latter two _have_ picked up a page title
>> >>>>>>> change to
>> >>>>>>>>> "Redirecting..."
>> >>>>>>>>>
>> >>>>>>>>> On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
>> >>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> An update: this is done now since a couple of days ago, but
>> >>> Google
>> >>>>>>> and
>> >>>>>>>>>> Bing are still showing http://druid.io/community for a search
>> >>> for
>> >>>>>>>> "druid
>> >>>>>>>>>> community" or even "apache druid community":
>> >>>>>>>>>>
>> >>>>>>>>>> - https://www.google.com/search?q=druid+community
>> >>>>>>>>>> - https://www.bing.com/search?q=druid+community
>> >>>>>>>>>>
>> >>>>>>>>>> I suggest we keep an eye on the search engines and make sure
>> >> they
>> >>>>>>> can
>> >>>>>>>>>> figure out that the site has changed (I'm not sure how often
>> >> they
>> >>>>>>>> crawl).
>> >>>>>>>>>> If they can then it would make sense to me to move forward with
>> >>>>>>>> migrating
>> >>>>>>>>>> the entire web site.
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <
>> >> jonwei@apache.org>
>> >>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Correction: Xavier was suggesting we use
>> >>
>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>> >>>>>>>>>>> ,
>> >>>>>>>>>>> the existing redirect system used by the Druid website.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I've opened PRs to do the community page migration test:
>> >>>>>>>>>>> https://github.com/apache/incubator-druid-website/pull/3
>> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/pull/591
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gian@apache.org
>> >>>
>> >>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> That sounds good to me. I would also consider adding
>> >> canonical
>> >>>>>>> tags
>> >>>>>>>> to
>> >>>>>>>>>>> all
>> >>>>>>>>>>>> druid.apache.org pages so we don't have
>> >>>>>>> druid.incubator.apache.org
>> >>>>>>>> and
>> >>>>>>>>>>>> druid.apache.org both floating around (not to mention
>> >>>>>>> http/https
>> >>>>>>>>>>> version
>> >>>>>>>>>>>> of
>> >>>>>>>>>>>> both).
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
>> >>> jonwei@apache.org
>> >>>>>>>>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> For redirects, Xavier has suggested using
>> >>> https://help.github.com/en/articles/redirects-on-github-pages
>> >>>>>>> to
>> >>>>>>>>>>>> redirect
>> >>>>>>>>>>>>> to druid.apache.org as a way to transition before the
>> >>> domain
>> >>>>>>>>>>> migration
>> >>>>>>>>>>>>> occurs, and believes that it would have the same SEO
>> >> effects
>> >>>>>>> as a
>> >>>>>>>> 301
>> >>>>>>>>>>>>> redirect after the new pages are indexed.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I think we could try migrating the current Community page
>> >> to
>> >>>>>>>>>>>>> druid.apache.org with Github redirects and canonical
>> >> links
>> >>>>>>>> pointing
>> >>>>>>>>>>> to
>> >>>>>>>>>>>> the
>> >>>>>>>>>>>>> https://druid.apache.org version. If that goes well, we
>> >>> could
>> >>>>>>>>>>> continue
>> >>>>>>>>>>>>> migrating more pages.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> What are the community's thoughts on that?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>> Jon
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
>> >>> gian@apache.org
>> >>>>>>>>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> OpenOffice and Groovy both chose to sort of "meld" their
>> >>>>>>> classic
>> >>>>>>>>>>> and
>> >>>>>>>>>>>>> Apache
>> >>>>>>>>>>>>>> sites together: https://www.openoffice.org/,
>> >>>>>>>>>>> http://groovy-lang.org/.
>> >>>>>>>>>>>>> Note
>> >>>>>>>>>>>>>> how when you click around, you get shuttled between the
>> >>>>>>> classic
>> >>>>>>>>>>> domain
>> >>>>>>>>>>>>> and
>> >>>>>>>>>>>>>> the Apache domain. Some pages are available on both
>> >> sites,
>> >>>>>>> like
>> >>>>>>>>>>>>>> http://groovy-lang.org/download.html and
>> >>>>>>>>>>>>>> https://groovy.apache.org/download.html (which don't
>> >> use
>> >>>>>>>> canonical
>> >>>>>>>>>>>> link
>> >>>>>>>>>>>>>> tags -- does not seem like a good example to follow!).
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> NetBeans (still incubating) also has a "melded" site at
>> >>>>>>>>>>>>>> https://netbeans.org/ but doesn't seem to consider
>> >> itself
>> >>>>>>> done
>> >>>>>>>>>>> yet.
>> >>>>>>>>>>>> They
>> >>>>>>>>>>>>>> are discussing plans on their lists & wiki to do
>> >> redirects
>> >>>>>>> from
>> >>>>>>>>>>>>>> netbeans.org
>> >>>>>>>>>>>>>> to netbeans.apache.org:
>> >>
>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>> >>>>>>>>>>>>>> ,
>> >>
>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>> >>>>>>>>>>>>>> .
>> >>>>>>>>>>>>>> As of today the domain has been donated to ASF, but the
>> >>>>>>> server is
>> >>>>>>>>>>> still
>> >>>>>>>>>>>>> run
>> >>>>>>>>>>>>>> by Oracle, so the plan doesn't seem to be finished yet.
>> >>>>>>> (WHOIS
>> >>>>>>>> for
>> >>>>>>>>>>>>>> netbeans.org shows ASF as the registrant; netbeans.org
>> >>>>>>> resolves
>> >>>>>>>> to
>> >>>>>>>>>>>>>> lb-netbeans-cms-adc.oracle.com.)
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> The melded sites don't really seem better to me than
>> >>>>>>> redirecting
>> >>>>>>>>>>> all
>> >>>>>>>>>>>> urls
>> >>>>>>>>>>>>>> on the domain. I guess it depends on if we want to keep
>> >>>>>>> druid.io
>> >>>>>>>>>>> as
>> >>>>>>>>>>>> the
>> >>>>>>>>>>>>>> official domain forever, or if we think
>> >> druid.apache.org
>> >>> is
>> >>>>>>>>>>> cooler. I
>> >>>>>>>>>>>>>> definitely think druid.apache.org is cooler so my vote
>> >> is
>> >>>>>>> there
>> >>>>>>>>>>> :).
>> >>>>>>>>>>>> It's
>> >>>>>>>>>>>>>> also nice that it supports https. (druid.io does not
>> >>> today,
>> >>>>>>>> since
>> >>>>>>>>>>> it's
>> >>>>>>>>>>>>> on
>> >>>>>>>>>>>>>> GitHub pages, which doesn't support https for custom
>> >>>>>>> domains.)
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>> >>>>>>>>>>>>>> <ch...@snap.com.invalid> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Are there other projects who have transitioned an
>> >>>>>>> independently
>> >>>>>>>>>>>>>> successful
>> >>>>>>>>>>>>>>> domain name to an apache one?
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>> >>>>>>> davidlim@apache.org>
>> >>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Who has control over the druid.io domain? Charles
>> >>>>>>> would that
>> >>>>>>>>>>> be
>> >>>>>>>>>>>> you?
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> We'd need support from them for the DNS redirect.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>> >>>>>>>> jonwei@apache.org
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> We still need to complete the website migration to
>> >>>>>>> Apache
>> >>>>>>>>>>>>>>> infrastructure.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> I'll propose the following plan:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Proposed Apache Druid website migration plan
>> >>>>>>>>>>>>>>>>> ========================================
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> These links have some previous discussion on the
>> >>>>>>> website
>> >>>>>>>>>>>> migration:
>> >>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>> >>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> From the discussions above, the recommendation is
>> >> to
>> >>>>>>> have 2
>> >>>>>>>>>>>>> separate
>> >>>>>>>>>>>>>>>> repos
>> >>>>>>>>>>>>>>>>> for the website: one for source and another for
>> >>> built
>> >>>>>>>> content
>> >>>>>>>>>>>> that
>> >>>>>>>>>>>>>> will
>> >>>>>>>>>>>>>>>> be
>> >>>>>>>>>>>>>>>>> served.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Generating site files
>> >>>>>>>>>>>>>>>>> =======================
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> The Apache site update process will be similar to
>> >>> our
>> >>>>>>>> current
>> >>>>>>>>>>>>>> process.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Current process:
>> >>>>>>>>>>>>>>>>> 1. Push changes to
>> >>> https://github.com/druid-io/druid-io.github.io/tree/src
>> >>>>>>>>>>>>>>>>> 2. metamx bot picks up changes, builds, and
>> >> commits
>> >>> to
>> >>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master
>> >>>>>>>>>>>>>>>>> 3.
>> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master is
>> >>>>>>>>>>>>>>> served
>> >>>>>>>>>>>>>>>> by
>> >>>>>>>>>>>>>>>>> github pages
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Apache process:
>> >>>>>>>>>>>>>>>>> 1. Push changes to
>> >>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website-src
>> >>>>>>>>>>>>>>>>> 2. Jenkins bot from Apache will build the website
>> >>> from
>> >>>>>>>> source
>> >>>>>>>>>>>> repo,
>> >>>>>>>>>>>>>>>> commit
>> >>>>>>>>>>>>>>>>> to
>> >>> https://github.com/apache/incubator-druid-website
>> >>>>>>>>>>>>>>>>> 3. Apache Druid website will be served from the
>> >>>>>>> content in
>> >>>>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website
>> >>>>>>>> (asf-site
>> >>>>>>>>>>>>> branch)
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Hosting and SEO
>> >>>>>>>>>>>>>>>>> ================
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> The Apache site will be hosted at
>> >> druid.apache.org
>> >>> on
>> >>>>>>>> Apache
>> >>>>>>>>>>>>>>>>> infrastructure:
>> >>
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> To preserve our search rankings, we can setup 301
>> >>>>>>> redirects
>> >>>>>>>>>>> from
>> >>>>>>>>>>>>> the
>> >>>>>>>>>>>>>>> old
>> >>>>>>>>>>>>>>>>> druid.io site to the corresponding pages on the
>> >>>>>>>>>>> druid.apache.org
>> >>>>>>>>>>>>>>> site. (
>> >>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>> >>>>>>>>>>>>>>>> )
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> However, Github pages (which currently hosts the
>> >>>>>>> druid.io
>> >>>>>>>>>>> site)
>> >>>>>>>>>>>>> does
>> >>>>>>>>>>>>>>> not
>> >>>>>>>>>>>>>>>>> support 301 redirects, so we propose the
>> >> following:
>> >>>>>>>>>>>>>>>>> - Setup a new Nginx server that will perform 301
>> >>>>>>> redirects
>> >>>>>>>> to
>> >>>>>>>>>>>>>>>>> druid.apache.org for the druid.io. Imply can host
>> >>>>>>> this if
>> >>>>>>>>>>>> needed.
>> >>>>>>>>>>>>>>>>> - Update the druid.io DNS entry to point to this
>> >>> new
>> >>>>>>> Nginx
>> >>>>>>>>>>>> server
>> >>>>>>>>>>>>>>>>> - Shut down Github pages hosting for druid.io
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> In addition, we can also set canonical tags on our
>> >>>>>>> pages:
>> >>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Action items
>> >>>>>>>>>>>>>>>>> ===============
>> >>>>>>>>>>>>>>>>> - Setup a Jenkins bot that builds the Apache
>> >> website
>> >>>>>>>> content
>> >>>>>>>>>>> from
>> >>>>>>>>>>>>>>> source
>> >>>>>>>>>>>>>>>>> - Get the Apache website up
>> >>>>>>>>>>>>>>>>> - Setup Nginx redirect server for druid.io
>> >>>>>>>>>>>>>>>>> - Shutdown github pages and redirect DNS for
>> >>> druid.io
>> >>>>>>> to
>> >>>>>>>>>>> Nginx
>> >>>>>>>>>>>>>>> redirect
>> >>>>>>>>>>>>>>>>> server
>> >>>>>>>>>>>>>>>>> - Add canonical tags to pages
>> >>
>>
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Yep, any references to https://github.com/druid-io/druid-io.github.io
should be changed to https://github.com/apache/incubator-druid. Those have
all been updated now. I didn't see any references to
https://github.com/druid-io/druid -- I think we got them all in a previous
pass.

There are still some lingering references to separate, but affiliated
projects like https://github.com/druid-io/pydruid. IMO, it makes sense to
leave them there for now, and incorporate them as subprojects of Druid once
Druid is a top level project.

On Wed, Jun 12, 2019 at 12:18 PM Julian Hyde <jh...@gmail.com> wrote:

> Looks marvelous! Thanks for making it happen.
>
> I noticed at least one reference to https://github.com/druid-io on the
> site. Should be changed to https://github.com/apache/incubator-druid?
>
> > On Jun 11, 2019, at 9:44 PM, Gian Merlino <gi...@apache.org> wrote:
> >
> > This is now done: druid.io is redirecting to druid.apache.org!!
> >
> > Next, we'll add the stuff required by
> > https://whimsy.apache.org/pods/project/druid. Then, we should be good
> to go
> > on the website migration. (Behind the scenes, Vadim Ogievetsky has been
> > helping tons with this -- thanks a lot!)
> >
> >> On Mon, Jun 10, 2019 at 9:00 AM David Lim <da...@apache.org> wrote:
> >>
> >> No objections from me - thank you for testing this out.
> >>
> >>>> On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org> wrote:
> >>>
> >>> It looks like Google has picked up the 301 and [druid use cases] #1
> >> result
> >>> is https://druid.apache.org/use-cases now. For [what is druid used
> for]
> >>> it's not #4 instead of #2. I think this is the best we are likely to
> >> get. I
> >>> am ready to flip the switch if there aren't any objections.
> >>>
> >>> On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
> >>>
> >>>> Another update: as of
> >>>> https://github.com/apache/incubator-druid-website-src/pull/1 and
> >>>> https://github.com/apache/incubator-druid-website/pull/7, the
> >>>> https://druid.apache.org/ site is now serving almost all pages from
> >>>> druid.io, except:
> >>>>
> >>>> - the index page (it still has a placeholder until we flip the switch)
> >>>> - the download page (it has a differently-designed download page:
> >> compare
> >>>> http://druid.io/downloads.html with
> >>> http://druid.apache.org/downloads.html
> >>>> - any docs older than 0.13.0 (they aren't Apache releases)
> >>>>
> >>>> If you navigate to https://druid.apache.org/ + any other path from
> >>>> druid.io, you should see the page.
> >>>>
> >>>> I'm hoping to confirm that search engines pick up the 301 for
> >>>> http://druid.io/use-cases before flipping the switch. Hopefully that
> >>>> doesn't take much longer. If it does we should talk about how we want
> >> to
> >>>> proceed.
> >>>>
> >>>>> On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:
> >>>>>
> >>>>> An update: we do have a redirect server set up on druid.io now: note
> >>>>> that http://druid.io/community/ and http://druid.io/use-cases both
> >>>>> redirect to https://druid.apache.org. I just set up the latter
> >> redirect
> >>>>> (on /use-cases) as part of 'test this first on a single page'. All
> >> other
> >>>>> druid.io URLs are still being hosted using the content from GitHub
> >>> pages
> >>>>> at https://github.com/druid-io/druid-io.github.io.
> >>>>>
> >>>>> Search engine watch: currently, http://druid.io is the #1 link for
> >>>>> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
> >>> looking
> >>>>> infobox on Google & Bing). For [what is druid used for], it's #2 on
> >>> Google,
> >>>>> and not ranked on the first page on Bing & DDG. Will monitor this
> over
> >>> the
> >>>>> next few days.
> >>>>>
> >>>>>> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org>
> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> It sounds like we will need a redirect server that issues 301s from
> >>> each
> >>>>>> druid.io page to the corresponding druid.apache.org page. Charles
> >> and
> >>> I
> >>>>>> spoke offline and thought that something like Jon's original
> proposal
> >>> is
> >>>>>> the best way to go. I am going to suggest we get started on this, as
> >>> it's
> >>>>>> the last major piece of infra to move to ASF.
> >>>>>>
> >>>>>> 1) Set up a redirect server to perform 301 redirects to
> >>> druid.apache.org
> >>>>>> 2) Post all druid.io content on druid.apache.org
> >>>>>> 3) Update druid.io DNS to point to the redirect server
> >>>>>> 4) Shut down GitHub pages hosting for druid.io
> >>>>>>
> >>>>>> Steps (2) and (3) should be done as close in time as possible so
> >> there
> >>>>>> is no confusion as to which version of the pages is canonical.
> >>>>>>
> >>>>>> For the redirect server, two viable options are an nginx server or
> an
> >>> S3
> >>>>>> webpage redirect (
> >>
> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
> >>> ).
> >>>>>> Just like we did with the HTML-level redirect, I suggest we test
> this
> >>> first
> >>>>>> on a single page. We can do that by having the redirect server
> >>> initially
> >>>>>> start off by hosting all druid.io content (so it's
> indistinguishable
> >>>>>> from the GitHub-pages-based site) except for a single page, which it
> >>>>>> redirects using HTTP 301 to druid.apache.org.
> >>>>>>
> >>>>>> I'm planning to start looking into this, so anyone around please
> >> speak
> >>>>>> up if you have any advice or alternative approaches to suggest.
> >>>>>>
> >>>>>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
> >>> wrote:
> >>>>>>
> >>>>>>> Thanks for checking the SEO state, that's somewhat disappointing.
> >>>>>>>
> >>>>>>> For Bing, it sounds like they really want you to use 301s (
> >>>>>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a
> ):
> >>>>>>>
> >>>>>>>> Bing prefers you use a 301 permanent redirect when moving content,
> >>>>>>> should
> >>>>>>> the move be permanent.  If the move is temporary, then a 302
> >> temporary
> >>>>>>> redirect will work fine.  Do not use the rel=canonical tag in place
> >>> of a
> >>>>>>> proper redirect.
> >>>>>>>
> >>>>>>> I wasn't able to find similar guidance re: this issue for
> >> DuckDuckGo.
> >>>>>>>
> >>>>>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
> >>> wrote:
> >>>>>>>
> >>>>>>>> Another update: SEO is not looking great after another day passed.
> >>>>>>> For a
> >>>>>>>> search for "druid community", both http://druid.io/community and
> >>>>>>>> https://druid.apache.org/community/ have dropped off the front
> >> page
> >>>>>>> of
> >>>>>>>> Bing
> >>>>>>>> completely. On Google, the legacy version is gone (as expected)
> >> but
> >>>>>>> the
> >>>>>>>> Apache version has dropped to the #3 spot (down from #2 yesterday;
> >>>>>>> and down
> >>>>>>>> from where the legacy page was pre-migration, which was #1).
> >>>>>>>>
> >>>>>>>> I think this means we do need to try to get 301s figured out.
> >>>>>>>>
> >>>>>>>> On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
> >>> wrote:
> >>>>>>>>
> >>>>>>>>> Google has picked up the new URL as of today but Bing hasn't.
> >>>>>>> Neither has
> >>>>>>>>> DuckDuckGo for that matter.
> >>>>>>>>>
> >>>>>>>>> Currently, Google is showing
> >> https://druid.apache.org/community/
> >>>>>>> in the
> >>>>>>>>> #2 spot and Bing/DDG are showing http://druid.io/community in
> >> the
> >>>>>>> top
> >>>>>>>>> spot. Ominously, the latter two _have_ picked up a page title
> >>>>>>> change to
> >>>>>>>>> "Redirecting..."
> >>>>>>>>>
> >>>>>>>>> On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> An update: this is done now since a couple of days ago, but
> >>> Google
> >>>>>>> and
> >>>>>>>>>> Bing are still showing http://druid.io/community for a search
> >>> for
> >>>>>>>> "druid
> >>>>>>>>>> community" or even "apache druid community":
> >>>>>>>>>>
> >>>>>>>>>> - https://www.google.com/search?q=druid+community
> >>>>>>>>>> - https://www.bing.com/search?q=druid+community
> >>>>>>>>>>
> >>>>>>>>>> I suggest we keep an eye on the search engines and make sure
> >> they
> >>>>>>> can
> >>>>>>>>>> figure out that the site has changed (I'm not sure how often
> >> they
> >>>>>>>> crawl).
> >>>>>>>>>> If they can then it would make sense to me to move forward with
> >>>>>>>> migrating
> >>>>>>>>>> the entire web site.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <
> >> jonwei@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Correction: Xavier was suggesting we use
> >>
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> >>>>>>>>>>> ,
> >>>>>>>>>>> the existing redirect system used by the Druid website.
> >>>>>>>>>>>
> >>>>>>>>>>> I've opened PRs to do the community page migration test:
> >>>>>>>>>>> https://github.com/apache/incubator-druid-website/pull/3
> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/pull/591
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gian@apache.org
> >>>
> >>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> That sounds good to me. I would also consider adding
> >> canonical
> >>>>>>> tags
> >>>>>>>> to
> >>>>>>>>>>> all
> >>>>>>>>>>>> druid.apache.org pages so we don't have
> >>>>>>> druid.incubator.apache.org
> >>>>>>>> and
> >>>>>>>>>>>> druid.apache.org both floating around (not to mention
> >>>>>>> http/https
> >>>>>>>>>>> version
> >>>>>>>>>>>> of
> >>>>>>>>>>>> both).
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
> >>> jonwei@apache.org
> >>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> For redirects, Xavier has suggested using
> >>> https://help.github.com/en/articles/redirects-on-github-pages
> >>>>>>> to
> >>>>>>>>>>>> redirect
> >>>>>>>>>>>>> to druid.apache.org as a way to transition before the
> >>> domain
> >>>>>>>>>>> migration
> >>>>>>>>>>>>> occurs, and believes that it would have the same SEO
> >> effects
> >>>>>>> as a
> >>>>>>>> 301
> >>>>>>>>>>>>> redirect after the new pages are indexed.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think we could try migrating the current Community page
> >> to
> >>>>>>>>>>>>> druid.apache.org with Github redirects and canonical
> >> links
> >>>>>>>> pointing
> >>>>>>>>>>> to
> >>>>>>>>>>>> the
> >>>>>>>>>>>>> https://druid.apache.org version. If that goes well, we
> >>> could
> >>>>>>>>>>> continue
> >>>>>>>>>>>>> migrating more pages.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What are the community's thoughts on that?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Jon
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
> >>> gian@apache.org
> >>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> OpenOffice and Groovy both chose to sort of "meld" their
> >>>>>>> classic
> >>>>>>>>>>> and
> >>>>>>>>>>>>> Apache
> >>>>>>>>>>>>>> sites together: https://www.openoffice.org/,
> >>>>>>>>>>> http://groovy-lang.org/.
> >>>>>>>>>>>>> Note
> >>>>>>>>>>>>>> how when you click around, you get shuttled between the
> >>>>>>> classic
> >>>>>>>>>>> domain
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>> the Apache domain. Some pages are available on both
> >> sites,
> >>>>>>> like
> >>>>>>>>>>>>>> http://groovy-lang.org/download.html and
> >>>>>>>>>>>>>> https://groovy.apache.org/download.html (which don't
> >> use
> >>>>>>>> canonical
> >>>>>>>>>>>> link
> >>>>>>>>>>>>>> tags -- does not seem like a good example to follow!).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> NetBeans (still incubating) also has a "melded" site at
> >>>>>>>>>>>>>> https://netbeans.org/ but doesn't seem to consider
> >> itself
> >>>>>>> done
> >>>>>>>>>>> yet.
> >>>>>>>>>>>> They
> >>>>>>>>>>>>>> are discussing plans on their lists & wiki to do
> >> redirects
> >>>>>>> from
> >>>>>>>>>>>>>> netbeans.org
> >>>>>>>>>>>>>> to netbeans.apache.org:
> >>
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> >>>>>>>>>>>>>> ,
> >>
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> >>>>>>>>>>>>>> .
> >>>>>>>>>>>>>> As of today the domain has been donated to ASF, but the
> >>>>>>> server is
> >>>>>>>>>>> still
> >>>>>>>>>>>>> run
> >>>>>>>>>>>>>> by Oracle, so the plan doesn't seem to be finished yet.
> >>>>>>> (WHOIS
> >>>>>>>> for
> >>>>>>>>>>>>>> netbeans.org shows ASF as the registrant; netbeans.org
> >>>>>>> resolves
> >>>>>>>> to
> >>>>>>>>>>>>>> lb-netbeans-cms-adc.oracle.com.)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The melded sites don't really seem better to me than
> >>>>>>> redirecting
> >>>>>>>>>>> all
> >>>>>>>>>>>> urls
> >>>>>>>>>>>>>> on the domain. I guess it depends on if we want to keep
> >>>>>>> druid.io
> >>>>>>>>>>> as
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> official domain forever, or if we think
> >> druid.apache.org
> >>> is
> >>>>>>>>>>> cooler. I
> >>>>>>>>>>>>>> definitely think druid.apache.org is cooler so my vote
> >> is
> >>>>>>> there
> >>>>>>>>>>> :).
> >>>>>>>>>>>> It's
> >>>>>>>>>>>>>> also nice that it supports https. (druid.io does not
> >>> today,
> >>>>>>>> since
> >>>>>>>>>>> it's
> >>>>>>>>>>>>> on
> >>>>>>>>>>>>>> GitHub pages, which doesn't support https for custom
> >>>>>>> domains.)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> >>>>>>>>>>>>>> <ch...@snap.com.invalid> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Are there other projects who have transitioned an
> >>>>>>> independently
> >>>>>>>>>>>>>> successful
> >>>>>>>>>>>>>>> domain name to an apache one?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:13 PM David Lim <
> >>>>>>> davidlim@apache.org>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Who has control over the druid.io domain? Charles
> >>>>>>> would that
> >>>>>>>>>>> be
> >>>>>>>>>>>> you?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> We'd need support from them for the DNS redirect.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
> >>>>>>>> jonwei@apache.org
> >>>>>>>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We still need to complete the website migration to
> >>>>>>> Apache
> >>>>>>>>>>>>>>> infrastructure.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'll propose the following plan:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Proposed Apache Druid website migration plan
> >>>>>>>>>>>>>>>>> ========================================
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> These links have some previous discussion on the
> >>>>>>> website
> >>>>>>>>>>>> migration:
> >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> From the discussions above, the recommendation is
> >> to
> >>>>>>> have 2
> >>>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>> repos
> >>>>>>>>>>>>>>>>> for the website: one for source and another for
> >>> built
> >>>>>>>> content
> >>>>>>>>>>>> that
> >>>>>>>>>>>>>> will
> >>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> served.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Generating site files
> >>>>>>>>>>>>>>>>> =======================
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The Apache site update process will be similar to
> >>> our
> >>>>>>>> current
> >>>>>>>>>>>>>> process.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Current process:
> >>>>>>>>>>>>>>>>> 1. Push changes to
> >>> https://github.com/druid-io/druid-io.github.io/tree/src
> >>>>>>>>>>>>>>>>> 2. metamx bot picks up changes, builds, and
> >> commits
> >>> to
> >>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master
> >>>>>>>>>>>>>>>>> 3.
> >>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master is
> >>>>>>>>>>>>>>> served
> >>>>>>>>>>>>>>>> by
> >>>>>>>>>>>>>>>>> github pages
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Apache process:
> >>>>>>>>>>>>>>>>> 1. Push changes to
> >>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website-src
> >>>>>>>>>>>>>>>>> 2. Jenkins bot from Apache will build the website
> >>> from
> >>>>>>>> source
> >>>>>>>>>>>> repo,
> >>>>>>>>>>>>>>>> commit
> >>>>>>>>>>>>>>>>> to
> >>> https://github.com/apache/incubator-druid-website
> >>>>>>>>>>>>>>>>> 3. Apache Druid website will be served from the
> >>>>>>> content in
> >>>>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website
> >>>>>>>> (asf-site
> >>>>>>>>>>>>> branch)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hosting and SEO
> >>>>>>>>>>>>>>>>> ================
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The Apache site will be hosted at
> >> druid.apache.org
> >>> on
> >>>>>>>> Apache
> >>>>>>>>>>>>>>>>> infrastructure:
> >>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> To preserve our search rankings, we can setup 301
> >>>>>>> redirects
> >>>>>>>>>>> from
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> old
> >>>>>>>>>>>>>>>>> druid.io site to the corresponding pages on the
> >>>>>>>>>>> druid.apache.org
> >>>>>>>>>>>>>>> site. (
> >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> >>>>>>>>>>>>>>>> )
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> However, Github pages (which currently hosts the
> >>>>>>> druid.io
> >>>>>>>>>>> site)
> >>>>>>>>>>>>> does
> >>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>> support 301 redirects, so we propose the
> >> following:
> >>>>>>>>>>>>>>>>> - Setup a new Nginx server that will perform 301
> >>>>>>> redirects
> >>>>>>>> to
> >>>>>>>>>>>>>>>>> druid.apache.org for the druid.io. Imply can host
> >>>>>>> this if
> >>>>>>>>>>>> needed.
> >>>>>>>>>>>>>>>>> - Update the druid.io DNS entry to point to this
> >>> new
> >>>>>>> Nginx
> >>>>>>>>>>>> server
> >>>>>>>>>>>>>>>>> - Shut down Github pages hosting for druid.io
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> In addition, we can also set canonical tags on our
> >>>>>>> pages:
> >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Action items
> >>>>>>>>>>>>>>>>> ===============
> >>>>>>>>>>>>>>>>> - Setup a Jenkins bot that builds the Apache
> >> website
> >>>>>>>> content
> >>>>>>>>>>> from
> >>>>>>>>>>>>>>> source
> >>>>>>>>>>>>>>>>> - Get the Apache website up
> >>>>>>>>>>>>>>>>> - Setup Nginx redirect server for druid.io
> >>>>>>>>>>>>>>>>> - Shutdown github pages and redirect DNS for
> >>> druid.io
> >>>>>>> to
> >>>>>>>>>>> Nginx
> >>>>>>>>>>>>>>> redirect
> >>>>>>>>>>>>>>>>> server
> >>>>>>>>>>>>>>>>> - Add canonical tags to pages
> >>
>

Re: Proposed website migration plan

Posted by Julian Hyde <jh...@gmail.com>.
Looks marvelous! Thanks for making it happen. 

I noticed at least one reference to https://github.com/druid-io on the site. Should be changed to https://github.com/apache/incubator-druid? 

> On Jun 11, 2019, at 9:44 PM, Gian Merlino <gi...@apache.org> wrote:
> 
> This is now done: druid.io is redirecting to druid.apache.org!!
> 
> Next, we'll add the stuff required by
> https://whimsy.apache.org/pods/project/druid. Then, we should be good to go
> on the website migration. (Behind the scenes, Vadim Ogievetsky has been
> helping tons with this -- thanks a lot!)
> 
>> On Mon, Jun 10, 2019 at 9:00 AM David Lim <da...@apache.org> wrote:
>> 
>> No objections from me - thank you for testing this out.
>> 
>>>> On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org> wrote:
>>> 
>>> It looks like Google has picked up the 301 and [druid use cases] #1
>> result
>>> is https://druid.apache.org/use-cases now. For [what is druid used for]
>>> it's not #4 instead of #2. I think this is the best we are likely to
>> get. I
>>> am ready to flip the switch if there aren't any objections.
>>> 
>>> On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
>>> 
>>>> Another update: as of
>>>> https://github.com/apache/incubator-druid-website-src/pull/1 and
>>>> https://github.com/apache/incubator-druid-website/pull/7, the
>>>> https://druid.apache.org/ site is now serving almost all pages from
>>>> druid.io, except:
>>>> 
>>>> - the index page (it still has a placeholder until we flip the switch)
>>>> - the download page (it has a differently-designed download page:
>> compare
>>>> http://druid.io/downloads.html with
>>> http://druid.apache.org/downloads.html
>>>> - any docs older than 0.13.0 (they aren't Apache releases)
>>>> 
>>>> If you navigate to https://druid.apache.org/ + any other path from
>>>> druid.io, you should see the page.
>>>> 
>>>> I'm hoping to confirm that search engines pick up the 301 for
>>>> http://druid.io/use-cases before flipping the switch. Hopefully that
>>>> doesn't take much longer. If it does we should talk about how we want
>> to
>>>> proceed.
>>>> 
>>>>> On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:
>>>>> 
>>>>> An update: we do have a redirect server set up on druid.io now: note
>>>>> that http://druid.io/community/ and http://druid.io/use-cases both
>>>>> redirect to https://druid.apache.org. I just set up the latter
>> redirect
>>>>> (on /use-cases) as part of 'test this first on a single page'. All
>> other
>>>>> druid.io URLs are still being hosted using the content from GitHub
>>> pages
>>>>> at https://github.com/druid-io/druid-io.github.io.
>>>>> 
>>>>> Search engine watch: currently, http://druid.io is the #1 link for
>>>>> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
>>> looking
>>>>> infobox on Google & Bing). For [what is druid used for], it's #2 on
>>> Google,
>>>>> and not ranked on the first page on Bing & DDG. Will monitor this over
>>> the
>>>>> next few days.
>>>>> 
>>>>>> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> It sounds like we will need a redirect server that issues 301s from
>>> each
>>>>>> druid.io page to the corresponding druid.apache.org page. Charles
>> and
>>> I
>>>>>> spoke offline and thought that something like Jon's original proposal
>>> is
>>>>>> the best way to go. I am going to suggest we get started on this, as
>>> it's
>>>>>> the last major piece of infra to move to ASF.
>>>>>> 
>>>>>> 1) Set up a redirect server to perform 301 redirects to
>>> druid.apache.org
>>>>>> 2) Post all druid.io content on druid.apache.org
>>>>>> 3) Update druid.io DNS to point to the redirect server
>>>>>> 4) Shut down GitHub pages hosting for druid.io
>>>>>> 
>>>>>> Steps (2) and (3) should be done as close in time as possible so
>> there
>>>>>> is no confusion as to which version of the pages is canonical.
>>>>>> 
>>>>>> For the redirect server, two viable options are an nginx server or an
>>> S3
>>>>>> webpage redirect (
>> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
>>> ).
>>>>>> Just like we did with the HTML-level redirect, I suggest we test this
>>> first
>>>>>> on a single page. We can do that by having the redirect server
>>> initially
>>>>>> start off by hosting all druid.io content (so it's indistinguishable
>>>>>> from the GitHub-pages-based site) except for a single page, which it
>>>>>> redirects using HTTP 301 to druid.apache.org.
>>>>>> 
>>>>>> I'm planning to start looking into this, so anyone around please
>> speak
>>>>>> up if you have any advice or alternative approaches to suggest.
>>>>>> 
>>>>>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
>>> wrote:
>>>>>> 
>>>>>>> Thanks for checking the SEO state, that's somewhat disappointing.
>>>>>>> 
>>>>>>> For Bing, it sounds like they really want you to use 301s (
>>>>>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>>>>>>> 
>>>>>>>> Bing prefers you use a 301 permanent redirect when moving content,
>>>>>>> should
>>>>>>> the move be permanent.  If the move is temporary, then a 302
>> temporary
>>>>>>> redirect will work fine.  Do not use the rel=canonical tag in place
>>> of a
>>>>>>> proper redirect.
>>>>>>> 
>>>>>>> I wasn't able to find similar guidance re: this issue for
>> DuckDuckGo.
>>>>>>> 
>>>>>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
>>> wrote:
>>>>>>> 
>>>>>>>> Another update: SEO is not looking great after another day passed.
>>>>>>> For a
>>>>>>>> search for "druid community", both http://druid.io/community and
>>>>>>>> https://druid.apache.org/community/ have dropped off the front
>> page
>>>>>>> of
>>>>>>>> Bing
>>>>>>>> completely. On Google, the legacy version is gone (as expected)
>> but
>>>>>>> the
>>>>>>>> Apache version has dropped to the #3 spot (down from #2 yesterday;
>>>>>>> and down
>>>>>>>> from where the legacy page was pre-migration, which was #1).
>>>>>>>> 
>>>>>>>> I think this means we do need to try to get 301s figured out.
>>>>>>>> 
>>>>>>>> On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
>>> wrote:
>>>>>>>> 
>>>>>>>>> Google has picked up the new URL as of today but Bing hasn't.
>>>>>>> Neither has
>>>>>>>>> DuckDuckGo for that matter.
>>>>>>>>> 
>>>>>>>>> Currently, Google is showing
>> https://druid.apache.org/community/
>>>>>>> in the
>>>>>>>>> #2 spot and Bing/DDG are showing http://druid.io/community in
>> the
>>>>>>> top
>>>>>>>>> spot. Ominously, the latter two _have_ picked up a page title
>>>>>>> change to
>>>>>>>>> "Redirecting..."
>>>>>>>>> 
>>>>>>>>> On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> An update: this is done now since a couple of days ago, but
>>> Google
>>>>>>> and
>>>>>>>>>> Bing are still showing http://druid.io/community for a search
>>> for
>>>>>>>> "druid
>>>>>>>>>> community" or even "apache druid community":
>>>>>>>>>> 
>>>>>>>>>> - https://www.google.com/search?q=druid+community
>>>>>>>>>> - https://www.bing.com/search?q=druid+community
>>>>>>>>>> 
>>>>>>>>>> I suggest we keep an eye on the search engines and make sure
>> they
>>>>>>> can
>>>>>>>>>> figure out that the site has changed (I'm not sure how often
>> they
>>>>>>>> crawl).
>>>>>>>>>> If they can then it would make sense to me to move forward with
>>>>>>>> migrating
>>>>>>>>>> the entire web site.
>>>>>>>>>> 
>>>>>>>>>> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <
>> jonwei@apache.org>
>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Correction: Xavier was suggesting we use
>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>>>>>>>>>>> ,
>>>>>>>>>>> the existing redirect system used by the Druid website.
>>>>>>>>>>> 
>>>>>>>>>>> I've opened PRs to do the community page migration test:
>>>>>>>>>>> https://github.com/apache/incubator-druid-website/pull/3
>>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/pull/591
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gian@apache.org
>>> 
>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> That sounds good to me. I would also consider adding
>> canonical
>>>>>>> tags
>>>>>>>> to
>>>>>>>>>>> all
>>>>>>>>>>>> druid.apache.org pages so we don't have
>>>>>>> druid.incubator.apache.org
>>>>>>>> and
>>>>>>>>>>>> druid.apache.org both floating around (not to mention
>>>>>>> http/https
>>>>>>>>>>> version
>>>>>>>>>>>> of
>>>>>>>>>>>> both).
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
>>> jonwei@apache.org
>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> For redirects, Xavier has suggested using
>>> https://help.github.com/en/articles/redirects-on-github-pages
>>>>>>> to
>>>>>>>>>>>> redirect
>>>>>>>>>>>>> to druid.apache.org as a way to transition before the
>>> domain
>>>>>>>>>>> migration
>>>>>>>>>>>>> occurs, and believes that it would have the same SEO
>> effects
>>>>>>> as a
>>>>>>>> 301
>>>>>>>>>>>>> redirect after the new pages are indexed.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I think we could try migrating the current Community page
>> to
>>>>>>>>>>>>> druid.apache.org with Github redirects and canonical
>> links
>>>>>>>> pointing
>>>>>>>>>>> to
>>>>>>>>>>>> the
>>>>>>>>>>>>> https://druid.apache.org version. If that goes well, we
>>> could
>>>>>>>>>>> continue
>>>>>>>>>>>>> migrating more pages.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> What are the community's thoughts on that?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jon
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
>>> gian@apache.org
>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> OpenOffice and Groovy both chose to sort of "meld" their
>>>>>>> classic
>>>>>>>>>>> and
>>>>>>>>>>>>> Apache
>>>>>>>>>>>>>> sites together: https://www.openoffice.org/,
>>>>>>>>>>> http://groovy-lang.org/.
>>>>>>>>>>>>> Note
>>>>>>>>>>>>>> how when you click around, you get shuttled between the
>>>>>>> classic
>>>>>>>>>>> domain
>>>>>>>>>>>>> and
>>>>>>>>>>>>>> the Apache domain. Some pages are available on both
>> sites,
>>>>>>> like
>>>>>>>>>>>>>> http://groovy-lang.org/download.html and
>>>>>>>>>>>>>> https://groovy.apache.org/download.html (which don't
>> use
>>>>>>>> canonical
>>>>>>>>>>>> link
>>>>>>>>>>>>>> tags -- does not seem like a good example to follow!).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> NetBeans (still incubating) also has a "melded" site at
>>>>>>>>>>>>>> https://netbeans.org/ but doesn't seem to consider
>> itself
>>>>>>> done
>>>>>>>>>>> yet.
>>>>>>>>>>>> They
>>>>>>>>>>>>>> are discussing plans on their lists & wiki to do
>> redirects
>>>>>>> from
>>>>>>>>>>>>>> netbeans.org
>>>>>>>>>>>>>> to netbeans.apache.org:
>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>>>>>>>>>>>>>> ,
>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>>>>>>>>>>>>>> .
>>>>>>>>>>>>>> As of today the domain has been donated to ASF, but the
>>>>>>> server is
>>>>>>>>>>> still
>>>>>>>>>>>>> run
>>>>>>>>>>>>>> by Oracle, so the plan doesn't seem to be finished yet.
>>>>>>> (WHOIS
>>>>>>>> for
>>>>>>>>>>>>>> netbeans.org shows ASF as the registrant; netbeans.org
>>>>>>> resolves
>>>>>>>> to
>>>>>>>>>>>>>> lb-netbeans-cms-adc.oracle.com.)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The melded sites don't really seem better to me than
>>>>>>> redirecting
>>>>>>>>>>> all
>>>>>>>>>>>> urls
>>>>>>>>>>>>>> on the domain. I guess it depends on if we want to keep
>>>>>>> druid.io
>>>>>>>>>>> as
>>>>>>>>>>>> the
>>>>>>>>>>>>>> official domain forever, or if we think
>> druid.apache.org
>>> is
>>>>>>>>>>> cooler. I
>>>>>>>>>>>>>> definitely think druid.apache.org is cooler so my vote
>> is
>>>>>>> there
>>>>>>>>>>> :).
>>>>>>>>>>>> It's
>>>>>>>>>>>>>> also nice that it supports https. (druid.io does not
>>> today,
>>>>>>>> since
>>>>>>>>>>> it's
>>>>>>>>>>>>> on
>>>>>>>>>>>>>> GitHub pages, which doesn't support https for custom
>>>>>>> domains.)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>>>>>>>>>>>>>> <ch...@snap.com.invalid> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Are there other projects who have transitioned an
>>>>>>> independently
>>>>>>>>>>>>>> successful
>>>>>>>>>>>>>>> domain name to an apache one?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>>>>>>> davidlim@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Who has control over the druid.io domain? Charles
>>>>>>> would that
>>>>>>>>>>> be
>>>>>>>>>>>> you?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> We'd need support from them for the DNS redirect.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>>>>>>>> jonwei@apache.org
>>>>>>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> We still need to complete the website migration to
>>>>>>> Apache
>>>>>>>>>>>>>>> infrastructure.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I'll propose the following plan:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Proposed Apache Druid website migration plan
>>>>>>>>>>>>>>>>> ========================================
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> These links have some previous discussion on the
>>>>>>> website
>>>>>>>>>>>> migration:
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> From the discussions above, the recommendation is
>> to
>>>>>>> have 2
>>>>>>>>>>>>> separate
>>>>>>>>>>>>>>>> repos
>>>>>>>>>>>>>>>>> for the website: one for source and another for
>>> built
>>>>>>>> content
>>>>>>>>>>>> that
>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> served.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Generating site files
>>>>>>>>>>>>>>>>> =======================
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> The Apache site update process will be similar to
>>> our
>>>>>>>> current
>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Current process:
>>>>>>>>>>>>>>>>> 1. Push changes to
>>> https://github.com/druid-io/druid-io.github.io/tree/src
>>>>>>>>>>>>>>>>> 2. metamx bot picks up changes, builds, and
>> commits
>>> to
>>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master
>>>>>>>>>>>>>>>>> 3.
>>>>>>>>>>> https://github.com/druid-io/druid-io.github.io/tree/master is
>>>>>>>>>>>>>>> served
>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>> github pages
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Apache process:
>>>>>>>>>>>>>>>>> 1. Push changes to
>>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website-src
>>>>>>>>>>>>>>>>> 2. Jenkins bot from Apache will build the website
>>> from
>>>>>>>> source
>>>>>>>>>>>> repo,
>>>>>>>>>>>>>>>> commit
>>>>>>>>>>>>>>>>> to
>>> https://github.com/apache/incubator-druid-website
>>>>>>>>>>>>>>>>> 3. Apache Druid website will be served from the
>>>>>>> content in
>>>>>>>>>>>>>>>>> https://github.com/apache/incubator-druid-website
>>>>>>>> (asf-site
>>>>>>>>>>>>> branch)
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hosting and SEO
>>>>>>>>>>>>>>>>> ================
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> The Apache site will be hosted at
>> druid.apache.org
>>> on
>>>>>>>> Apache
>>>>>>>>>>>>>>>>> infrastructure:
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> To preserve our search rankings, we can setup 301
>>>>>>> redirects
>>>>>>>>>>> from
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>> druid.io site to the corresponding pages on the
>>>>>>>>>>> druid.apache.org
>>>>>>>>>>>>>>> site. (
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> However, Github pages (which currently hosts the
>>>>>>> druid.io
>>>>>>>>>>> site)
>>>>>>>>>>>>> does
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> support 301 redirects, so we propose the
>> following:
>>>>>>>>>>>>>>>>> - Setup a new Nginx server that will perform 301
>>>>>>> redirects
>>>>>>>> to
>>>>>>>>>>>>>>>>> druid.apache.org for the druid.io. Imply can host
>>>>>>> this if
>>>>>>>>>>>> needed.
>>>>>>>>>>>>>>>>> - Update the druid.io DNS entry to point to this
>>> new
>>>>>>> Nginx
>>>>>>>>>>>> server
>>>>>>>>>>>>>>>>> - Shut down Github pages hosting for druid.io
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> In addition, we can also set canonical tags on our
>>>>>>> pages:
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Action items
>>>>>>>>>>>>>>>>> ===============
>>>>>>>>>>>>>>>>> - Setup a Jenkins bot that builds the Apache
>> website
>>>>>>>> content
>>>>>>>>>>> from
>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>> - Get the Apache website up
>>>>>>>>>>>>>>>>> - Setup Nginx redirect server for druid.io
>>>>>>>>>>>>>>>>> - Shutdown github pages and redirect DNS for
>>> druid.io
>>>>>>> to
>>>>>>>>>>> Nginx
>>>>>>>>>>>>>>> redirect
>>>>>>>>>>>>>>>>> server
>>>>>>>>>>>>>>>>> - Add canonical tags to pages
>> 

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
This is now done: druid.io is redirecting to druid.apache.org!!

Next, we'll add the stuff required by
https://whimsy.apache.org/pods/project/druid. Then, we should be good to go
on the website migration. (Behind the scenes, Vadim Ogievetsky has been
helping tons with this -- thanks a lot!)

On Mon, Jun 10, 2019 at 9:00 AM David Lim <da...@apache.org> wrote:

> No objections from me - thank you for testing this out.
>
> On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org> wrote:
>
> > It looks like Google has picked up the 301 and [druid use cases] #1
> result
> > is https://druid.apache.org/use-cases now. For [what is druid used for]
> > it's not #4 instead of #2. I think this is the best we are likely to
> get. I
> > am ready to flip the switch if there aren't any objections.
> >
> > On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
> >
> > > Another update: as of
> > > https://github.com/apache/incubator-druid-website-src/pull/1 and
> > > https://github.com/apache/incubator-druid-website/pull/7, the
> > > https://druid.apache.org/ site is now serving almost all pages from
> > > druid.io, except:
> > >
> > > - the index page (it still has a placeholder until we flip the switch)
> > > - the download page (it has a differently-designed download page:
> compare
> > > http://druid.io/downloads.html with
> > http://druid.apache.org/downloads.html
> > > - any docs older than 0.13.0 (they aren't Apache releases)
> > >
> > > If you navigate to https://druid.apache.org/ + any other path from
> > > druid.io, you should see the page.
> > >
> > > I'm hoping to confirm that search engines pick up the 301 for
> > > http://druid.io/use-cases before flipping the switch. Hopefully that
> > > doesn't take much longer. If it does we should talk about how we want
> to
> > > proceed.
> > >
> > > On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:
> > >
> > >> An update: we do have a redirect server set up on druid.io now: note
> > >> that http://druid.io/community/ and http://druid.io/use-cases both
> > >> redirect to https://druid.apache.org. I just set up the latter
> redirect
> > >> (on /use-cases) as part of 'test this first on a single page'. All
> other
> > >> druid.io URLs are still being hosted using the content from GitHub
> > pages
> > >> at https://github.com/druid-io/druid-io.github.io.
> > >>
> > >> Search engine watch: currently, http://druid.io is the #1 link for
> > >> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
> > looking
> > >> infobox on Google & Bing). For [what is druid used for], it's #2 on
> > Google,
> > >> and not ranked on the first page on Bing & DDG. Will monitor this over
> > the
> > >> next few days.
> > >>
> > >> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> It sounds like we will need a redirect server that issues 301s from
> > each
> > >>> druid.io page to the corresponding druid.apache.org page. Charles
> and
> > I
> > >>> spoke offline and thought that something like Jon's original proposal
> > is
> > >>> the best way to go. I am going to suggest we get started on this, as
> > it's
> > >>> the last major piece of infra to move to ASF.
> > >>>
> > >>> 1) Set up a redirect server to perform 301 redirects to
> > druid.apache.org
> > >>> 2) Post all druid.io content on druid.apache.org
> > >>> 3) Update druid.io DNS to point to the redirect server
> > >>> 4) Shut down GitHub pages hosting for druid.io
> > >>>
> > >>> Steps (2) and (3) should be done as close in time as possible so
> there
> > >>> is no confusion as to which version of the pages is canonical.
> > >>>
> > >>> For the redirect server, two viable options are an nginx server or an
> > S3
> > >>> webpage redirect (
> > >>>
> >
> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
> > ).
> > >>> Just like we did with the HTML-level redirect, I suggest we test this
> > first
> > >>> on a single page. We can do that by having the redirect server
> > initially
> > >>> start off by hosting all druid.io content (so it's indistinguishable
> > >>> from the GitHub-pages-based site) except for a single page, which it
> > >>> redirects using HTTP 301 to druid.apache.org.
> > >>>
> > >>> I'm planning to start looking into this, so anyone around please
> speak
> > >>> up if you have any advice or alternative approaches to suggest.
> > >>>
> > >>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
> > wrote:
> > >>>
> > >>>> Thanks for checking the SEO state, that's somewhat disappointing.
> > >>>>
> > >>>> For Bing, it sounds like they really want you to use 301s (
> > >>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
> > >>>>
> > >>>> > Bing prefers you use a 301 permanent redirect when moving content,
> > >>>> should
> > >>>> the move be permanent.  If the move is temporary, then a 302
> temporary
> > >>>> redirect will work fine.  Do not use the rel=canonical tag in place
> > of a
> > >>>> proper redirect.
> > >>>>
> > >>>> I wasn't able to find similar guidance re: this issue for
> DuckDuckGo.
> > >>>>
> > >>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
> > wrote:
> > >>>>
> > >>>> > Another update: SEO is not looking great after another day passed.
> > >>>> For a
> > >>>> > search for "druid community", both http://druid.io/community and
> > >>>> > https://druid.apache.org/community/ have dropped off the front
> page
> > >>>> of
> > >>>> > Bing
> > >>>> > completely. On Google, the legacy version is gone (as expected)
> but
> > >>>> the
> > >>>> > Apache version has dropped to the #3 spot (down from #2 yesterday;
> > >>>> and down
> > >>>> > from where the legacy page was pre-migration, which was #1).
> > >>>> >
> > >>>> > I think this means we do need to try to get 301s figured out.
> > >>>> >
> > >>>> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
> > wrote:
> > >>>> >
> > >>>> > > Google has picked up the new URL as of today but Bing hasn't.
> > >>>> Neither has
> > >>>> > > DuckDuckGo for that matter.
> > >>>> > >
> > >>>> > > Currently, Google is showing
> https://druid.apache.org/community/
> > >>>> in the
> > >>>> > > #2 spot and Bing/DDG are showing http://druid.io/community in
> the
> > >>>> top
> > >>>> > > spot. Ominously, the latter two _have_ picked up a page title
> > >>>> change to
> > >>>> > > "Redirecting..."
> > >>>> > >
> > >>>> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
> > >>>> wrote:
> > >>>> > >
> > >>>> > >> An update: this is done now since a couple of days ago, but
> > Google
> > >>>> and
> > >>>> > >> Bing are still showing http://druid.io/community for a search
> > for
> > >>>> > "druid
> > >>>> > >> community" or even "apache druid community":
> > >>>> > >>
> > >>>> > >> - https://www.google.com/search?q=druid+community
> > >>>> > >> - https://www.bing.com/search?q=druid+community
> > >>>> > >>
> > >>>> > >> I suggest we keep an eye on the search engines and make sure
> they
> > >>>> can
> > >>>> > >> figure out that the site has changed (I'm not sure how often
> they
> > >>>> > crawl).
> > >>>> > >> If they can then it would make sense to me to move forward with
> > >>>> > migrating
> > >>>> > >> the entire web site.
> > >>>> > >>
> > >>>> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <
> jonwei@apache.org>
> > >>>> wrote:
> > >>>> > >>
> > >>>> > >>> Correction: Xavier was suggesting we use
> > >>>> > >>>
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> > >>>> > >>> ,
> > >>>> > >>> the existing redirect system used by the Druid website.
> > >>>> > >>>
> > >>>> > >>> I've opened PRs to do the community page migration test:
> > >>>> > >>> https://github.com/apache/incubator-druid-website/pull/3
> > >>>> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
> > >>>> > >>>
> > >>>> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gian@apache.org
> >
> > >>>> wrote:
> > >>>> > >>>
> > >>>> > >>> > That sounds good to me. I would also consider adding
> canonical
> > >>>> tags
> > >>>> > to
> > >>>> > >>> all
> > >>>> > >>> > druid.apache.org pages so we don't have
> > >>>> druid.incubator.apache.org
> > >>>> > and
> > >>>> > >>> > druid.apache.org both floating around (not to mention
> > >>>> http/https
> > >>>> > >>> version
> > >>>> > >>> > of
> > >>>> > >>> > both).
> > >>>> > >>> >
> > >>>> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
> > jonwei@apache.org
> > >>>> >
> > >>>> > >>> wrote:
> > >>>> > >>> >
> > >>>> > >>> > > For redirects, Xavier has suggested using
> > >>>> > >>> > >
> > https://help.github.com/en/articles/redirects-on-github-pages
> > >>>> to
> > >>>> > >>> > redirect
> > >>>> > >>> > > to druid.apache.org as a way to transition before the
> > domain
> > >>>> > >>> migration
> > >>>> > >>> > > occurs, and believes that it would have the same SEO
> effects
> > >>>> as a
> > >>>> > 301
> > >>>> > >>> > > redirect after the new pages are indexed.
> > >>>> > >>> > >
> > >>>> > >>> > > I think we could try migrating the current Community page
> to
> > >>>> > >>> > > druid.apache.org with Github redirects and canonical
> links
> > >>>> > pointing
> > >>>> > >>> to
> > >>>> > >>> > the
> > >>>> > >>> > > https://druid.apache.org version. If that goes well, we
> > could
> > >>>> > >>> continue
> > >>>> > >>> > > migrating more pages.
> > >>>> > >>> > >
> > >>>> > >>> > > What are the community's thoughts on that?
> > >>>> > >>> > >
> > >>>> > >>> > > Thanks,
> > >>>> > >>> > > Jon
> > >>>> > >>> > >
> > >>>> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
> > gian@apache.org
> > >>>> >
> > >>>> > >>> wrote:
> > >>>> > >>> > >
> > >>>> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
> > >>>> classic
> > >>>> > >>> and
> > >>>> > >>> > > Apache
> > >>>> > >>> > > > sites together: https://www.openoffice.org/,
> > >>>> > >>> http://groovy-lang.org/.
> > >>>> > >>> > > Note
> > >>>> > >>> > > > how when you click around, you get shuttled between the
> > >>>> classic
> > >>>> > >>> domain
> > >>>> > >>> > > and
> > >>>> > >>> > > > the Apache domain. Some pages are available on both
> sites,
> > >>>> like
> > >>>> > >>> > > > http://groovy-lang.org/download.html and
> > >>>> > >>> > > > https://groovy.apache.org/download.html (which don't
> use
> > >>>> > canonical
> > >>>> > >>> > link
> > >>>> > >>> > > > tags -- does not seem like a good example to follow!).
> > >>>> > >>> > > >
> > >>>> > >>> > > > NetBeans (still incubating) also has a "melded" site at
> > >>>> > >>> > > > https://netbeans.org/ but doesn't seem to consider
> itself
> > >>>> done
> > >>>> > >>> yet.
> > >>>> > >>> > They
> > >>>> > >>> > > > are discussing plans on their lists & wiki to do
> redirects
> > >>>> from
> > >>>> > >>> > > > netbeans.org
> > >>>> > >>> > > > to netbeans.apache.org:
> > >>>> > >>> > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > >>>> > >>> > > > ,
> > >>>> > >>> > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > >>>> > >>> > > > .
> > >>>> > >>> > > > As of today the domain has been donated to ASF, but the
> > >>>> server is
> > >>>> > >>> still
> > >>>> > >>> > > run
> > >>>> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet.
> > >>>> (WHOIS
> > >>>> > for
> > >>>> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
> > >>>> resolves
> > >>>> > to
> > >>>> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
> > >>>> > >>> > > >
> > >>>> > >>> > > > The melded sites don't really seem better to me than
> > >>>> redirecting
> > >>>> > >>> all
> > >>>> > >>> > urls
> > >>>> > >>> > > > on the domain. I guess it depends on if we want to keep
> > >>>> druid.io
> > >>>> > >>> as
> > >>>> > >>> > the
> > >>>> > >>> > > > official domain forever, or if we think
> druid.apache.org
> > is
> > >>>> > >>> cooler. I
> > >>>> > >>> > > > definitely think druid.apache.org is cooler so my vote
> is
> > >>>> there
> > >>>> > >>> :).
> > >>>> > >>> > It's
> > >>>> > >>> > > > also nice that it supports https. (druid.io does not
> > today,
> > >>>> > since
> > >>>> > >>> it's
> > >>>> > >>> > > on
> > >>>> > >>> > > > GitHub pages, which doesn't support https for custom
> > >>>> domains.)
> > >>>> > >>> > > >
> > >>>> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > >>>> > >>> > > > <ch...@snap.com.invalid> wrote:
> > >>>> > >>> > > >
> > >>>> > >>> > > > > Are there other projects who have transitioned an
> > >>>> independently
> > >>>> > >>> > > > successful
> > >>>> > >>> > > > > domain name to an apache one?
> > >>>> > >>> > > > >
> > >>>> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
> > >>>> davidlim@apache.org>
> > >>>> > >>> > wrote:
> > >>>> > >>> > > > >
> > >>>> > >>> > > > > > Who has control over the druid.io domain? Charles
> > >>>> would that
> > >>>> > >>> be
> > >>>> > >>> > you?
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > > > We'd need support from them for the DNS redirect.
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
> > >>>> > jonwei@apache.org
> > >>>> > >>> >
> > >>>> > >>> > > wrote:
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > > > > We still need to complete the website migration to
> > >>>> Apache
> > >>>> > >>> > > > > infrastructure.
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > I'll propose the following plan:
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Proposed Apache Druid website migration plan
> > >>>> > >>> > > > > > > ========================================
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > These links have some previous discussion on the
> > >>>> website
> > >>>> > >>> > migration:
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > From the discussions above, the recommendation is
> to
> > >>>> have 2
> > >>>> > >>> > > separate
> > >>>> > >>> > > > > > repos
> > >>>> > >>> > > > > > > for the website: one for source and another for
> > built
> > >>>> > content
> > >>>> > >>> > that
> > >>>> > >>> > > > will
> > >>>> > >>> > > > > > be
> > >>>> > >>> > > > > > > served.
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Generating site files
> > >>>> > >>> > > > > > > =======================
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > The Apache site update process will be similar to
> > our
> > >>>> > current
> > >>>> > >>> > > > process.
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Current process:
> > >>>> > >>> > > > > > > 1. Push changes to
> > >>>> > >>> > > > > >
> > https://github.com/druid-io/druid-io.github.io/tree/src
> > >>>> > >>> > > > > > > 2. metamx bot picks up changes, builds, and
> commits
> > to
> > >>>> > >>> > > > > > >
> > >>>> https://github.com/druid-io/druid-io.github.io/tree/master
> > >>>> > >>> > > > > > > 3.
> > >>>> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
> > >>>> > >>> > > > > served
> > >>>> > >>> > > > > > by
> > >>>> > >>> > > > > > > github pages
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Apache process:
> > >>>> > >>> > > > > > > 1. Push changes to
> > >>>> > >>> > > > > https://github.com/apache/incubator-druid-website-src
> > >>>> > >>> > > > > > > 2. Jenkins bot from Apache will build the website
> > from
> > >>>> > source
> > >>>> > >>> > repo,
> > >>>> > >>> > > > > > commit
> > >>>> > >>> > > > > > > to
> > https://github.com/apache/incubator-druid-website
> > >>>> > >>> > > > > > > 3. Apache Druid website will be served from the
> > >>>> content in
> > >>>> > >>> > > > > > > https://github.com/apache/incubator-druid-website
> > >>>> > (asf-site
> > >>>> > >>> > > branch)
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Hosting and SEO
> > >>>> > >>> > > > > > > ================
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > The Apache site will be hosted at
> druid.apache.org
> > on
> > >>>> > Apache
> > >>>> > >>> > > > > > > infrastructure:
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > To preserve our search rankings, we can setup 301
> > >>>> redirects
> > >>>> > >>> from
> > >>>> > >>> > > the
> > >>>> > >>> > > > > old
> > >>>> > >>> > > > > > > druid.io site to the corresponding pages on the
> > >>>> > >>> druid.apache.org
> > >>>> > >>> > > > > site. (
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > >>>> > >>> > > > > > )
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > However, Github pages (which currently hosts the
> > >>>> druid.io
> > >>>> > >>> site)
> > >>>> > >>> > > does
> > >>>> > >>> > > > > not
> > >>>> > >>> > > > > > > support 301 redirects, so we propose the
> following:
> > >>>> > >>> > > > > > > - Setup a new Nginx server that will perform 301
> > >>>> redirects
> > >>>> > to
> > >>>> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host
> > >>>> this if
> > >>>> > >>> > needed.
> > >>>> > >>> > > > > > > - Update the druid.io DNS entry to point to this
> > new
> > >>>> Nginx
> > >>>> > >>> > server
> > >>>> > >>> > > > > > > - Shut down Github pages hosting for druid.io
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > In addition, we can also set canonical tags on our
> > >>>> pages:
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> >
> > >>>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > > > Action items
> > >>>> > >>> > > > > > > ===============
> > >>>> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache
> website
> > >>>> > content
> > >>>> > >>> from
> > >>>> > >>> > > > > source
> > >>>> > >>> > > > > > > - Get the Apache website up
> > >>>> > >>> > > > > > > - Setup Nginx redirect server for druid.io
> > >>>> > >>> > > > > > > - Shutdown github pages and redirect DNS for
> > druid.io
> > >>>> to
> > >>>> > >>> Nginx
> > >>>> > >>> > > > > redirect
> > >>>> > >>> > > > > > > server
> > >>>> > >>> > > > > > > - Add canonical tags to pages
> > >>>> > >>> > > > > > >
> > >>>> > >>> > > > > >
> > >>>> > >>> > > > >
> > >>>> > >>> > > >
> > >>>> > >>> > >
> > >>>> > >>> >
> > >>>> > >>>
> > >>>> > >>
> > >>>> >
> > >>>>
> > >>>
> >
>

Re: Proposed website migration plan

Posted by David Lim <da...@apache.org>.
No objections from me - thank you for testing this out.

On Mon, Jun 10, 2019 at 7:48 AM Gian Merlino <gi...@apache.org> wrote:

> It looks like Google has picked up the 301 and [druid use cases] #1 result
> is https://druid.apache.org/use-cases now. For [what is druid used for]
> it's not #4 instead of #2. I think this is the best we are likely to get. I
> am ready to flip the switch if there aren't any objections.
>
> On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:
>
> > Another update: as of
> > https://github.com/apache/incubator-druid-website-src/pull/1 and
> > https://github.com/apache/incubator-druid-website/pull/7, the
> > https://druid.apache.org/ site is now serving almost all pages from
> > druid.io, except:
> >
> > - the index page (it still has a placeholder until we flip the switch)
> > - the download page (it has a differently-designed download page: compare
> > http://druid.io/downloads.html with
> http://druid.apache.org/downloads.html
> > - any docs older than 0.13.0 (they aren't Apache releases)
> >
> > If you navigate to https://druid.apache.org/ + any other path from
> > druid.io, you should see the page.
> >
> > I'm hoping to confirm that search engines pick up the 301 for
> > http://druid.io/use-cases before flipping the switch. Hopefully that
> > doesn't take much longer. If it does we should talk about how we want to
> > proceed.
> >
> > On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:
> >
> >> An update: we do have a redirect server set up on druid.io now: note
> >> that http://druid.io/community/ and http://druid.io/use-cases both
> >> redirect to https://druid.apache.org. I just set up the latter redirect
> >> (on /use-cases) as part of 'test this first on a single page'. All other
> >> druid.io URLs are still being hosted using the content from GitHub
> pages
> >> at https://github.com/druid-io/druid-io.github.io.
> >>
> >> Search engine watch: currently, http://druid.io is the #1 link for
> >> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool
> looking
> >> infobox on Google & Bing). For [what is druid used for], it's #2 on
> Google,
> >> and not ranked on the first page on Bing & DDG. Will monitor this over
> the
> >> next few days.
> >>
> >> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:
> >>
> >>> Hi all,
> >>>
> >>> It sounds like we will need a redirect server that issues 301s from
> each
> >>> druid.io page to the corresponding druid.apache.org page. Charles and
> I
> >>> spoke offline and thought that something like Jon's original proposal
> is
> >>> the best way to go. I am going to suggest we get started on this, as
> it's
> >>> the last major piece of infra to move to ASF.
> >>>
> >>> 1) Set up a redirect server to perform 301 redirects to
> druid.apache.org
> >>> 2) Post all druid.io content on druid.apache.org
> >>> 3) Update druid.io DNS to point to the redirect server
> >>> 4) Shut down GitHub pages hosting for druid.io
> >>>
> >>> Steps (2) and (3) should be done as close in time as possible so there
> >>> is no confusion as to which version of the pages is canonical.
> >>>
> >>> For the redirect server, two viable options are an nginx server or an
> S3
> >>> webpage redirect (
> >>>
> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
> ).
> >>> Just like we did with the HTML-level redirect, I suggest we test this
> first
> >>> on a single page. We can do that by having the redirect server
> initially
> >>> start off by hosting all druid.io content (so it's indistinguishable
> >>> from the GitHub-pages-based site) except for a single page, which it
> >>> redirects using HTTP 301 to druid.apache.org.
> >>>
> >>> I'm planning to start looking into this, so anyone around please speak
> >>> up if you have any advice or alternative approaches to suggest.
> >>>
> >>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org>
> wrote:
> >>>
> >>>> Thanks for checking the SEO state, that's somewhat disappointing.
> >>>>
> >>>> For Bing, it sounds like they really want you to use 301s (
> >>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
> >>>>
> >>>> > Bing prefers you use a 301 permanent redirect when moving content,
> >>>> should
> >>>> the move be permanent.  If the move is temporary, then a 302 temporary
> >>>> redirect will work fine.  Do not use the rel=canonical tag in place
> of a
> >>>> proper redirect.
> >>>>
> >>>> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
> >>>>
> >>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org>
> wrote:
> >>>>
> >>>> > Another update: SEO is not looking great after another day passed.
> >>>> For a
> >>>> > search for "druid community", both http://druid.io/community and
> >>>> > https://druid.apache.org/community/ have dropped off the front page
> >>>> of
> >>>> > Bing
> >>>> > completely. On Google, the legacy version is gone (as expected) but
> >>>> the
> >>>> > Apache version has dropped to the #3 spot (down from #2 yesterday;
> >>>> and down
> >>>> > from where the legacy page was pre-migration, which was #1).
> >>>> >
> >>>> > I think this means we do need to try to get 301s figured out.
> >>>> >
> >>>> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org>
> wrote:
> >>>> >
> >>>> > > Google has picked up the new URL as of today but Bing hasn't.
> >>>> Neither has
> >>>> > > DuckDuckGo for that matter.
> >>>> > >
> >>>> > > Currently, Google is showing https://druid.apache.org/community/
> >>>> in the
> >>>> > > #2 spot and Bing/DDG are showing http://druid.io/community in the
> >>>> top
> >>>> > > spot. Ominously, the latter two _have_ picked up a page title
> >>>> change to
> >>>> > > "Redirecting..."
> >>>> > >
> >>>> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
> >>>> wrote:
> >>>> > >
> >>>> > >> An update: this is done now since a couple of days ago, but
> Google
> >>>> and
> >>>> > >> Bing are still showing http://druid.io/community for a search
> for
> >>>> > "druid
> >>>> > >> community" or even "apache druid community":
> >>>> > >>
> >>>> > >> - https://www.google.com/search?q=druid+community
> >>>> > >> - https://www.bing.com/search?q=druid+community
> >>>> > >>
> >>>> > >> I suggest we keep an eye on the search engines and make sure they
> >>>> can
> >>>> > >> figure out that the site has changed (I'm not sure how often they
> >>>> > crawl).
> >>>> > >> If they can then it would make sense to me to move forward with
> >>>> > migrating
> >>>> > >> the entire web site.
> >>>> > >>
> >>>> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org>
> >>>> wrote:
> >>>> > >>
> >>>> > >>> Correction: Xavier was suggesting we use
> >>>> > >>>
> >>>> > >>>
> >>>> >
> >>>>
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> >>>> > >>> ,
> >>>> > >>> the existing redirect system used by the Druid website.
> >>>> > >>>
> >>>> > >>> I've opened PRs to do the community page migration test:
> >>>> > >>> https://github.com/apache/incubator-druid-website/pull/3
> >>>> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
> >>>> > >>>
> >>>> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org>
> >>>> wrote:
> >>>> > >>>
> >>>> > >>> > That sounds good to me. I would also consider adding canonical
> >>>> tags
> >>>> > to
> >>>> > >>> all
> >>>> > >>> > druid.apache.org pages so we don't have
> >>>> druid.incubator.apache.org
> >>>> > and
> >>>> > >>> > druid.apache.org both floating around (not to mention
> >>>> http/https
> >>>> > >>> version
> >>>> > >>> > of
> >>>> > >>> > both).
> >>>> > >>> >
> >>>> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <
> jonwei@apache.org
> >>>> >
> >>>> > >>> wrote:
> >>>> > >>> >
> >>>> > >>> > > For redirects, Xavier has suggested using
> >>>> > >>> > >
> https://help.github.com/en/articles/redirects-on-github-pages
> >>>> to
> >>>> > >>> > redirect
> >>>> > >>> > > to druid.apache.org as a way to transition before the
> domain
> >>>> > >>> migration
> >>>> > >>> > > occurs, and believes that it would have the same SEO effects
> >>>> as a
> >>>> > 301
> >>>> > >>> > > redirect after the new pages are indexed.
> >>>> > >>> > >
> >>>> > >>> > > I think we could try migrating the current Community page to
> >>>> > >>> > > druid.apache.org with Github redirects and canonical links
> >>>> > pointing
> >>>> > >>> to
> >>>> > >>> > the
> >>>> > >>> > > https://druid.apache.org version. If that goes well, we
> could
> >>>> > >>> continue
> >>>> > >>> > > migrating more pages.
> >>>> > >>> > >
> >>>> > >>> > > What are the community's thoughts on that?
> >>>> > >>> > >
> >>>> > >>> > > Thanks,
> >>>> > >>> > > Jon
> >>>> > >>> > >
> >>>> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <
> gian@apache.org
> >>>> >
> >>>> > >>> wrote:
> >>>> > >>> > >
> >>>> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
> >>>> classic
> >>>> > >>> and
> >>>> > >>> > > Apache
> >>>> > >>> > > > sites together: https://www.openoffice.org/,
> >>>> > >>> http://groovy-lang.org/.
> >>>> > >>> > > Note
> >>>> > >>> > > > how when you click around, you get shuttled between the
> >>>> classic
> >>>> > >>> domain
> >>>> > >>> > > and
> >>>> > >>> > > > the Apache domain. Some pages are available on both sites,
> >>>> like
> >>>> > >>> > > > http://groovy-lang.org/download.html and
> >>>> > >>> > > > https://groovy.apache.org/download.html (which don't use
> >>>> > canonical
> >>>> > >>> > link
> >>>> > >>> > > > tags -- does not seem like a good example to follow!).
> >>>> > >>> > > >
> >>>> > >>> > > > NetBeans (still incubating) also has a "melded" site at
> >>>> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself
> >>>> done
> >>>> > >>> yet.
> >>>> > >>> > They
> >>>> > >>> > > > are discussing plans on their lists & wiki to do redirects
> >>>> from
> >>>> > >>> > > > netbeans.org
> >>>> > >>> > > > to netbeans.apache.org:
> >>>> > >>> > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> >>>> > >>> > > > ,
> >>>> > >>> > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> >>>> > >>> > > > .
> >>>> > >>> > > > As of today the domain has been donated to ASF, but the
> >>>> server is
> >>>> > >>> still
> >>>> > >>> > > run
> >>>> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet.
> >>>> (WHOIS
> >>>> > for
> >>>> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
> >>>> resolves
> >>>> > to
> >>>> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
> >>>> > >>> > > >
> >>>> > >>> > > > The melded sites don't really seem better to me than
> >>>> redirecting
> >>>> > >>> all
> >>>> > >>> > urls
> >>>> > >>> > > > on the domain. I guess it depends on if we want to keep
> >>>> druid.io
> >>>> > >>> as
> >>>> > >>> > the
> >>>> > >>> > > > official domain forever, or if we think druid.apache.org
> is
> >>>> > >>> cooler. I
> >>>> > >>> > > > definitely think druid.apache.org is cooler so my vote is
> >>>> there
> >>>> > >>> :).
> >>>> > >>> > It's
> >>>> > >>> > > > also nice that it supports https. (druid.io does not
> today,
> >>>> > since
> >>>> > >>> it's
> >>>> > >>> > > on
> >>>> > >>> > > > GitHub pages, which doesn't support https for custom
> >>>> domains.)
> >>>> > >>> > > >
> >>>> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> >>>> > >>> > > > <ch...@snap.com.invalid> wrote:
> >>>> > >>> > > >
> >>>> > >>> > > > > Are there other projects who have transitioned an
> >>>> independently
> >>>> > >>> > > > successful
> >>>> > >>> > > > > domain name to an apache one?
> >>>> > >>> > > > >
> >>>> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
> >>>> davidlim@apache.org>
> >>>> > >>> > wrote:
> >>>> > >>> > > > >
> >>>> > >>> > > > > > Who has control over the druid.io domain? Charles
> >>>> would that
> >>>> > >>> be
> >>>> > >>> > you?
> >>>> > >>> > > > > >
> >>>> > >>> > > > > > We'd need support from them for the DNS redirect.
> >>>> > >>> > > > > >
> >>>> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
> >>>> > jonwei@apache.org
> >>>> > >>> >
> >>>> > >>> > > wrote:
> >>>> > >>> > > > > >
> >>>> > >>> > > > > > > We still need to complete the website migration to
> >>>> Apache
> >>>> > >>> > > > > infrastructure.
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > I'll propose the following plan:
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Proposed Apache Druid website migration plan
> >>>> > >>> > > > > > > ========================================
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > These links have some previous discussion on the
> >>>> website
> >>>> > >>> > migration:
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > From the discussions above, the recommendation is to
> >>>> have 2
> >>>> > >>> > > separate
> >>>> > >>> > > > > > repos
> >>>> > >>> > > > > > > for the website: one for source and another for
> built
> >>>> > content
> >>>> > >>> > that
> >>>> > >>> > > > will
> >>>> > >>> > > > > > be
> >>>> > >>> > > > > > > served.
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Generating site files
> >>>> > >>> > > > > > > =======================
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > The Apache site update process will be similar to
> our
> >>>> > current
> >>>> > >>> > > > process.
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Current process:
> >>>> > >>> > > > > > > 1. Push changes to
> >>>> > >>> > > > > >
> https://github.com/druid-io/druid-io.github.io/tree/src
> >>>> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits
> to
> >>>> > >>> > > > > > >
> >>>> https://github.com/druid-io/druid-io.github.io/tree/master
> >>>> > >>> > > > > > > 3.
> >>>> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
> >>>> > >>> > > > > served
> >>>> > >>> > > > > > by
> >>>> > >>> > > > > > > github pages
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Apache process:
> >>>> > >>> > > > > > > 1. Push changes to
> >>>> > >>> > > > > https://github.com/apache/incubator-druid-website-src
> >>>> > >>> > > > > > > 2. Jenkins bot from Apache will build the website
> from
> >>>> > source
> >>>> > >>> > repo,
> >>>> > >>> > > > > > commit
> >>>> > >>> > > > > > > to
> https://github.com/apache/incubator-druid-website
> >>>> > >>> > > > > > > 3. Apache Druid website will be served from the
> >>>> content in
> >>>> > >>> > > > > > > https://github.com/apache/incubator-druid-website
> >>>> > (asf-site
> >>>> > >>> > > branch)
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Hosting and SEO
> >>>> > >>> > > > > > > ================
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > The Apache site will be hosted at druid.apache.org
> on
> >>>> > Apache
> >>>> > >>> > > > > > > infrastructure:
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > To preserve our search rankings, we can setup 301
> >>>> redirects
> >>>> > >>> from
> >>>> > >>> > > the
> >>>> > >>> > > > > old
> >>>> > >>> > > > > > > druid.io site to the corresponding pages on the
> >>>> > >>> druid.apache.org
> >>>> > >>> > > > > site. (
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> >>>> > >>> > > > > > )
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > However, Github pages (which currently hosts the
> >>>> druid.io
> >>>> > >>> site)
> >>>> > >>> > > does
> >>>> > >>> > > > > not
> >>>> > >>> > > > > > > support 301 redirects, so we propose the following:
> >>>> > >>> > > > > > > - Setup a new Nginx server that will perform 301
> >>>> redirects
> >>>> > to
> >>>> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host
> >>>> this if
> >>>> > >>> > needed.
> >>>> > >>> > > > > > > - Update the druid.io DNS entry to point to this
> new
> >>>> Nginx
> >>>> > >>> > server
> >>>> > >>> > > > > > > - Shut down Github pages hosting for druid.io
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > In addition, we can also set canonical tags on our
> >>>> pages:
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> >
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > > > Action items
> >>>> > >>> > > > > > > ===============
> >>>> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
> >>>> > content
> >>>> > >>> from
> >>>> > >>> > > > > source
> >>>> > >>> > > > > > > - Get the Apache website up
> >>>> > >>> > > > > > > - Setup Nginx redirect server for druid.io
> >>>> > >>> > > > > > > - Shutdown github pages and redirect DNS for
> druid.io
> >>>> to
> >>>> > >>> Nginx
> >>>> > >>> > > > > redirect
> >>>> > >>> > > > > > > server
> >>>> > >>> > > > > > > - Add canonical tags to pages
> >>>> > >>> > > > > > >
> >>>> > >>> > > > > >
> >>>> > >>> > > > >
> >>>> > >>> > > >
> >>>> > >>> > >
> >>>> > >>> >
> >>>> > >>>
> >>>> > >>
> >>>> >
> >>>>
> >>>
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
It looks like Google has picked up the 301 and [druid use cases] #1 result
is https://druid.apache.org/use-cases now. For [what is druid used for]
it's not #4 instead of #2. I think this is the best we are likely to get. I
am ready to flip the switch if there aren't any objections.

On Fri, Jun 7, 2019 at 9:15 PM Gian Merlino <gi...@apache.org> wrote:

> Another update: as of
> https://github.com/apache/incubator-druid-website-src/pull/1 and
> https://github.com/apache/incubator-druid-website/pull/7, the
> https://druid.apache.org/ site is now serving almost all pages from
> druid.io, except:
>
> - the index page (it still has a placeholder until we flip the switch)
> - the download page (it has a differently-designed download page: compare
> http://druid.io/downloads.html with http://druid.apache.org/downloads.html
> - any docs older than 0.13.0 (they aren't Apache releases)
>
> If you navigate to https://druid.apache.org/ + any other path from
> druid.io, you should see the page.
>
> I'm hoping to confirm that search engines pick up the 301 for
> http://druid.io/use-cases before flipping the switch. Hopefully that
> doesn't take much longer. If it does we should talk about how we want to
> proceed.
>
> On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:
>
>> An update: we do have a redirect server set up on druid.io now: note
>> that http://druid.io/community/ and http://druid.io/use-cases both
>> redirect to https://druid.apache.org. I just set up the latter redirect
>> (on /use-cases) as part of 'test this first on a single page'. All other
>> druid.io URLs are still being hosted using the content from GitHub pages
>> at https://github.com/druid-io/druid-io.github.io.
>>
>> Search engine watch: currently, http://druid.io is the #1 link for
>> [druid use cases] on Google, Bing, and DuckDuckGo (and has a cool looking
>> infobox on Google & Bing). For [what is druid used for], it's #2 on Google,
>> and not ranked on the first page on Bing & DDG. Will monitor this over the
>> next few days.
>>
>> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:
>>
>>> Hi all,
>>>
>>> It sounds like we will need a redirect server that issues 301s from each
>>> druid.io page to the corresponding druid.apache.org page. Charles and I
>>> spoke offline and thought that something like Jon's original proposal is
>>> the best way to go. I am going to suggest we get started on this, as it's
>>> the last major piece of infra to move to ASF.
>>>
>>> 1) Set up a redirect server to perform 301 redirects to druid.apache.org
>>> 2) Post all druid.io content on druid.apache.org
>>> 3) Update druid.io DNS to point to the redirect server
>>> 4) Shut down GitHub pages hosting for druid.io
>>>
>>> Steps (2) and (3) should be done as close in time as possible so there
>>> is no confusion as to which version of the pages is canonical.
>>>
>>> For the redirect server, two viable options are an nginx server or an S3
>>> webpage redirect (
>>> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html).
>>> Just like we did with the HTML-level redirect, I suggest we test this first
>>> on a single page. We can do that by having the redirect server initially
>>> start off by hosting all druid.io content (so it's indistinguishable
>>> from the GitHub-pages-based site) except for a single page, which it
>>> redirects using HTTP 301 to druid.apache.org.
>>>
>>> I'm planning to start looking into this, so anyone around please speak
>>> up if you have any advice or alternative approaches to suggest.
>>>
>>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org> wrote:
>>>
>>>> Thanks for checking the SEO state, that's somewhat disappointing.
>>>>
>>>> For Bing, it sounds like they really want you to use 301s (
>>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>>>>
>>>> > Bing prefers you use a 301 permanent redirect when moving content,
>>>> should
>>>> the move be permanent.  If the move is temporary, then a 302 temporary
>>>> redirect will work fine.  Do not use the rel=canonical tag in place of a
>>>> proper redirect.
>>>>
>>>> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
>>>>
>>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org> wrote:
>>>>
>>>> > Another update: SEO is not looking great after another day passed.
>>>> For a
>>>> > search for "druid community", both http://druid.io/community and
>>>> > https://druid.apache.org/community/ have dropped off the front page
>>>> of
>>>> > Bing
>>>> > completely. On Google, the legacy version is gone (as expected) but
>>>> the
>>>> > Apache version has dropped to the #3 spot (down from #2 yesterday;
>>>> and down
>>>> > from where the legacy page was pre-migration, which was #1).
>>>> >
>>>> > I think this means we do need to try to get 301s figured out.
>>>> >
>>>> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:
>>>> >
>>>> > > Google has picked up the new URL as of today but Bing hasn't.
>>>> Neither has
>>>> > > DuckDuckGo for that matter.
>>>> > >
>>>> > > Currently, Google is showing https://druid.apache.org/community/
>>>> in the
>>>> > > #2 spot and Bing/DDG are showing http://druid.io/community in the
>>>> top
>>>> > > spot. Ominously, the latter two _have_ picked up a page title
>>>> change to
>>>> > > "Redirecting..."
>>>> > >
>>>> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
>>>> wrote:
>>>> > >
>>>> > >> An update: this is done now since a couple of days ago, but Google
>>>> and
>>>> > >> Bing are still showing http://druid.io/community for a search for
>>>> > "druid
>>>> > >> community" or even "apache druid community":
>>>> > >>
>>>> > >> - https://www.google.com/search?q=druid+community
>>>> > >> - https://www.bing.com/search?q=druid+community
>>>> > >>
>>>> > >> I suggest we keep an eye on the search engines and make sure they
>>>> can
>>>> > >> figure out that the site has changed (I'm not sure how often they
>>>> > crawl).
>>>> > >> If they can then it would make sense to me to move forward with
>>>> > migrating
>>>> > >> the entire web site.
>>>> > >>
>>>> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org>
>>>> wrote:
>>>> > >>
>>>> > >>> Correction: Xavier was suggesting we use
>>>> > >>>
>>>> > >>>
>>>> >
>>>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>>>> > >>> ,
>>>> > >>> the existing redirect system used by the Druid website.
>>>> > >>>
>>>> > >>> I've opened PRs to do the community page migration test:
>>>> > >>> https://github.com/apache/incubator-druid-website/pull/3
>>>> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
>>>> > >>>
>>>> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org>
>>>> wrote:
>>>> > >>>
>>>> > >>> > That sounds good to me. I would also consider adding canonical
>>>> tags
>>>> > to
>>>> > >>> all
>>>> > >>> > druid.apache.org pages so we don't have
>>>> druid.incubator.apache.org
>>>> > and
>>>> > >>> > druid.apache.org both floating around (not to mention
>>>> http/https
>>>> > >>> version
>>>> > >>> > of
>>>> > >>> > both).
>>>> > >>> >
>>>> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jonwei@apache.org
>>>> >
>>>> > >>> wrote:
>>>> > >>> >
>>>> > >>> > > For redirects, Xavier has suggested using
>>>> > >>> > > https://help.github.com/en/articles/redirects-on-github-pages
>>>> to
>>>> > >>> > redirect
>>>> > >>> > > to druid.apache.org as a way to transition before the domain
>>>> > >>> migration
>>>> > >>> > > occurs, and believes that it would have the same SEO effects
>>>> as a
>>>> > 301
>>>> > >>> > > redirect after the new pages are indexed.
>>>> > >>> > >
>>>> > >>> > > I think we could try migrating the current Community page to
>>>> > >>> > > druid.apache.org with Github redirects and canonical links
>>>> > pointing
>>>> > >>> to
>>>> > >>> > the
>>>> > >>> > > https://druid.apache.org version. If that goes well, we could
>>>> > >>> continue
>>>> > >>> > > migrating more pages.
>>>> > >>> > >
>>>> > >>> > > What are the community's thoughts on that?
>>>> > >>> > >
>>>> > >>> > > Thanks,
>>>> > >>> > > Jon
>>>> > >>> > >
>>>> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gian@apache.org
>>>> >
>>>> > >>> wrote:
>>>> > >>> > >
>>>> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
>>>> classic
>>>> > >>> and
>>>> > >>> > > Apache
>>>> > >>> > > > sites together: https://www.openoffice.org/,
>>>> > >>> http://groovy-lang.org/.
>>>> > >>> > > Note
>>>> > >>> > > > how when you click around, you get shuttled between the
>>>> classic
>>>> > >>> domain
>>>> > >>> > > and
>>>> > >>> > > > the Apache domain. Some pages are available on both sites,
>>>> like
>>>> > >>> > > > http://groovy-lang.org/download.html and
>>>> > >>> > > > https://groovy.apache.org/download.html (which don't use
>>>> > canonical
>>>> > >>> > link
>>>> > >>> > > > tags -- does not seem like a good example to follow!).
>>>> > >>> > > >
>>>> > >>> > > > NetBeans (still incubating) also has a "melded" site at
>>>> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself
>>>> done
>>>> > >>> yet.
>>>> > >>> > They
>>>> > >>> > > > are discussing plans on their lists & wiki to do redirects
>>>> from
>>>> > >>> > > > netbeans.org
>>>> > >>> > > > to netbeans.apache.org:
>>>> > >>> > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>>>> > >>> > > > ,
>>>> > >>> > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>>>> > >>> > > > .
>>>> > >>> > > > As of today the domain has been donated to ASF, but the
>>>> server is
>>>> > >>> still
>>>> > >>> > > run
>>>> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet.
>>>> (WHOIS
>>>> > for
>>>> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
>>>> resolves
>>>> > to
>>>> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
>>>> > >>> > > >
>>>> > >>> > > > The melded sites don't really seem better to me than
>>>> redirecting
>>>> > >>> all
>>>> > >>> > urls
>>>> > >>> > > > on the domain. I guess it depends on if we want to keep
>>>> druid.io
>>>> > >>> as
>>>> > >>> > the
>>>> > >>> > > > official domain forever, or if we think druid.apache.org is
>>>> > >>> cooler. I
>>>> > >>> > > > definitely think druid.apache.org is cooler so my vote is
>>>> there
>>>> > >>> :).
>>>> > >>> > It's
>>>> > >>> > > > also nice that it supports https. (druid.io does not today,
>>>> > since
>>>> > >>> it's
>>>> > >>> > > on
>>>> > >>> > > > GitHub pages, which doesn't support https for custom
>>>> domains.)
>>>> > >>> > > >
>>>> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>>>> > >>> > > > <ch...@snap.com.invalid> wrote:
>>>> > >>> > > >
>>>> > >>> > > > > Are there other projects who have transitioned an
>>>> independently
>>>> > >>> > > > successful
>>>> > >>> > > > > domain name to an apache one?
>>>> > >>> > > > >
>>>> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>>>> davidlim@apache.org>
>>>> > >>> > wrote:
>>>> > >>> > > > >
>>>> > >>> > > > > > Who has control over the druid.io domain? Charles
>>>> would that
>>>> > >>> be
>>>> > >>> > you?
>>>> > >>> > > > > >
>>>> > >>> > > > > > We'd need support from them for the DNS redirect.
>>>> > >>> > > > > >
>>>> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>>>> > jonwei@apache.org
>>>> > >>> >
>>>> > >>> > > wrote:
>>>> > >>> > > > > >
>>>> > >>> > > > > > > We still need to complete the website migration to
>>>> Apache
>>>> > >>> > > > > infrastructure.
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > I'll propose the following plan:
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Proposed Apache Druid website migration plan
>>>> > >>> > > > > > > ========================================
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > These links have some previous discussion on the
>>>> website
>>>> > >>> > migration:
>>>> > >>> > > > > > >
>>>> > >>> > > > > > >
>>>> > >>> > > > > > >
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>>>> > >>> > > > > > >
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > From the discussions above, the recommendation is to
>>>> have 2
>>>> > >>> > > separate
>>>> > >>> > > > > > repos
>>>> > >>> > > > > > > for the website: one for source and another for built
>>>> > content
>>>> > >>> > that
>>>> > >>> > > > will
>>>> > >>> > > > > > be
>>>> > >>> > > > > > > served.
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Generating site files
>>>> > >>> > > > > > > =======================
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > The Apache site update process will be similar to our
>>>> > current
>>>> > >>> > > > process.
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Current process:
>>>> > >>> > > > > > > 1. Push changes to
>>>> > >>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
>>>> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
>>>> > >>> > > > > > >
>>>> https://github.com/druid-io/druid-io.github.io/tree/master
>>>> > >>> > > > > > > 3.
>>>> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
>>>> > >>> > > > > served
>>>> > >>> > > > > > by
>>>> > >>> > > > > > > github pages
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Apache process:
>>>> > >>> > > > > > > 1. Push changes to
>>>> > >>> > > > > https://github.com/apache/incubator-druid-website-src
>>>> > >>> > > > > > > 2. Jenkins bot from Apache will build the website from
>>>> > source
>>>> > >>> > repo,
>>>> > >>> > > > > > commit
>>>> > >>> > > > > > > to https://github.com/apache/incubator-druid-website
>>>> > >>> > > > > > > 3. Apache Druid website will be served from the
>>>> content in
>>>> > >>> > > > > > > https://github.com/apache/incubator-druid-website
>>>> > (asf-site
>>>> > >>> > > branch)
>>>> > >>> > > > > > >
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Hosting and SEO
>>>> > >>> > > > > > > ================
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > The Apache site will be hosted at druid.apache.org on
>>>> > Apache
>>>> > >>> > > > > > > infrastructure:
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > To preserve our search rankings, we can setup 301
>>>> redirects
>>>> > >>> from
>>>> > >>> > > the
>>>> > >>> > > > > old
>>>> > >>> > > > > > > druid.io site to the corresponding pages on the
>>>> > >>> druid.apache.org
>>>> > >>> > > > > site. (
>>>> > >>> > > > > > >
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>>>> > >>> > > > > > )
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > However, Github pages (which currently hosts the
>>>> druid.io
>>>> > >>> site)
>>>> > >>> > > does
>>>> > >>> > > > > not
>>>> > >>> > > > > > > support 301 redirects, so we propose the following:
>>>> > >>> > > > > > > - Setup a new Nginx server that will perform 301
>>>> redirects
>>>> > to
>>>> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host
>>>> this if
>>>> > >>> > needed.
>>>> > >>> > > > > > > - Update the druid.io DNS entry to point to this new
>>>> Nginx
>>>> > >>> > server
>>>> > >>> > > > > > > - Shut down Github pages hosting for druid.io
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > In addition, we can also set canonical tags on our
>>>> pages:
>>>> > >>> > > > > > >
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> >
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>>>> > >>> > > > > > >
>>>> > >>> > > > > > >
>>>> > >>> > > > > > > Action items
>>>> > >>> > > > > > > ===============
>>>> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
>>>> > content
>>>> > >>> from
>>>> > >>> > > > > source
>>>> > >>> > > > > > > - Get the Apache website up
>>>> > >>> > > > > > > - Setup Nginx redirect server for druid.io
>>>> > >>> > > > > > > - Shutdown github pages and redirect DNS for druid.io
>>>> to
>>>> > >>> Nginx
>>>> > >>> > > > > redirect
>>>> > >>> > > > > > > server
>>>> > >>> > > > > > > - Add canonical tags to pages
>>>> > >>> > > > > > >
>>>> > >>> > > > > >
>>>> > >>> > > > >
>>>> > >>> > > >
>>>> > >>> > >
>>>> > >>> >
>>>> > >>>
>>>> > >>
>>>> >
>>>>
>>>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Another update: as of
https://github.com/apache/incubator-druid-website-src/pull/1 and
https://github.com/apache/incubator-druid-website/pull/7, the
https://druid.apache.org/ site is now serving almost all pages from druid.io,
except:

- the index page (it still has a placeholder until we flip the switch)
- the download page (it has a differently-designed download page: compare
http://druid.io/downloads.html with http://druid.apache.org/downloads.html
- any docs older than 0.13.0 (they aren't Apache releases)

If you navigate to https://druid.apache.org/ + any other path from druid.io,
you should see the page.

I'm hoping to confirm that search engines pick up the 301 for
http://druid.io/use-cases before flipping the switch. Hopefully that
doesn't take much longer. If it does we should talk about how we want to
proceed.

On Tue, Jun 4, 2019 at 1:48 AM Gian Merlino <gi...@apache.org> wrote:

> An update: we do have a redirect server set up on druid.io now: note that
> http://druid.io/community/ and http://druid.io/use-cases both redirect to
> https://druid.apache.org. I just set up the latter redirect (on
> /use-cases) as part of 'test this first on a single page'. All other
> druid.io URLs are still being hosted using the content from GitHub pages
> at https://github.com/druid-io/druid-io.github.io.
>
> Search engine watch: currently, http://druid.io is the #1 link for [druid
> use cases] on Google, Bing, and DuckDuckGo (and has a cool looking infobox
> on Google & Bing). For [what is druid used for], it's #2 on Google, and not
> ranked on the first page on Bing & DDG. Will monitor this over the next few
> days.
>
> On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:
>
>> Hi all,
>>
>> It sounds like we will need a redirect server that issues 301s from each
>> druid.io page to the corresponding druid.apache.org page. Charles and I
>> spoke offline and thought that something like Jon's original proposal is
>> the best way to go. I am going to suggest we get started on this, as it's
>> the last major piece of infra to move to ASF.
>>
>> 1) Set up a redirect server to perform 301 redirects to druid.apache.org
>> 2) Post all druid.io content on druid.apache.org
>> 3) Update druid.io DNS to point to the redirect server
>> 4) Shut down GitHub pages hosting for druid.io
>>
>> Steps (2) and (3) should be done as close in time as possible so there is
>> no confusion as to which version of the pages is canonical.
>>
>> For the redirect server, two viable options are an nginx server or an S3
>> webpage redirect (
>> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html).
>> Just like we did with the HTML-level redirect, I suggest we test this first
>> on a single page. We can do that by having the redirect server initially
>> start off by hosting all druid.io content (so it's indistinguishable
>> from the GitHub-pages-based site) except for a single page, which it
>> redirects using HTTP 301 to druid.apache.org.
>>
>> I'm planning to start looking into this, so anyone around please speak up
>> if you have any advice or alternative approaches to suggest.
>>
>> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org> wrote:
>>
>>> Thanks for checking the SEO state, that's somewhat disappointing.
>>>
>>> For Bing, it sounds like they really want you to use 301s (
>>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>>>
>>> > Bing prefers you use a 301 permanent redirect when moving content,
>>> should
>>> the move be permanent.  If the move is temporary, then a 302 temporary
>>> redirect will work fine.  Do not use the rel=canonical tag in place of a
>>> proper redirect.
>>>
>>> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
>>>
>>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org> wrote:
>>>
>>> > Another update: SEO is not looking great after another day passed. For
>>> a
>>> > search for "druid community", both http://druid.io/community and
>>> > https://druid.apache.org/community/ have dropped off the front page of
>>> > Bing
>>> > completely. On Google, the legacy version is gone (as expected) but the
>>> > Apache version has dropped to the #3 spot (down from #2 yesterday; and
>>> down
>>> > from where the legacy page was pre-migration, which was #1).
>>> >
>>> > I think this means we do need to try to get 301s figured out.
>>> >
>>> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:
>>> >
>>> > > Google has picked up the new URL as of today but Bing hasn't.
>>> Neither has
>>> > > DuckDuckGo for that matter.
>>> > >
>>> > > Currently, Google is showing https://druid.apache.org/community/ in
>>> the
>>> > > #2 spot and Bing/DDG are showing http://druid.io/community in the
>>> top
>>> > > spot. Ominously, the latter two _have_ picked up a page title change
>>> to
>>> > > "Redirecting..."
>>> > >
>>> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> > >
>>> > >> An update: this is done now since a couple of days ago, but Google
>>> and
>>> > >> Bing are still showing http://druid.io/community for a search for
>>> > "druid
>>> > >> community" or even "apache druid community":
>>> > >>
>>> > >> - https://www.google.com/search?q=druid+community
>>> > >> - https://www.bing.com/search?q=druid+community
>>> > >>
>>> > >> I suggest we keep an eye on the search engines and make sure they
>>> can
>>> > >> figure out that the site has changed (I'm not sure how often they
>>> > crawl).
>>> > >> If they can then it would make sense to me to move forward with
>>> > migrating
>>> > >> the entire web site.
>>> > >>
>>> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org>
>>> wrote:
>>> > >>
>>> > >>> Correction: Xavier was suggesting we use
>>> > >>>
>>> > >>>
>>> >
>>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>>> > >>> ,
>>> > >>> the existing redirect system used by the Druid website.
>>> > >>>
>>> > >>> I've opened PRs to do the community page migration test:
>>> > >>> https://github.com/apache/incubator-druid-website/pull/3
>>> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
>>> > >>>
>>> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> > >>>
>>> > >>> > That sounds good to me. I would also consider adding canonical
>>> tags
>>> > to
>>> > >>> all
>>> > >>> > druid.apache.org pages so we don't have
>>> druid.incubator.apache.org
>>> > and
>>> > >>> > druid.apache.org both floating around (not to mention http/https
>>> > >>> version
>>> > >>> > of
>>> > >>> > both).
>>> > >>> >
>>> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org>
>>> > >>> wrote:
>>> > >>> >
>>> > >>> > > For redirects, Xavier has suggested using
>>> > >>> > > https://help.github.com/en/articles/redirects-on-github-pages
>>> to
>>> > >>> > redirect
>>> > >>> > > to druid.apache.org as a way to transition before the domain
>>> > >>> migration
>>> > >>> > > occurs, and believes that it would have the same SEO effects
>>> as a
>>> > 301
>>> > >>> > > redirect after the new pages are indexed.
>>> > >>> > >
>>> > >>> > > I think we could try migrating the current Community page to
>>> > >>> > > druid.apache.org with Github redirects and canonical links
>>> > pointing
>>> > >>> to
>>> > >>> > the
>>> > >>> > > https://druid.apache.org version. If that goes well, we could
>>> > >>> continue
>>> > >>> > > migrating more pages.
>>> > >>> > >
>>> > >>> > > What are the community's thoughts on that?
>>> > >>> > >
>>> > >>> > > Thanks,
>>> > >>> > > Jon
>>> > >>> > >
>>> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org>
>>> > >>> wrote:
>>> > >>> > >
>>> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
>>> classic
>>> > >>> and
>>> > >>> > > Apache
>>> > >>> > > > sites together: https://www.openoffice.org/,
>>> > >>> http://groovy-lang.org/.
>>> > >>> > > Note
>>> > >>> > > > how when you click around, you get shuttled between the
>>> classic
>>> > >>> domain
>>> > >>> > > and
>>> > >>> > > > the Apache domain. Some pages are available on both sites,
>>> like
>>> > >>> > > > http://groovy-lang.org/download.html and
>>> > >>> > > > https://groovy.apache.org/download.html (which don't use
>>> > canonical
>>> > >>> > link
>>> > >>> > > > tags -- does not seem like a good example to follow!).
>>> > >>> > > >
>>> > >>> > > > NetBeans (still incubating) also has a "melded" site at
>>> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself
>>> done
>>> > >>> yet.
>>> > >>> > They
>>> > >>> > > > are discussing plans on their lists & wiki to do redirects
>>> from
>>> > >>> > > > netbeans.org
>>> > >>> > > > to netbeans.apache.org:
>>> > >>> > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>>> > >>> > > > ,
>>> > >>> > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>>> > >>> > > > .
>>> > >>> > > > As of today the domain has been donated to ASF, but the
>>> server is
>>> > >>> still
>>> > >>> > > run
>>> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet.
>>> (WHOIS
>>> > for
>>> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
>>> resolves
>>> > to
>>> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
>>> > >>> > > >
>>> > >>> > > > The melded sites don't really seem better to me than
>>> redirecting
>>> > >>> all
>>> > >>> > urls
>>> > >>> > > > on the domain. I guess it depends on if we want to keep
>>> druid.io
>>> > >>> as
>>> > >>> > the
>>> > >>> > > > official domain forever, or if we think druid.apache.org is
>>> > >>> cooler. I
>>> > >>> > > > definitely think druid.apache.org is cooler so my vote is
>>> there
>>> > >>> :).
>>> > >>> > It's
>>> > >>> > > > also nice that it supports https. (druid.io does not today,
>>> > since
>>> > >>> it's
>>> > >>> > > on
>>> > >>> > > > GitHub pages, which doesn't support https for custom
>>> domains.)
>>> > >>> > > >
>>> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>>> > >>> > > > <ch...@snap.com.invalid> wrote:
>>> > >>> > > >
>>> > >>> > > > > Are there other projects who have transitioned an
>>> independently
>>> > >>> > > > successful
>>> > >>> > > > > domain name to an apache one?
>>> > >>> > > > >
>>> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>>> davidlim@apache.org>
>>> > >>> > wrote:
>>> > >>> > > > >
>>> > >>> > > > > > Who has control over the druid.io domain? Charles would
>>> that
>>> > >>> be
>>> > >>> > you?
>>> > >>> > > > > >
>>> > >>> > > > > > We'd need support from them for the DNS redirect.
>>> > >>> > > > > >
>>> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>>> > jonwei@apache.org
>>> > >>> >
>>> > >>> > > wrote:
>>> > >>> > > > > >
>>> > >>> > > > > > > We still need to complete the website migration to
>>> Apache
>>> > >>> > > > > infrastructure.
>>> > >>> > > > > > >
>>> > >>> > > > > > > I'll propose the following plan:
>>> > >>> > > > > > >
>>> > >>> > > > > > > Proposed Apache Druid website migration plan
>>> > >>> > > > > > > ========================================
>>> > >>> > > > > > >
>>> > >>> > > > > > > These links have some previous discussion on the
>>> website
>>> > >>> > migration:
>>> > >>> > > > > > >
>>> > >>> > > > > > >
>>> > >>> > > > > > >
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>>> > >>> > > > > > >
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>>> > >>> > > > > > >
>>> > >>> > > > > > > From the discussions above, the recommendation is to
>>> have 2
>>> > >>> > > separate
>>> > >>> > > > > > repos
>>> > >>> > > > > > > for the website: one for source and another for built
>>> > content
>>> > >>> > that
>>> > >>> > > > will
>>> > >>> > > > > > be
>>> > >>> > > > > > > served.
>>> > >>> > > > > > >
>>> > >>> > > > > > > Generating site files
>>> > >>> > > > > > > =======================
>>> > >>> > > > > > >
>>> > >>> > > > > > > The Apache site update process will be similar to our
>>> > current
>>> > >>> > > > process.
>>> > >>> > > > > > >
>>> > >>> > > > > > > Current process:
>>> > >>> > > > > > > 1. Push changes to
>>> > >>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
>>> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
>>> > >>> > > > > > >
>>> https://github.com/druid-io/druid-io.github.io/tree/master
>>> > >>> > > > > > > 3.
>>> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
>>> > >>> > > > > served
>>> > >>> > > > > > by
>>> > >>> > > > > > > github pages
>>> > >>> > > > > > >
>>> > >>> > > > > > > Apache process:
>>> > >>> > > > > > > 1. Push changes to
>>> > >>> > > > > https://github.com/apache/incubator-druid-website-src
>>> > >>> > > > > > > 2. Jenkins bot from Apache will build the website from
>>> > source
>>> > >>> > repo,
>>> > >>> > > > > > commit
>>> > >>> > > > > > > to https://github.com/apache/incubator-druid-website
>>> > >>> > > > > > > 3. Apache Druid website will be served from the
>>> content in
>>> > >>> > > > > > > https://github.com/apache/incubator-druid-website
>>> > (asf-site
>>> > >>> > > branch)
>>> > >>> > > > > > >
>>> > >>> > > > > > >
>>> > >>> > > > > > > Hosting and SEO
>>> > >>> > > > > > > ================
>>> > >>> > > > > > >
>>> > >>> > > > > > > The Apache site will be hosted at druid.apache.org on
>>> > Apache
>>> > >>> > > > > > > infrastructure:
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>>> > >>> > > > > > >
>>> > >>> > > > > > > To preserve our search rankings, we can setup 301
>>> redirects
>>> > >>> from
>>> > >>> > > the
>>> > >>> > > > > old
>>> > >>> > > > > > > druid.io site to the corresponding pages on the
>>> > >>> druid.apache.org
>>> > >>> > > > > site. (
>>> > >>> > > > > > >
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>>> > >>> > > > > > )
>>> > >>> > > > > > >
>>> > >>> > > > > > > However, Github pages (which currently hosts the
>>> druid.io
>>> > >>> site)
>>> > >>> > > does
>>> > >>> > > > > not
>>> > >>> > > > > > > support 301 redirects, so we propose the following:
>>> > >>> > > > > > > - Setup a new Nginx server that will perform 301
>>> redirects
>>> > to
>>> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host
>>> this if
>>> > >>> > needed.
>>> > >>> > > > > > > - Update the druid.io DNS entry to point to this new
>>> Nginx
>>> > >>> > server
>>> > >>> > > > > > > - Shut down Github pages hosting for druid.io
>>> > >>> > > > > > >
>>> > >>> > > > > > > In addition, we can also set canonical tags on our
>>> pages:
>>> > >>> > > > > > >
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>>> > >>> > > > > > >
>>> > >>> > > > > > >
>>> > >>> > > > > > > Action items
>>> > >>> > > > > > > ===============
>>> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
>>> > content
>>> > >>> from
>>> > >>> > > > > source
>>> > >>> > > > > > > - Get the Apache website up
>>> > >>> > > > > > > - Setup Nginx redirect server for druid.io
>>> > >>> > > > > > > - Shutdown github pages and redirect DNS for druid.io
>>> to
>>> > >>> Nginx
>>> > >>> > > > > redirect
>>> > >>> > > > > > > server
>>> > >>> > > > > > > - Add canonical tags to pages
>>> > >>> > > > > > >
>>> > >>> > > > > >
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> > >>
>>> >
>>>
>>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
An update: we do have a redirect server set up on druid.io now: note that
http://druid.io/community/ and http://druid.io/use-cases both redirect to
https://druid.apache.org. I just set up the latter redirect (on /use-cases)
as part of 'test this first on a single page'. All other druid.io URLs are
still being hosted using the content from GitHub pages at
https://github.com/druid-io/druid-io.github.io.

Search engine watch: currently, http://druid.io is the #1 link for [druid
use cases] on Google, Bing, and DuckDuckGo (and has a cool looking infobox
on Google & Bing). For [what is druid used for], it's #2 on Google, and not
ranked on the first page on Bing & DDG. Will monitor this over the next few
days.

On Mon, May 6, 2019 at 5:43 PM Gian Merlino <gi...@apache.org> wrote:

> Hi all,
>
> It sounds like we will need a redirect server that issues 301s from each
> druid.io page to the corresponding druid.apache.org page. Charles and I
> spoke offline and thought that something like Jon's original proposal is
> the best way to go. I am going to suggest we get started on this, as it's
> the last major piece of infra to move to ASF.
>
> 1) Set up a redirect server to perform 301 redirects to druid.apache.org
> 2) Post all druid.io content on druid.apache.org
> 3) Update druid.io DNS to point to the redirect server
> 4) Shut down GitHub pages hosting for druid.io
>
> Steps (2) and (3) should be done as close in time as possible so there is
> no confusion as to which version of the pages is canonical.
>
> For the redirect server, two viable options are an nginx server or an S3
> webpage redirect (
> https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html).
> Just like we did with the HTML-level redirect, I suggest we test this first
> on a single page. We can do that by having the redirect server initially
> start off by hosting all druid.io content (so it's indistinguishable from
> the GitHub-pages-based site) except for a single page, which it redirects
> using HTTP 301 to druid.apache.org.
>
> I'm planning to start looking into this, so anyone around please speak up
> if you have any advice or alternative approaches to suggest.
>
> On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org> wrote:
>
>> Thanks for checking the SEO state, that's somewhat disappointing.
>>
>> For Bing, it sounds like they really want you to use 301s (
>> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>>
>> > Bing prefers you use a 301 permanent redirect when moving content,
>> should
>> the move be permanent.  If the move is temporary, then a 302 temporary
>> redirect will work fine.  Do not use the rel=canonical tag in place of a
>> proper redirect.
>>
>> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
>>
>> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org> wrote:
>>
>> > Another update: SEO is not looking great after another day passed. For a
>> > search for "druid community", both http://druid.io/community and
>> > https://druid.apache.org/community/ have dropped off the front page of
>> > Bing
>> > completely. On Google, the legacy version is gone (as expected) but the
>> > Apache version has dropped to the #3 spot (down from #2 yesterday; and
>> down
>> > from where the legacy page was pre-migration, which was #1).
>> >
>> > I think this means we do need to try to get 301s figured out.
>> >
>> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:
>> >
>> > > Google has picked up the new URL as of today but Bing hasn't. Neither
>> has
>> > > DuckDuckGo for that matter.
>> > >
>> > > Currently, Google is showing https://druid.apache.org/community/ in
>> the
>> > > #2 spot and Bing/DDG are showing http://druid.io/community in the top
>> > > spot. Ominously, the latter two _have_ picked up a page title change
>> to
>> > > "Redirecting..."
>> > >
>> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org>
>> wrote:
>> > >
>> > >> An update: this is done now since a couple of days ago, but Google
>> and
>> > >> Bing are still showing http://druid.io/community for a search for
>> > "druid
>> > >> community" or even "apache druid community":
>> > >>
>> > >> - https://www.google.com/search?q=druid+community
>> > >> - https://www.bing.com/search?q=druid+community
>> > >>
>> > >> I suggest we keep an eye on the search engines and make sure they can
>> > >> figure out that the site has changed (I'm not sure how often they
>> > crawl).
>> > >> If they can then it would make sense to me to move forward with
>> > migrating
>> > >> the entire web site.
>> > >>
>> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org>
>> wrote:
>> > >>
>> > >>> Correction: Xavier was suggesting we use
>> > >>>
>> > >>>
>> >
>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>> > >>> ,
>> > >>> the existing redirect system used by the Druid website.
>> > >>>
>> > >>> I've opened PRs to do the community page migration test:
>> > >>> https://github.com/apache/incubator-druid-website/pull/3
>> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
>> > >>>
>> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org>
>> wrote:
>> > >>>
>> > >>> > That sounds good to me. I would also consider adding canonical
>> tags
>> > to
>> > >>> all
>> > >>> > druid.apache.org pages so we don't have
>> druid.incubator.apache.org
>> > and
>> > >>> > druid.apache.org both floating around (not to mention http/https
>> > >>> version
>> > >>> > of
>> > >>> > both).
>> > >>> >
>> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org>
>> > >>> wrote:
>> > >>> >
>> > >>> > > For redirects, Xavier has suggested using
>> > >>> > > https://help.github.com/en/articles/redirects-on-github-pages
>> to
>> > >>> > redirect
>> > >>> > > to druid.apache.org as a way to transition before the domain
>> > >>> migration
>> > >>> > > occurs, and believes that it would have the same SEO effects as
>> a
>> > 301
>> > >>> > > redirect after the new pages are indexed.
>> > >>> > >
>> > >>> > > I think we could try migrating the current Community page to
>> > >>> > > druid.apache.org with Github redirects and canonical links
>> > pointing
>> > >>> to
>> > >>> > the
>> > >>> > > https://druid.apache.org version. If that goes well, we could
>> > >>> continue
>> > >>> > > migrating more pages.
>> > >>> > >
>> > >>> > > What are the community's thoughts on that?
>> > >>> > >
>> > >>> > > Thanks,
>> > >>> > > Jon
>> > >>> > >
>> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org>
>> > >>> wrote:
>> > >>> > >
>> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
>> classic
>> > >>> and
>> > >>> > > Apache
>> > >>> > > > sites together: https://www.openoffice.org/,
>> > >>> http://groovy-lang.org/.
>> > >>> > > Note
>> > >>> > > > how when you click around, you get shuttled between the
>> classic
>> > >>> domain
>> > >>> > > and
>> > >>> > > > the Apache domain. Some pages are available on both sites,
>> like
>> > >>> > > > http://groovy-lang.org/download.html and
>> > >>> > > > https://groovy.apache.org/download.html (which don't use
>> > canonical
>> > >>> > link
>> > >>> > > > tags -- does not seem like a good example to follow!).
>> > >>> > > >
>> > >>> > > > NetBeans (still incubating) also has a "melded" site at
>> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself
>> done
>> > >>> yet.
>> > >>> > They
>> > >>> > > > are discussing plans on their lists & wiki to do redirects
>> from
>> > >>> > > > netbeans.org
>> > >>> > > > to netbeans.apache.org:
>> > >>> > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>> > >>> > > > ,
>> > >>> > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>> > >>> > > > .
>> > >>> > > > As of today the domain has been donated to ASF, but the
>> server is
>> > >>> still
>> > >>> > > run
>> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS
>> > for
>> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
>> resolves
>> > to
>> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
>> > >>> > > >
>> > >>> > > > The melded sites don't really seem better to me than
>> redirecting
>> > >>> all
>> > >>> > urls
>> > >>> > > > on the domain. I guess it depends on if we want to keep
>> druid.io
>> > >>> as
>> > >>> > the
>> > >>> > > > official domain forever, or if we think druid.apache.org is
>> > >>> cooler. I
>> > >>> > > > definitely think druid.apache.org is cooler so my vote is
>> there
>> > >>> :).
>> > >>> > It's
>> > >>> > > > also nice that it supports https. (druid.io does not today,
>> > since
>> > >>> it's
>> > >>> > > on
>> > >>> > > > GitHub pages, which doesn't support https for custom domains.)
>> > >>> > > >
>> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>> > >>> > > > <ch...@snap.com.invalid> wrote:
>> > >>> > > >
>> > >>> > > > > Are there other projects who have transitioned an
>> independently
>> > >>> > > > successful
>> > >>> > > > > domain name to an apache one?
>> > >>> > > > >
>> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
>> davidlim@apache.org>
>> > >>> > wrote:
>> > >>> > > > >
>> > >>> > > > > > Who has control over the druid.io domain? Charles would
>> that
>> > >>> be
>> > >>> > you?
>> > >>> > > > > >
>> > >>> > > > > > We'd need support from them for the DNS redirect.
>> > >>> > > > > >
>> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
>> > jonwei@apache.org
>> > >>> >
>> > >>> > > wrote:
>> > >>> > > > > >
>> > >>> > > > > > > We still need to complete the website migration to
>> Apache
>> > >>> > > > > infrastructure.
>> > >>> > > > > > >
>> > >>> > > > > > > I'll propose the following plan:
>> > >>> > > > > > >
>> > >>> > > > > > > Proposed Apache Druid website migration plan
>> > >>> > > > > > > ========================================
>> > >>> > > > > > >
>> > >>> > > > > > > These links have some previous discussion on the website
>> > >>> > migration:
>> > >>> > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>> > >>> > > > > > >
>> > >>> > > > > > > From the discussions above, the recommendation is to
>> have 2
>> > >>> > > separate
>> > >>> > > > > > repos
>> > >>> > > > > > > for the website: one for source and another for built
>> > content
>> > >>> > that
>> > >>> > > > will
>> > >>> > > > > > be
>> > >>> > > > > > > served.
>> > >>> > > > > > >
>> > >>> > > > > > > Generating site files
>> > >>> > > > > > > =======================
>> > >>> > > > > > >
>> > >>> > > > > > > The Apache site update process will be similar to our
>> > current
>> > >>> > > > process.
>> > >>> > > > > > >
>> > >>> > > > > > > Current process:
>> > >>> > > > > > > 1. Push changes to
>> > >>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
>> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
>> > >>> > > > > > >
>> https://github.com/druid-io/druid-io.github.io/tree/master
>> > >>> > > > > > > 3.
>> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
>> > >>> > > > > served
>> > >>> > > > > > by
>> > >>> > > > > > > github pages
>> > >>> > > > > > >
>> > >>> > > > > > > Apache process:
>> > >>> > > > > > > 1. Push changes to
>> > >>> > > > > https://github.com/apache/incubator-druid-website-src
>> > >>> > > > > > > 2. Jenkins bot from Apache will build the website from
>> > source
>> > >>> > repo,
>> > >>> > > > > > commit
>> > >>> > > > > > > to https://github.com/apache/incubator-druid-website
>> > >>> > > > > > > 3. Apache Druid website will be served from the content
>> in
>> > >>> > > > > > > https://github.com/apache/incubator-druid-website
>> > (asf-site
>> > >>> > > branch)
>> > >>> > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > > > Hosting and SEO
>> > >>> > > > > > > ================
>> > >>> > > > > > >
>> > >>> > > > > > > The Apache site will be hosted at druid.apache.org on
>> > Apache
>> > >>> > > > > > > infrastructure:
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>> > >>> > > > > > >
>> > >>> > > > > > > To preserve our search rankings, we can setup 301
>> redirects
>> > >>> from
>> > >>> > > the
>> > >>> > > > > old
>> > >>> > > > > > > druid.io site to the corresponding pages on the
>> > >>> druid.apache.org
>> > >>> > > > > site. (
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>> > >>> > > > > > )
>> > >>> > > > > > >
>> > >>> > > > > > > However, Github pages (which currently hosts the
>> druid.io
>> > >>> site)
>> > >>> > > does
>> > >>> > > > > not
>> > >>> > > > > > > support 301 redirects, so we propose the following:
>> > >>> > > > > > > - Setup a new Nginx server that will perform 301
>> redirects
>> > to
>> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host this
>> if
>> > >>> > needed.
>> > >>> > > > > > > - Update the druid.io DNS entry to point to this new
>> Nginx
>> > >>> > server
>> > >>> > > > > > > - Shut down Github pages hosting for druid.io
>> > >>> > > > > > >
>> > >>> > > > > > > In addition, we can also set canonical tags on our
>> pages:
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>> > >>> > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > > > Action items
>> > >>> > > > > > > ===============
>> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
>> > content
>> > >>> from
>> > >>> > > > > source
>> > >>> > > > > > > - Get the Apache website up
>> > >>> > > > > > > - Setup Nginx redirect server for druid.io
>> > >>> > > > > > > - Shutdown github pages and redirect DNS for druid.io
>> to
>> > >>> Nginx
>> > >>> > > > > redirect
>> > >>> > > > > > > server
>> > >>> > > > > > > - Add canonical tags to pages
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> > >>
>> >
>>
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Hi all,

It sounds like we will need a redirect server that issues 301s from each
druid.io page to the corresponding druid.apache.org page. Charles and I
spoke offline and thought that something like Jon's original proposal is
the best way to go. I am going to suggest we get started on this, as it's
the last major piece of infra to move to ASF.

1) Set up a redirect server to perform 301 redirects to druid.apache.org
2) Post all druid.io content on druid.apache.org
3) Update druid.io DNS to point to the redirect server
4) Shut down GitHub pages hosting for druid.io

Steps (2) and (3) should be done as close in time as possible so there is
no confusion as to which version of the pages is canonical.

For the redirect server, two viable options are an nginx server or an S3
webpage redirect (
https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html).
Just like we did with the HTML-level redirect, I suggest we test this first
on a single page. We can do that by having the redirect server initially
start off by hosting all druid.io content (so it's indistinguishable from
the GitHub-pages-based site) except for a single page, which it redirects
using HTTP 301 to druid.apache.org.

I'm planning to start looking into this, so anyone around please speak up
if you have any advice or alternative approaches to suggest.

On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jo...@apache.org> wrote:

> Thanks for checking the SEO state, that's somewhat disappointing.
>
> For Bing, it sounds like they really want you to use 301s (
> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>
> > Bing prefers you use a 301 permanent redirect when moving content, should
> the move be permanent.  If the move is temporary, then a 302 temporary
> redirect will work fine.  Do not use the rel=canonical tag in place of a
> proper redirect.
>
> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
>
> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org> wrote:
>
> > Another update: SEO is not looking great after another day passed. For a
> > search for "druid community", both http://druid.io/community and
> > https://druid.apache.org/community/ have dropped off the front page of
> > Bing
> > completely. On Google, the legacy version is gone (as expected) but the
> > Apache version has dropped to the #3 spot (down from #2 yesterday; and
> down
> > from where the legacy page was pre-migration, which was #1).
> >
> > I think this means we do need to try to get 301s figured out.
> >
> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:
> >
> > > Google has picked up the new URL as of today but Bing hasn't. Neither
> has
> > > DuckDuckGo for that matter.
> > >
> > > Currently, Google is showing https://druid.apache.org/community/ in
> the
> > > #2 spot and Bing/DDG are showing http://druid.io/community in the top
> > > spot. Ominously, the latter two _have_ picked up a page title change to
> > > "Redirecting..."
> > >
> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org> wrote:
> > >
> > >> An update: this is done now since a couple of days ago, but Google and
> > >> Bing are still showing http://druid.io/community for a search for
> > "druid
> > >> community" or even "apache druid community":
> > >>
> > >> - https://www.google.com/search?q=druid+community
> > >> - https://www.bing.com/search?q=druid+community
> > >>
> > >> I suggest we keep an eye on the search engines and make sure they can
> > >> figure out that the site has changed (I'm not sure how often they
> > crawl).
> > >> If they can then it would make sense to me to move forward with
> > migrating
> > >> the entire web site.
> > >>
> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org>
> wrote:
> > >>
> > >>> Correction: Xavier was suggesting we use
> > >>>
> > >>>
> >
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> > >>> ,
> > >>> the existing redirect system used by the Druid website.
> > >>>
> > >>> I've opened PRs to do the community page migration test:
> > >>> https://github.com/apache/incubator-druid-website/pull/3
> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
> > >>>
> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org>
> wrote:
> > >>>
> > >>> > That sounds good to me. I would also consider adding canonical tags
> > to
> > >>> all
> > >>> > druid.apache.org pages so we don't have druid.incubator.apache.org
> > and
> > >>> > druid.apache.org both floating around (not to mention http/https
> > >>> version
> > >>> > of
> > >>> > both).
> > >>> >
> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org>
> > >>> wrote:
> > >>> >
> > >>> > > For redirects, Xavier has suggested using
> > >>> > > https://help.github.com/en/articles/redirects-on-github-pages to
> > >>> > redirect
> > >>> > > to druid.apache.org as a way to transition before the domain
> > >>> migration
> > >>> > > occurs, and believes that it would have the same SEO effects as a
> > 301
> > >>> > > redirect after the new pages are indexed.
> > >>> > >
> > >>> > > I think we could try migrating the current Community page to
> > >>> > > druid.apache.org with Github redirects and canonical links
> > pointing
> > >>> to
> > >>> > the
> > >>> > > https://druid.apache.org version. If that goes well, we could
> > >>> continue
> > >>> > > migrating more pages.
> > >>> > >
> > >>> > > What are the community's thoughts on that?
> > >>> > >
> > >>> > > Thanks,
> > >>> > > Jon
> > >>> > >
> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org>
> > >>> wrote:
> > >>> > >
> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their
> classic
> > >>> and
> > >>> > > Apache
> > >>> > > > sites together: https://www.openoffice.org/,
> > >>> http://groovy-lang.org/.
> > >>> > > Note
> > >>> > > > how when you click around, you get shuttled between the classic
> > >>> domain
> > >>> > > and
> > >>> > > > the Apache domain. Some pages are available on both sites, like
> > >>> > > > http://groovy-lang.org/download.html and
> > >>> > > > https://groovy.apache.org/download.html (which don't use
> > canonical
> > >>> > link
> > >>> > > > tags -- does not seem like a good example to follow!).
> > >>> > > >
> > >>> > > > NetBeans (still incubating) also has a "melded" site at
> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself done
> > >>> yet.
> > >>> > They
> > >>> > > > are discussing plans on their lists & wiki to do redirects from
> > >>> > > > netbeans.org
> > >>> > > > to netbeans.apache.org:
> > >>> > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > >>> > > > ,
> > >>> > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > >>> > > > .
> > >>> > > > As of today the domain has been donated to ASF, but the server
> is
> > >>> still
> > >>> > > run
> > >>> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS
> > for
> > >>> > > > netbeans.org shows ASF as the registrant; netbeans.org
> resolves
> > to
> > >>> > > > lb-netbeans-cms-adc.oracle.com.)
> > >>> > > >
> > >>> > > > The melded sites don't really seem better to me than
> redirecting
> > >>> all
> > >>> > urls
> > >>> > > > on the domain. I guess it depends on if we want to keep
> druid.io
> > >>> as
> > >>> > the
> > >>> > > > official domain forever, or if we think druid.apache.org is
> > >>> cooler. I
> > >>> > > > definitely think druid.apache.org is cooler so my vote is
> there
> > >>> :).
> > >>> > It's
> > >>> > > > also nice that it supports https. (druid.io does not today,
> > since
> > >>> it's
> > >>> > > on
> > >>> > > > GitHub pages, which doesn't support https for custom domains.)
> > >>> > > >
> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > >>> > > > <ch...@snap.com.invalid> wrote:
> > >>> > > >
> > >>> > > > > Are there other projects who have transitioned an
> independently
> > >>> > > > successful
> > >>> > > > > domain name to an apache one?
> > >>> > > > >
> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <
> davidlim@apache.org>
> > >>> > wrote:
> > >>> > > > >
> > >>> > > > > > Who has control over the druid.io domain? Charles would
> that
> > >>> be
> > >>> > you?
> > >>> > > > > >
> > >>> > > > > > We'd need support from them for the DNS redirect.
> > >>> > > > > >
> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
> > jonwei@apache.org
> > >>> >
> > >>> > > wrote:
> > >>> > > > > >
> > >>> > > > > > > We still need to complete the website migration to Apache
> > >>> > > > > infrastructure.
> > >>> > > > > > >
> > >>> > > > > > > I'll propose the following plan:
> > >>> > > > > > >
> > >>> > > > > > > Proposed Apache Druid website migration plan
> > >>> > > > > > > ========================================
> > >>> > > > > > >
> > >>> > > > > > > These links have some previous discussion on the website
> > >>> > migration:
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > >>> > > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > >>> > > > > > >
> > >>> > > > > > > From the discussions above, the recommendation is to
> have 2
> > >>> > > separate
> > >>> > > > > > repos
> > >>> > > > > > > for the website: one for source and another for built
> > content
> > >>> > that
> > >>> > > > will
> > >>> > > > > > be
> > >>> > > > > > > served.
> > >>> > > > > > >
> > >>> > > > > > > Generating site files
> > >>> > > > > > > =======================
> > >>> > > > > > >
> > >>> > > > > > > The Apache site update process will be similar to our
> > current
> > >>> > > > process.
> > >>> > > > > > >
> > >>> > > > > > > Current process:
> > >>> > > > > > > 1. Push changes to
> > >>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
> > >>> > > > > > >
> https://github.com/druid-io/druid-io.github.io/tree/master
> > >>> > > > > > > 3.
> > >>> https://github.com/druid-io/druid-io.github.io/tree/master is
> > >>> > > > > served
> > >>> > > > > > by
> > >>> > > > > > > github pages
> > >>> > > > > > >
> > >>> > > > > > > Apache process:
> > >>> > > > > > > 1. Push changes to
> > >>> > > > > https://github.com/apache/incubator-druid-website-src
> > >>> > > > > > > 2. Jenkins bot from Apache will build the website from
> > source
> > >>> > repo,
> > >>> > > > > > commit
> > >>> > > > > > > to https://github.com/apache/incubator-druid-website
> > >>> > > > > > > 3. Apache Druid website will be served from the content
> in
> > >>> > > > > > > https://github.com/apache/incubator-druid-website
> > (asf-site
> > >>> > > branch)
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > > > Hosting and SEO
> > >>> > > > > > > ================
> > >>> > > > > > >
> > >>> > > > > > > The Apache site will be hosted at druid.apache.org on
> > Apache
> > >>> > > > > > > infrastructure:
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > >>> > > > > > >
> > >>> > > > > > > To preserve our search rankings, we can setup 301
> redirects
> > >>> from
> > >>> > > the
> > >>> > > > > old
> > >>> > > > > > > druid.io site to the corresponding pages on the
> > >>> druid.apache.org
> > >>> > > > > site. (
> > >>> > > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > >>> > > > > > )
> > >>> > > > > > >
> > >>> > > > > > > However, Github pages (which currently hosts the
> druid.io
> > >>> site)
> > >>> > > does
> > >>> > > > > not
> > >>> > > > > > > support 301 redirects, so we propose the following:
> > >>> > > > > > > - Setup a new Nginx server that will perform 301
> redirects
> > to
> > >>> > > > > > > druid.apache.org for the druid.io. Imply can host this
> if
> > >>> > needed.
> > >>> > > > > > > - Update the druid.io DNS entry to point to this new
> Nginx
> > >>> > server
> > >>> > > > > > > - Shut down Github pages hosting for druid.io
> > >>> > > > > > >
> > >>> > > > > > > In addition, we can also set canonical tags on our pages:
> > >>> > > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > > > Action items
> > >>> > > > > > > ===============
> > >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
> > content
> > >>> from
> > >>> > > > > source
> > >>> > > > > > > - Get the Apache website up
> > >>> > > > > > > - Setup Nginx redirect server for druid.io
> > >>> > > > > > > - Shutdown github pages and redirect DNS for druid.io to
> > >>> Nginx
> > >>> > > > > redirect
> > >>> > > > > > > server
> > >>> > > > > > > - Add canonical tags to pages
> > >>> > > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> >
>

Re: Proposed website migration plan

Posted by Jonathan Wei <jo...@apache.org>.
Thanks for checking the SEO state, that's somewhat disappointing.

For Bing, it sounds like they really want you to use 301s (
https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):

> Bing prefers you use a 301 permanent redirect when moving content, should
the move be permanent.  If the move is temporary, then a 302 temporary
redirect will work fine.  Do not use the rel=canonical tag in place of a
proper redirect.

I wasn't able to find similar guidance re: this issue for DuckDuckGo.

On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <gi...@apache.org> wrote:

> Another update: SEO is not looking great after another day passed. For a
> search for "druid community", both http://druid.io/community and
> https://druid.apache.org/community/ have dropped off the front page of
> Bing
> completely. On Google, the legacy version is gone (as expected) but the
> Apache version has dropped to the #3 spot (down from #2 yesterday; and down
> from where the legacy page was pre-migration, which was #1).
>
> I think this means we do need to try to get 301s figured out.
>
> On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:
>
> > Google has picked up the new URL as of today but Bing hasn't. Neither has
> > DuckDuckGo for that matter.
> >
> > Currently, Google is showing https://druid.apache.org/community/ in the
> > #2 spot and Bing/DDG are showing http://druid.io/community in the top
> > spot. Ominously, the latter two _have_ picked up a page title change to
> > "Redirecting..."
> >
> > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org> wrote:
> >
> >> An update: this is done now since a couple of days ago, but Google and
> >> Bing are still showing http://druid.io/community for a search for
> "druid
> >> community" or even "apache druid community":
> >>
> >> - https://www.google.com/search?q=druid+community
> >> - https://www.bing.com/search?q=druid+community
> >>
> >> I suggest we keep an eye on the search engines and make sure they can
> >> figure out that the site has changed (I'm not sure how often they
> crawl).
> >> If they can then it would make sense to me to move forward with
> migrating
> >> the entire web site.
> >>
> >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org> wrote:
> >>
> >>> Correction: Xavier was suggesting we use
> >>>
> >>>
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> >>> ,
> >>> the existing redirect system used by the Druid website.
> >>>
> >>> I've opened PRs to do the community page migration test:
> >>> https://github.com/apache/incubator-druid-website/pull/3
> >>> https://github.com/druid-io/druid-io.github.io/pull/591
> >>>
> >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org> wrote:
> >>>
> >>> > That sounds good to me. I would also consider adding canonical tags
> to
> >>> all
> >>> > druid.apache.org pages so we don't have druid.incubator.apache.org
> and
> >>> > druid.apache.org both floating around (not to mention http/https
> >>> version
> >>> > of
> >>> > both).
> >>> >
> >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org>
> >>> wrote:
> >>> >
> >>> > > For redirects, Xavier has suggested using
> >>> > > https://help.github.com/en/articles/redirects-on-github-pages to
> >>> > redirect
> >>> > > to druid.apache.org as a way to transition before the domain
> >>> migration
> >>> > > occurs, and believes that it would have the same SEO effects as a
> 301
> >>> > > redirect after the new pages are indexed.
> >>> > >
> >>> > > I think we could try migrating the current Community page to
> >>> > > druid.apache.org with Github redirects and canonical links
> pointing
> >>> to
> >>> > the
> >>> > > https://druid.apache.org version. If that goes well, we could
> >>> continue
> >>> > > migrating more pages.
> >>> > >
> >>> > > What are the community's thoughts on that?
> >>> > >
> >>> > > Thanks,
> >>> > > Jon
> >>> > >
> >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org>
> >>> wrote:
> >>> > >
> >>> > > > OpenOffice and Groovy both chose to sort of "meld" their classic
> >>> and
> >>> > > Apache
> >>> > > > sites together: https://www.openoffice.org/,
> >>> http://groovy-lang.org/.
> >>> > > Note
> >>> > > > how when you click around, you get shuttled between the classic
> >>> domain
> >>> > > and
> >>> > > > the Apache domain. Some pages are available on both sites, like
> >>> > > > http://groovy-lang.org/download.html and
> >>> > > > https://groovy.apache.org/download.html (which don't use
> canonical
> >>> > link
> >>> > > > tags -- does not seem like a good example to follow!).
> >>> > > >
> >>> > > > NetBeans (still incubating) also has a "melded" site at
> >>> > > > https://netbeans.org/ but doesn't seem to consider itself done
> >>> yet.
> >>> > They
> >>> > > > are discussing plans on their lists & wiki to do redirects from
> >>> > > > netbeans.org
> >>> > > > to netbeans.apache.org:
> >>> > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> >>> > > > ,
> >>> > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> >>> > > > .
> >>> > > > As of today the domain has been donated to ASF, but the server is
> >>> still
> >>> > > run
> >>> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS
> for
> >>> > > > netbeans.org shows ASF as the registrant; netbeans.org resolves
> to
> >>> > > > lb-netbeans-cms-adc.oracle.com.)
> >>> > > >
> >>> > > > The melded sites don't really seem better to me than redirecting
> >>> all
> >>> > urls
> >>> > > > on the domain. I guess it depends on if we want to keep druid.io
> >>> as
> >>> > the
> >>> > > > official domain forever, or if we think druid.apache.org is
> >>> cooler. I
> >>> > > > definitely think druid.apache.org is cooler so my vote is there
> >>> :).
> >>> > It's
> >>> > > > also nice that it supports https. (druid.io does not today,
> since
> >>> it's
> >>> > > on
> >>> > > > GitHub pages, which doesn't support https for custom domains.)
> >>> > > >
> >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> >>> > > > <ch...@snap.com.invalid> wrote:
> >>> > > >
> >>> > > > > Are there other projects who have transitioned an independently
> >>> > > > successful
> >>> > > > > domain name to an apache one?
> >>> > > > >
> >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org>
> >>> > wrote:
> >>> > > > >
> >>> > > > > > Who has control over the druid.io domain? Charles would that
> >>> be
> >>> > you?
> >>> > > > > >
> >>> > > > > > We'd need support from them for the DNS redirect.
> >>> > > > > >
> >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <
> jonwei@apache.org
> >>> >
> >>> > > wrote:
> >>> > > > > >
> >>> > > > > > > We still need to complete the website migration to Apache
> >>> > > > > infrastructure.
> >>> > > > > > >
> >>> > > > > > > I'll propose the following plan:
> >>> > > > > > >
> >>> > > > > > > Proposed Apache Druid website migration plan
> >>> > > > > > > ========================================
> >>> > > > > > >
> >>> > > > > > > These links have some previous discussion on the website
> >>> > migration:
> >>> > > > > > >
> >>> > > > > > >
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> >>> > > > > > >
> >>> > > > > > > From the discussions above, the recommendation is to have 2
> >>> > > separate
> >>> > > > > > repos
> >>> > > > > > > for the website: one for source and another for built
> content
> >>> > that
> >>> > > > will
> >>> > > > > > be
> >>> > > > > > > served.
> >>> > > > > > >
> >>> > > > > > > Generating site files
> >>> > > > > > > =======================
> >>> > > > > > >
> >>> > > > > > > The Apache site update process will be similar to our
> current
> >>> > > > process.
> >>> > > > > > >
> >>> > > > > > > Current process:
> >>> > > > > > > 1. Push changes to
> >>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
> >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
> >>> > > > > > > https://github.com/druid-io/druid-io.github.io/tree/master
> >>> > > > > > > 3.
> >>> https://github.com/druid-io/druid-io.github.io/tree/master is
> >>> > > > > served
> >>> > > > > > by
> >>> > > > > > > github pages
> >>> > > > > > >
> >>> > > > > > > Apache process:
> >>> > > > > > > 1. Push changes to
> >>> > > > > https://github.com/apache/incubator-druid-website-src
> >>> > > > > > > 2. Jenkins bot from Apache will build the website from
> source
> >>> > repo,
> >>> > > > > > commit
> >>> > > > > > > to https://github.com/apache/incubator-druid-website
> >>> > > > > > > 3. Apache Druid website will be served from the content in
> >>> > > > > > > https://github.com/apache/incubator-druid-website
> (asf-site
> >>> > > branch)
> >>> > > > > > >
> >>> > > > > > >
> >>> > > > > > > Hosting and SEO
> >>> > > > > > > ================
> >>> > > > > > >
> >>> > > > > > > The Apache site will be hosted at druid.apache.org on
> Apache
> >>> > > > > > > infrastructure:
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> >>> > > > > > >
> >>> > > > > > > To preserve our search rankings, we can setup 301 redirects
> >>> from
> >>> > > the
> >>> > > > > old
> >>> > > > > > > druid.io site to the corresponding pages on the
> >>> druid.apache.org
> >>> > > > > site. (
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> >>> > > > > > )
> >>> > > > > > >
> >>> > > > > > > However, Github pages (which currently hosts the druid.io
> >>> site)
> >>> > > does
> >>> > > > > not
> >>> > > > > > > support 301 redirects, so we propose the following:
> >>> > > > > > > - Setup a new Nginx server that will perform 301 redirects
> to
> >>> > > > > > > druid.apache.org for the druid.io. Imply can host this if
> >>> > needed.
> >>> > > > > > > - Update the druid.io DNS entry to point to this new Nginx
> >>> > server
> >>> > > > > > > - Shut down Github pages hosting for druid.io
> >>> > > > > > >
> >>> > > > > > > In addition, we can also set canonical tags on our pages:
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> >>> > > > > > >
> >>> > > > > > >
> >>> > > > > > > Action items
> >>> > > > > > > ===============
> >>> > > > > > > - Setup a Jenkins bot that builds the Apache website
> content
> >>> from
> >>> > > > > source
> >>> > > > > > > - Get the Apache website up
> >>> > > > > > > - Setup Nginx redirect server for druid.io
> >>> > > > > > > - Shutdown github pages and redirect DNS for druid.io to
> >>> Nginx
> >>> > > > > redirect
> >>> > > > > > > server
> >>> > > > > > > - Add canonical tags to pages
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Another update: SEO is not looking great after another day passed. For a
search for "druid community", both http://druid.io/community and
https://druid.apache.org/community/ have dropped off the front page of Bing
completely. On Google, the legacy version is gone (as expected) but the
Apache version has dropped to the #3 spot (down from #2 yesterday; and down
from where the legacy page was pre-migration, which was #1).

I think this means we do need to try to get 301s figured out.

On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <gi...@apache.org> wrote:

> Google has picked up the new URL as of today but Bing hasn't. Neither has
> DuckDuckGo for that matter.
>
> Currently, Google is showing https://druid.apache.org/community/ in the
> #2 spot and Bing/DDG are showing http://druid.io/community in the top
> spot. Ominously, the latter two _have_ picked up a page title change to
> "Redirecting..."
>
> On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org> wrote:
>
>> An update: this is done now since a couple of days ago, but Google and
>> Bing are still showing http://druid.io/community for a search for "druid
>> community" or even "apache druid community":
>>
>> - https://www.google.com/search?q=druid+community
>> - https://www.bing.com/search?q=druid+community
>>
>> I suggest we keep an eye on the search engines and make sure they can
>> figure out that the site has changed (I'm not sure how often they crawl).
>> If they can then it would make sense to me to move forward with migrating
>> the entire web site.
>>
>> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org> wrote:
>>
>>> Correction: Xavier was suggesting we use
>>>
>>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>>> ,
>>> the existing redirect system used by the Druid website.
>>>
>>> I've opened PRs to do the community page migration test:
>>> https://github.com/apache/incubator-druid-website/pull/3
>>> https://github.com/druid-io/druid-io.github.io/pull/591
>>>
>>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org> wrote:
>>>
>>> > That sounds good to me. I would also consider adding canonical tags to
>>> all
>>> > druid.apache.org pages so we don't have druid.incubator.apache.org and
>>> > druid.apache.org both floating around (not to mention http/https
>>> version
>>> > of
>>> > both).
>>> >
>>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org>
>>> wrote:
>>> >
>>> > > For redirects, Xavier has suggested using
>>> > > https://help.github.com/en/articles/redirects-on-github-pages to
>>> > redirect
>>> > > to druid.apache.org as a way to transition before the domain
>>> migration
>>> > > occurs, and believes that it would have the same SEO effects as a 301
>>> > > redirect after the new pages are indexed.
>>> > >
>>> > > I think we could try migrating the current Community page to
>>> > > druid.apache.org with Github redirects and canonical links pointing
>>> to
>>> > the
>>> > > https://druid.apache.org version. If that goes well, we could
>>> continue
>>> > > migrating more pages.
>>> > >
>>> > > What are the community's thoughts on that?
>>> > >
>>> > > Thanks,
>>> > > Jon
>>> > >
>>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org>
>>> wrote:
>>> > >
>>> > > > OpenOffice and Groovy both chose to sort of "meld" their classic
>>> and
>>> > > Apache
>>> > > > sites together: https://www.openoffice.org/,
>>> http://groovy-lang.org/.
>>> > > Note
>>> > > > how when you click around, you get shuttled between the classic
>>> domain
>>> > > and
>>> > > > the Apache domain. Some pages are available on both sites, like
>>> > > > http://groovy-lang.org/download.html and
>>> > > > https://groovy.apache.org/download.html (which don't use canonical
>>> > link
>>> > > > tags -- does not seem like a good example to follow!).
>>> > > >
>>> > > > NetBeans (still incubating) also has a "melded" site at
>>> > > > https://netbeans.org/ but doesn't seem to consider itself done
>>> yet.
>>> > They
>>> > > > are discussing plans on their lists & wiki to do redirects from
>>> > > > netbeans.org
>>> > > > to netbeans.apache.org:
>>> > > >
>>> > > >
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>>> > > > ,
>>> > > >
>>> > > >
>>> > >
>>> >
>>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>>> > > > .
>>> > > > As of today the domain has been donated to ASF, but the server is
>>> still
>>> > > run
>>> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
>>> > > > netbeans.org shows ASF as the registrant; netbeans.org resolves to
>>> > > > lb-netbeans-cms-adc.oracle.com.)
>>> > > >
>>> > > > The melded sites don't really seem better to me than redirecting
>>> all
>>> > urls
>>> > > > on the domain. I guess it depends on if we want to keep druid.io
>>> as
>>> > the
>>> > > > official domain forever, or if we think druid.apache.org is
>>> cooler. I
>>> > > > definitely think druid.apache.org is cooler so my vote is there
>>> :).
>>> > It's
>>> > > > also nice that it supports https. (druid.io does not today, since
>>> it's
>>> > > on
>>> > > > GitHub pages, which doesn't support https for custom domains.)
>>> > > >
>>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>>> > > > <ch...@snap.com.invalid> wrote:
>>> > > >
>>> > > > > Are there other projects who have transitioned an independently
>>> > > > successful
>>> > > > > domain name to an apache one?
>>> > > > >
>>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org>
>>> > wrote:
>>> > > > >
>>> > > > > > Who has control over the druid.io domain? Charles would that
>>> be
>>> > you?
>>> > > > > >
>>> > > > > > We'd need support from them for the DNS redirect.
>>> > > > > >
>>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jonwei@apache.org
>>> >
>>> > > wrote:
>>> > > > > >
>>> > > > > > > We still need to complete the website migration to Apache
>>> > > > > infrastructure.
>>> > > > > > >
>>> > > > > > > I'll propose the following plan:
>>> > > > > > >
>>> > > > > > > Proposed Apache Druid website migration plan
>>> > > > > > > ========================================
>>> > > > > > >
>>> > > > > > > These links have some previous discussion on the website
>>> > migration:
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>>> > > > > > >
>>> > > > > > > From the discussions above, the recommendation is to have 2
>>> > > separate
>>> > > > > > repos
>>> > > > > > > for the website: one for source and another for built content
>>> > that
>>> > > > will
>>> > > > > > be
>>> > > > > > > served.
>>> > > > > > >
>>> > > > > > > Generating site files
>>> > > > > > > =======================
>>> > > > > > >
>>> > > > > > > The Apache site update process will be similar to our current
>>> > > > process.
>>> > > > > > >
>>> > > > > > > Current process:
>>> > > > > > > 1. Push changes to
>>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
>>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
>>> > > > > > > https://github.com/druid-io/druid-io.github.io/tree/master
>>> > > > > > > 3.
>>> https://github.com/druid-io/druid-io.github.io/tree/master is
>>> > > > > served
>>> > > > > > by
>>> > > > > > > github pages
>>> > > > > > >
>>> > > > > > > Apache process:
>>> > > > > > > 1. Push changes to
>>> > > > > https://github.com/apache/incubator-druid-website-src
>>> > > > > > > 2. Jenkins bot from Apache will build the website from source
>>> > repo,
>>> > > > > > commit
>>> > > > > > > to https://github.com/apache/incubator-druid-website
>>> > > > > > > 3. Apache Druid website will be served from the content in
>>> > > > > > > https://github.com/apache/incubator-druid-website (asf-site
>>> > > branch)
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > Hosting and SEO
>>> > > > > > > ================
>>> > > > > > >
>>> > > > > > > The Apache site will be hosted at druid.apache.org on Apache
>>> > > > > > > infrastructure:
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>>> > > > > > >
>>> > > > > > > To preserve our search rankings, we can setup 301 redirects
>>> from
>>> > > the
>>> > > > > old
>>> > > > > > > druid.io site to the corresponding pages on the
>>> druid.apache.org
>>> > > > > site. (
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>>> > > > > > )
>>> > > > > > >
>>> > > > > > > However, Github pages (which currently hosts the druid.io
>>> site)
>>> > > does
>>> > > > > not
>>> > > > > > > support 301 redirects, so we propose the following:
>>> > > > > > > - Setup a new Nginx server that will perform 301 redirects to
>>> > > > > > > druid.apache.org for the druid.io. Imply can host this if
>>> > needed.
>>> > > > > > > - Update the druid.io DNS entry to point to this new Nginx
>>> > server
>>> > > > > > > - Shut down Github pages hosting for druid.io
>>> > > > > > >
>>> > > > > > > In addition, we can also set canonical tags on our pages:
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > Action items
>>> > > > > > > ===============
>>> > > > > > > - Setup a Jenkins bot that builds the Apache website content
>>> from
>>> > > > > source
>>> > > > > > > - Get the Apache website up
>>> > > > > > > - Setup Nginx redirect server for druid.io
>>> > > > > > > - Shutdown github pages and redirect DNS for druid.io to
>>> Nginx
>>> > > > > redirect
>>> > > > > > > server
>>> > > > > > > - Add canonical tags to pages
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
Google has picked up the new URL as of today but Bing hasn't. Neither has
DuckDuckGo for that matter.

Currently, Google is showing https://druid.apache.org/community/ in the #2
spot and Bing/DDG are showing http://druid.io/community in the top spot.
Ominously, the latter two _have_ picked up a page title change to
"Redirecting..."

On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <gi...@apache.org> wrote:

> An update: this is done now since a couple of days ago, but Google and
> Bing are still showing http://druid.io/community for a search for "druid
> community" or even "apache druid community":
>
> - https://www.google.com/search?q=druid+community
> - https://www.bing.com/search?q=druid+community
>
> I suggest we keep an eye on the search engines and make sure they can
> figure out that the site has changed (I'm not sure how often they crawl).
> If they can then it would make sense to me to move forward with migrating
> the entire web site.
>
> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org> wrote:
>
>> Correction: Xavier was suggesting we use
>>
>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
>> ,
>> the existing redirect system used by the Druid website.
>>
>> I've opened PRs to do the community page migration test:
>> https://github.com/apache/incubator-druid-website/pull/3
>> https://github.com/druid-io/druid-io.github.io/pull/591
>>
>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org> wrote:
>>
>> > That sounds good to me. I would also consider adding canonical tags to
>> all
>> > druid.apache.org pages so we don't have druid.incubator.apache.org and
>> > druid.apache.org both floating around (not to mention http/https
>> version
>> > of
>> > both).
>> >
>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org> wrote:
>> >
>> > > For redirects, Xavier has suggested using
>> > > https://help.github.com/en/articles/redirects-on-github-pages to
>> > redirect
>> > > to druid.apache.org as a way to transition before the domain
>> migration
>> > > occurs, and believes that it would have the same SEO effects as a 301
>> > > redirect after the new pages are indexed.
>> > >
>> > > I think we could try migrating the current Community page to
>> > > druid.apache.org with Github redirects and canonical links pointing
>> to
>> > the
>> > > https://druid.apache.org version. If that goes well, we could
>> continue
>> > > migrating more pages.
>> > >
>> > > What are the community's thoughts on that?
>> > >
>> > > Thanks,
>> > > Jon
>> > >
>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org> wrote:
>> > >
>> > > > OpenOffice and Groovy both chose to sort of "meld" their classic and
>> > > Apache
>> > > > sites together: https://www.openoffice.org/,
>> http://groovy-lang.org/.
>> > > Note
>> > > > how when you click around, you get shuttled between the classic
>> domain
>> > > and
>> > > > the Apache domain. Some pages are available on both sites, like
>> > > > http://groovy-lang.org/download.html and
>> > > > https://groovy.apache.org/download.html (which don't use canonical
>> > link
>> > > > tags -- does not seem like a good example to follow!).
>> > > >
>> > > > NetBeans (still incubating) also has a "melded" site at
>> > > > https://netbeans.org/ but doesn't seem to consider itself done yet.
>> > They
>> > > > are discussing plans on their lists & wiki to do redirects from
>> > > > netbeans.org
>> > > > to netbeans.apache.org:
>> > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
>> > > > ,
>> > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
>> > > > .
>> > > > As of today the domain has been donated to ASF, but the server is
>> still
>> > > run
>> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
>> > > > netbeans.org shows ASF as the registrant; netbeans.org resolves to
>> > > > lb-netbeans-cms-adc.oracle.com.)
>> > > >
>> > > > The melded sites don't really seem better to me than redirecting all
>> > urls
>> > > > on the domain. I guess it depends on if we want to keep druid.io as
>> > the
>> > > > official domain forever, or if we think druid.apache.org is
>> cooler. I
>> > > > definitely think druid.apache.org is cooler so my vote is there :).
>> > It's
>> > > > also nice that it supports https. (druid.io does not today, since
>> it's
>> > > on
>> > > > GitHub pages, which doesn't support https for custom domains.)
>> > > >
>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
>> > > > <ch...@snap.com.invalid> wrote:
>> > > >
>> > > > > Are there other projects who have transitioned an independently
>> > > > successful
>> > > > > domain name to an apache one?
>> > > > >
>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org>
>> > wrote:
>> > > > >
>> > > > > > Who has control over the druid.io domain? Charles would that be
>> > you?
>> > > > > >
>> > > > > > We'd need support from them for the DNS redirect.
>> > > > > >
>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org>
>> > > wrote:
>> > > > > >
>> > > > > > > We still need to complete the website migration to Apache
>> > > > > infrastructure.
>> > > > > > >
>> > > > > > > I'll propose the following plan:
>> > > > > > >
>> > > > > > > Proposed Apache Druid website migration plan
>> > > > > > > ========================================
>> > > > > > >
>> > > > > > > These links have some previous discussion on the website
>> > migration:
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
>> > > > > > >
>> > > > > > > From the discussions above, the recommendation is to have 2
>> > > separate
>> > > > > > repos
>> > > > > > > for the website: one for source and another for built content
>> > that
>> > > > will
>> > > > > > be
>> > > > > > > served.
>> > > > > > >
>> > > > > > > Generating site files
>> > > > > > > =======================
>> > > > > > >
>> > > > > > > The Apache site update process will be similar to our current
>> > > > process.
>> > > > > > >
>> > > > > > > Current process:
>> > > > > > > 1. Push changes to
>> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
>> > > > > > > https://github.com/druid-io/druid-io.github.io/tree/master
>> > > > > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master
>> is
>> > > > > served
>> > > > > > by
>> > > > > > > github pages
>> > > > > > >
>> > > > > > > Apache process:
>> > > > > > > 1. Push changes to
>> > > > > https://github.com/apache/incubator-druid-website-src
>> > > > > > > 2. Jenkins bot from Apache will build the website from source
>> > repo,
>> > > > > > commit
>> > > > > > > to https://github.com/apache/incubator-druid-website
>> > > > > > > 3. Apache Druid website will be served from the content in
>> > > > > > > https://github.com/apache/incubator-druid-website (asf-site
>> > > branch)
>> > > > > > >
>> > > > > > >
>> > > > > > > Hosting and SEO
>> > > > > > > ================
>> > > > > > >
>> > > > > > > The Apache site will be hosted at druid.apache.org on Apache
>> > > > > > > infrastructure:
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
>> > > > > > >
>> > > > > > > To preserve our search rankings, we can setup 301 redirects
>> from
>> > > the
>> > > > > old
>> > > > > > > druid.io site to the corresponding pages on the
>> druid.apache.org
>> > > > > site. (
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
>> > > > > > )
>> > > > > > >
>> > > > > > > However, Github pages (which currently hosts the druid.io
>> site)
>> > > does
>> > > > > not
>> > > > > > > support 301 redirects, so we propose the following:
>> > > > > > > - Setup a new Nginx server that will perform 301 redirects to
>> > > > > > > druid.apache.org for the druid.io. Imply can host this if
>> > needed.
>> > > > > > > - Update the druid.io DNS entry to point to this new Nginx
>> > server
>> > > > > > > - Shut down Github pages hosting for druid.io
>> > > > > > >
>> > > > > > > In addition, we can also set canonical tags on our pages:
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
>> > > > > > >
>> > > > > > >
>> > > > > > > Action items
>> > > > > > > ===============
>> > > > > > > - Setup a Jenkins bot that builds the Apache website content
>> from
>> > > > > source
>> > > > > > > - Get the Apache website up
>> > > > > > > - Setup Nginx redirect server for druid.io
>> > > > > > > - Shutdown github pages and redirect DNS for druid.io to
>> Nginx
>> > > > > redirect
>> > > > > > > server
>> > > > > > > - Add canonical tags to pages
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
An update: this is done now since a couple of days ago, but Google and Bing
are still showing http://druid.io/community for a search for "druid
community" or even "apache druid community":

- https://www.google.com/search?q=druid+community
- https://www.bing.com/search?q=druid+community

I suggest we keep an eye on the search engines and make sure they can
figure out that the site has changed (I'm not sure how often they crawl).
If they can then it would make sense to me to move forward with migrating
the entire web site.

On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jo...@apache.org> wrote:

> Correction: Xavier was suggesting we use
>
> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> ,
> the existing redirect system used by the Druid website.
>
> I've opened PRs to do the community page migration test:
> https://github.com/apache/incubator-druid-website/pull/3
> https://github.com/druid-io/druid-io.github.io/pull/591
>
> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org> wrote:
>
> > That sounds good to me. I would also consider adding canonical tags to
> all
> > druid.apache.org pages so we don't have druid.incubator.apache.org and
> > druid.apache.org both floating around (not to mention http/https version
> > of
> > both).
> >
> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org> wrote:
> >
> > > For redirects, Xavier has suggested using
> > > https://help.github.com/en/articles/redirects-on-github-pages to
> > redirect
> > > to druid.apache.org as a way to transition before the domain migration
> > > occurs, and believes that it would have the same SEO effects as a 301
> > > redirect after the new pages are indexed.
> > >
> > > I think we could try migrating the current Community page to
> > > druid.apache.org with Github redirects and canonical links pointing to
> > the
> > > https://druid.apache.org version. If that goes well, we could continue
> > > migrating more pages.
> > >
> > > What are the community's thoughts on that?
> > >
> > > Thanks,
> > > Jon
> > >
> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org> wrote:
> > >
> > > > OpenOffice and Groovy both chose to sort of "meld" their classic and
> > > Apache
> > > > sites together: https://www.openoffice.org/, http://groovy-lang.org/
> .
> > > Note
> > > > how when you click around, you get shuttled between the classic
> domain
> > > and
> > > > the Apache domain. Some pages are available on both sites, like
> > > > http://groovy-lang.org/download.html and
> > > > https://groovy.apache.org/download.html (which don't use canonical
> > link
> > > > tags -- does not seem like a good example to follow!).
> > > >
> > > > NetBeans (still incubating) also has a "melded" site at
> > > > https://netbeans.org/ but doesn't seem to consider itself done yet.
> > They
> > > > are discussing plans on their lists & wiki to do redirects from
> > > > netbeans.org
> > > > to netbeans.apache.org:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > > > ,
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > > > .
> > > > As of today the domain has been donated to ASF, but the server is
> still
> > > run
> > > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
> > > > netbeans.org shows ASF as the registrant; netbeans.org resolves to
> > > > lb-netbeans-cms-adc.oracle.com.)
> > > >
> > > > The melded sites don't really seem better to me than redirecting all
> > urls
> > > > on the domain. I guess it depends on if we want to keep druid.io as
> > the
> > > > official domain forever, or if we think druid.apache.org is cooler.
> I
> > > > definitely think druid.apache.org is cooler so my vote is there :).
> > It's
> > > > also nice that it supports https. (druid.io does not today, since
> it's
> > > on
> > > > GitHub pages, which doesn't support https for custom domains.)
> > > >
> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > > > <ch...@snap.com.invalid> wrote:
> > > >
> > > > > Are there other projects who have transitioned an independently
> > > > successful
> > > > > domain name to an apache one?
> > > > >
> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org>
> > wrote:
> > > > >
> > > > > > Who has control over the druid.io domain? Charles would that be
> > you?
> > > > > >
> > > > > > We'd need support from them for the DNS redirect.
> > > > > >
> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > We still need to complete the website migration to Apache
> > > > > infrastructure.
> > > > > > >
> > > > > > > I'll propose the following plan:
> > > > > > >
> > > > > > > Proposed Apache Druid website migration plan
> > > > > > > ========================================
> > > > > > >
> > > > > > > These links have some previous discussion on the website
> > migration:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > > > > > >
> > > > > > > From the discussions above, the recommendation is to have 2
> > > separate
> > > > > > repos
> > > > > > > for the website: one for source and another for built content
> > that
> > > > will
> > > > > > be
> > > > > > > served.
> > > > > > >
> > > > > > > Generating site files
> > > > > > > =======================
> > > > > > >
> > > > > > > The Apache site update process will be similar to our current
> > > > process.
> > > > > > >
> > > > > > > Current process:
> > > > > > > 1. Push changes to
> > > > > > https://github.com/druid-io/druid-io.github.io/tree/src
> > > > > > > 2. metamx bot picks up changes, builds, and commits to
> > > > > > > https://github.com/druid-io/druid-io.github.io/tree/master
> > > > > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master
> is
> > > > > served
> > > > > > by
> > > > > > > github pages
> > > > > > >
> > > > > > > Apache process:
> > > > > > > 1. Push changes to
> > > > > https://github.com/apache/incubator-druid-website-src
> > > > > > > 2. Jenkins bot from Apache will build the website from source
> > repo,
> > > > > > commit
> > > > > > > to https://github.com/apache/incubator-druid-website
> > > > > > > 3. Apache Druid website will be served from the content in
> > > > > > > https://github.com/apache/incubator-druid-website (asf-site
> > > branch)
> > > > > > >
> > > > > > >
> > > > > > > Hosting and SEO
> > > > > > > ================
> > > > > > >
> > > > > > > The Apache site will be hosted at druid.apache.org on Apache
> > > > > > > infrastructure:
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > > > > > >
> > > > > > > To preserve our search rankings, we can setup 301 redirects
> from
> > > the
> > > > > old
> > > > > > > druid.io site to the corresponding pages on the
> druid.apache.org
> > > > > site. (
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > > > > > )
> > > > > > >
> > > > > > > However, Github pages (which currently hosts the druid.io
> site)
> > > does
> > > > > not
> > > > > > > support 301 redirects, so we propose the following:
> > > > > > > - Setup a new Nginx server that will perform 301 redirects to
> > > > > > > druid.apache.org for the druid.io. Imply can host this if
> > needed.
> > > > > > > - Update the druid.io DNS entry to point to this new Nginx
> > server
> > > > > > > - Shut down Github pages hosting for druid.io
> > > > > > >
> > > > > > > In addition, we can also set canonical tags on our pages:
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > > > > > >
> > > > > > >
> > > > > > > Action items
> > > > > > > ===============
> > > > > > > - Setup a Jenkins bot that builds the Apache website content
> from
> > > > > source
> > > > > > > - Get the Apache website up
> > > > > > > - Setup Nginx redirect server for druid.io
> > > > > > > - Shutdown github pages and redirect DNS for druid.io to Nginx
> > > > > redirect
> > > > > > > server
> > > > > > > - Add canonical tags to pages
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Proposed website migration plan

Posted by Jonathan Wei <jo...@apache.org>.
Correction: Xavier was suggesting we use
https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html,
the existing redirect system used by the Druid website.

I've opened PRs to do the community page migration test:
https://github.com/apache/incubator-druid-website/pull/3
https://github.com/druid-io/druid-io.github.io/pull/591

On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <gi...@apache.org> wrote:

> That sounds good to me. I would also consider adding canonical tags to all
> druid.apache.org pages so we don't have druid.incubator.apache.org and
> druid.apache.org both floating around (not to mention http/https version
> of
> both).
>
> On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org> wrote:
>
> > For redirects, Xavier has suggested using
> > https://help.github.com/en/articles/redirects-on-github-pages to
> redirect
> > to druid.apache.org as a way to transition before the domain migration
> > occurs, and believes that it would have the same SEO effects as a 301
> > redirect after the new pages are indexed.
> >
> > I think we could try migrating the current Community page to
> > druid.apache.org with Github redirects and canonical links pointing to
> the
> > https://druid.apache.org version. If that goes well, we could continue
> > migrating more pages.
> >
> > What are the community's thoughts on that?
> >
> > Thanks,
> > Jon
> >
> > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org> wrote:
> >
> > > OpenOffice and Groovy both chose to sort of "meld" their classic and
> > Apache
> > > sites together: https://www.openoffice.org/, http://groovy-lang.org/.
> > Note
> > > how when you click around, you get shuttled between the classic domain
> > and
> > > the Apache domain. Some pages are available on both sites, like
> > > http://groovy-lang.org/download.html and
> > > https://groovy.apache.org/download.html (which don't use canonical
> link
> > > tags -- does not seem like a good example to follow!).
> > >
> > > NetBeans (still incubating) also has a "melded" site at
> > > https://netbeans.org/ but doesn't seem to consider itself done yet.
> They
> > > are discussing plans on their lists & wiki to do redirects from
> > > netbeans.org
> > > to netbeans.apache.org:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > > ,
> > >
> > >
> >
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > > .
> > > As of today the domain has been donated to ASF, but the server is still
> > run
> > > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
> > > netbeans.org shows ASF as the registrant; netbeans.org resolves to
> > > lb-netbeans-cms-adc.oracle.com.)
> > >
> > > The melded sites don't really seem better to me than redirecting all
> urls
> > > on the domain. I guess it depends on if we want to keep druid.io as
> the
> > > official domain forever, or if we think druid.apache.org is cooler. I
> > > definitely think druid.apache.org is cooler so my vote is there :).
> It's
> > > also nice that it supports https. (druid.io does not today, since it's
> > on
> > > GitHub pages, which doesn't support https for custom domains.)
> > >
> > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > > <ch...@snap.com.invalid> wrote:
> > >
> > > > Are there other projects who have transitioned an independently
> > > successful
> > > > domain name to an apache one?
> > > >
> > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org>
> wrote:
> > > >
> > > > > Who has control over the druid.io domain? Charles would that be
> you?
> > > > >
> > > > > We'd need support from them for the DNS redirect.
> > > > >
> > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org>
> > wrote:
> > > > >
> > > > > > We still need to complete the website migration to Apache
> > > > infrastructure.
> > > > > >
> > > > > > I'll propose the following plan:
> > > > > >
> > > > > > Proposed Apache Druid website migration plan
> > > > > > ========================================
> > > > > >
> > > > > > These links have some previous discussion on the website
> migration:
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > > > > >
> > > > > > From the discussions above, the recommendation is to have 2
> > separate
> > > > > repos
> > > > > > for the website: one for source and another for built content
> that
> > > will
> > > > > be
> > > > > > served.
> > > > > >
> > > > > > Generating site files
> > > > > > =======================
> > > > > >
> > > > > > The Apache site update process will be similar to our current
> > > process.
> > > > > >
> > > > > > Current process:
> > > > > > 1. Push changes to
> > > > > https://github.com/druid-io/druid-io.github.io/tree/src
> > > > > > 2. metamx bot picks up changes, builds, and commits to
> > > > > > https://github.com/druid-io/druid-io.github.io/tree/master
> > > > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master is
> > > > served
> > > > > by
> > > > > > github pages
> > > > > >
> > > > > > Apache process:
> > > > > > 1. Push changes to
> > > > https://github.com/apache/incubator-druid-website-src
> > > > > > 2. Jenkins bot from Apache will build the website from source
> repo,
> > > > > commit
> > > > > > to https://github.com/apache/incubator-druid-website
> > > > > > 3. Apache Druid website will be served from the content in
> > > > > > https://github.com/apache/incubator-druid-website (asf-site
> > branch)
> > > > > >
> > > > > >
> > > > > > Hosting and SEO
> > > > > > ================
> > > > > >
> > > > > > The Apache site will be hosted at druid.apache.org on Apache
> > > > > > infrastructure:
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > > > > >
> > > > > > To preserve our search rankings, we can setup 301 redirects from
> > the
> > > > old
> > > > > > druid.io site to the corresponding pages on the druid.apache.org
> > > > site. (
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > > > > )
> > > > > >
> > > > > > However, Github pages (which currently hosts the druid.io site)
> > does
> > > > not
> > > > > > support 301 redirects, so we propose the following:
> > > > > > - Setup a new Nginx server that will perform 301 redirects to
> > > > > > druid.apache.org for the druid.io. Imply can host this if
> needed.
> > > > > > - Update the druid.io DNS entry to point to this new Nginx
> server
> > > > > > - Shut down Github pages hosting for druid.io
> > > > > >
> > > > > > In addition, we can also set canonical tags on our pages:
> > > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > > > > >
> > > > > >
> > > > > > Action items
> > > > > > ===============
> > > > > > - Setup a Jenkins bot that builds the Apache website content from
> > > > source
> > > > > > - Get the Apache website up
> > > > > > - Setup Nginx redirect server for druid.io
> > > > > > - Shutdown github pages and redirect DNS for druid.io to Nginx
> > > > redirect
> > > > > > server
> > > > > > - Add canonical tags to pages
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
That sounds good to me. I would also consider adding canonical tags to all
druid.apache.org pages so we don't have druid.incubator.apache.org and
druid.apache.org both floating around (not to mention http/https version of
both).

On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jo...@apache.org> wrote:

> For redirects, Xavier has suggested using
> https://help.github.com/en/articles/redirects-on-github-pages to redirect
> to druid.apache.org as a way to transition before the domain migration
> occurs, and believes that it would have the same SEO effects as a 301
> redirect after the new pages are indexed.
>
> I think we could try migrating the current Community page to
> druid.apache.org with Github redirects and canonical links pointing to the
> https://druid.apache.org version. If that goes well, we could continue
> migrating more pages.
>
> What are the community's thoughts on that?
>
> Thanks,
> Jon
>
> On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org> wrote:
>
> > OpenOffice and Groovy both chose to sort of "meld" their classic and
> Apache
> > sites together: https://www.openoffice.org/, http://groovy-lang.org/.
> Note
> > how when you click around, you get shuttled between the classic domain
> and
> > the Apache domain. Some pages are available on both sites, like
> > http://groovy-lang.org/download.html and
> > https://groovy.apache.org/download.html (which don't use canonical link
> > tags -- does not seem like a good example to follow!).
> >
> > NetBeans (still incubating) also has a "melded" site at
> > https://netbeans.org/ but doesn't seem to consider itself done yet. They
> > are discussing plans on their lists & wiki to do redirects from
> > netbeans.org
> > to netbeans.apache.org:
> >
> >
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > ,
> >
> >
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > .
> > As of today the domain has been donated to ASF, but the server is still
> run
> > by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
> > netbeans.org shows ASF as the registrant; netbeans.org resolves to
> > lb-netbeans-cms-adc.oracle.com.)
> >
> > The melded sites don't really seem better to me than redirecting all urls
> > on the domain. I guess it depends on if we want to keep druid.io as the
> > official domain forever, or if we think druid.apache.org is cooler. I
> > definitely think druid.apache.org is cooler so my vote is there :). It's
> > also nice that it supports https. (druid.io does not today, since it's
> on
> > GitHub pages, which doesn't support https for custom domains.)
> >
> > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > <ch...@snap.com.invalid> wrote:
> >
> > > Are there other projects who have transitioned an independently
> > successful
> > > domain name to an apache one?
> > >
> > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org> wrote:
> > >
> > > > Who has control over the druid.io domain? Charles would that be you?
> > > >
> > > > We'd need support from them for the DNS redirect.
> > > >
> > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org>
> wrote:
> > > >
> > > > > We still need to complete the website migration to Apache
> > > infrastructure.
> > > > >
> > > > > I'll propose the following plan:
> > > > >
> > > > > Proposed Apache Druid website migration plan
> > > > > ========================================
> > > > >
> > > > > These links have some previous discussion on the website migration:
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > > > >
> > > > > From the discussions above, the recommendation is to have 2
> separate
> > > > repos
> > > > > for the website: one for source and another for built content that
> > will
> > > > be
> > > > > served.
> > > > >
> > > > > Generating site files
> > > > > =======================
> > > > >
> > > > > The Apache site update process will be similar to our current
> > process.
> > > > >
> > > > > Current process:
> > > > > 1. Push changes to
> > > > https://github.com/druid-io/druid-io.github.io/tree/src
> > > > > 2. metamx bot picks up changes, builds, and commits to
> > > > > https://github.com/druid-io/druid-io.github.io/tree/master
> > > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master is
> > > served
> > > > by
> > > > > github pages
> > > > >
> > > > > Apache process:
> > > > > 1. Push changes to
> > > https://github.com/apache/incubator-druid-website-src
> > > > > 2. Jenkins bot from Apache will build the website from source repo,
> > > > commit
> > > > > to https://github.com/apache/incubator-druid-website
> > > > > 3. Apache Druid website will be served from the content in
> > > > > https://github.com/apache/incubator-druid-website (asf-site
> branch)
> > > > >
> > > > >
> > > > > Hosting and SEO
> > > > > ================
> > > > >
> > > > > The Apache site will be hosted at druid.apache.org on Apache
> > > > > infrastructure:
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > > > >
> > > > > To preserve our search rankings, we can setup 301 redirects from
> the
> > > old
> > > > > druid.io site to the corresponding pages on the druid.apache.org
> > > site. (
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > > > )
> > > > >
> > > > > However, Github pages (which currently hosts the druid.io site)
> does
> > > not
> > > > > support 301 redirects, so we propose the following:
> > > > > - Setup a new Nginx server that will perform 301 redirects to
> > > > > druid.apache.org for the druid.io. Imply can host this if needed.
> > > > > - Update the druid.io DNS entry to point to this new Nginx server
> > > > > - Shut down Github pages hosting for druid.io
> > > > >
> > > > > In addition, we can also set canonical tags on our pages:
> > > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > > > >
> > > > >
> > > > > Action items
> > > > > ===============
> > > > > - Setup a Jenkins bot that builds the Apache website content from
> > > source
> > > > > - Get the Apache website up
> > > > > - Setup Nginx redirect server for druid.io
> > > > > - Shutdown github pages and redirect DNS for druid.io to Nginx
> > > redirect
> > > > > server
> > > > > - Add canonical tags to pages
> > > > >
> > > >
> > >
> >
>

Re: Proposed website migration plan

Posted by Jonathan Wei <jo...@apache.org>.
For redirects, Xavier has suggested using
https://help.github.com/en/articles/redirects-on-github-pages to redirect
to druid.apache.org as a way to transition before the domain migration
occurs, and believes that it would have the same SEO effects as a 301
redirect after the new pages are indexed.

I think we could try migrating the current Community page to
druid.apache.org with Github redirects and canonical links pointing to the
https://druid.apache.org version. If that goes well, we could continue
migrating more pages.

What are the community's thoughts on that?

Thanks,
Jon

On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <gi...@apache.org> wrote:

> OpenOffice and Groovy both chose to sort of "meld" their classic and Apache
> sites together: https://www.openoffice.org/, http://groovy-lang.org/. Note
> how when you click around, you get shuttled between the classic domain and
> the Apache domain. Some pages are available on both sites, like
> http://groovy-lang.org/download.html and
> https://groovy.apache.org/download.html (which don't use canonical link
> tags -- does not seem like a good example to follow!).
>
> NetBeans (still incubating) also has a "melded" site at
> https://netbeans.org/ but doesn't seem to consider itself done yet. They
> are discussing plans on their lists & wiki to do redirects from
> netbeans.org
> to netbeans.apache.org:
>
> https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> ,
>
> https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> .
> As of today the domain has been donated to ASF, but the server is still run
> by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
> netbeans.org shows ASF as the registrant; netbeans.org resolves to
> lb-netbeans-cms-adc.oracle.com.)
>
> The melded sites don't really seem better to me than redirecting all urls
> on the domain. I guess it depends on if we want to keep druid.io as the
> official domain forever, or if we think druid.apache.org is cooler. I
> definitely think druid.apache.org is cooler so my vote is there :). It's
> also nice that it supports https. (druid.io does not today, since it's on
> GitHub pages, which doesn't support https for custom domains.)
>
> On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> <ch...@snap.com.invalid> wrote:
>
> > Are there other projects who have transitioned an independently
> successful
> > domain name to an apache one?
> >
> > On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org> wrote:
> >
> > > Who has control over the druid.io domain? Charles would that be you?
> > >
> > > We'd need support from them for the DNS redirect.
> > >
> > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org> wrote:
> > >
> > > > We still need to complete the website migration to Apache
> > infrastructure.
> > > >
> > > > I'll propose the following plan:
> > > >
> > > > Proposed Apache Druid website migration plan
> > > > ========================================
> > > >
> > > > These links have some previous discussion on the website migration:
> > > >
> > > >
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > > >
> > > > From the discussions above, the recommendation is to have 2 separate
> > > repos
> > > > for the website: one for source and another for built content that
> will
> > > be
> > > > served.
> > > >
> > > > Generating site files
> > > > =======================
> > > >
> > > > The Apache site update process will be similar to our current
> process.
> > > >
> > > > Current process:
> > > > 1. Push changes to
> > > https://github.com/druid-io/druid-io.github.io/tree/src
> > > > 2. metamx bot picks up changes, builds, and commits to
> > > > https://github.com/druid-io/druid-io.github.io/tree/master
> > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master is
> > served
> > > by
> > > > github pages
> > > >
> > > > Apache process:
> > > > 1. Push changes to
> > https://github.com/apache/incubator-druid-website-src
> > > > 2. Jenkins bot from Apache will build the website from source repo,
> > > commit
> > > > to https://github.com/apache/incubator-druid-website
> > > > 3. Apache Druid website will be served from the content in
> > > > https://github.com/apache/incubator-druid-website (asf-site branch)
> > > >
> > > >
> > > > Hosting and SEO
> > > > ================
> > > >
> > > > The Apache site will be hosted at druid.apache.org on Apache
> > > > infrastructure:
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > > >
> > > > To preserve our search rankings, we can setup 301 redirects from the
> > old
> > > > druid.io site to the corresponding pages on the druid.apache.org
> > site. (
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > > )
> > > >
> > > > However, Github pages (which currently hosts the druid.io site) does
> > not
> > > > support 301 redirects, so we propose the following:
> > > > - Setup a new Nginx server that will perform 301 redirects to
> > > > druid.apache.org for the druid.io. Imply can host this if needed.
> > > > - Update the druid.io DNS entry to point to this new Nginx server
> > > > - Shut down Github pages hosting for druid.io
> > > >
> > > > In addition, we can also set canonical tags on our pages:
> > > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > > >
> > > >
> > > > Action items
> > > > ===============
> > > > - Setup a Jenkins bot that builds the Apache website content from
> > source
> > > > - Get the Apache website up
> > > > - Setup Nginx redirect server for druid.io
> > > > - Shutdown github pages and redirect DNS for druid.io to Nginx
> > redirect
> > > > server
> > > > - Add canonical tags to pages
> > > >
> > >
> >
>

Re: Proposed website migration plan

Posted by Gian Merlino <gi...@apache.org>.
OpenOffice and Groovy both chose to sort of "meld" their classic and Apache
sites together: https://www.openoffice.org/, http://groovy-lang.org/. Note
how when you click around, you get shuttled between the classic domain and
the Apache domain. Some pages are available on both sites, like
http://groovy-lang.org/download.html and
https://groovy.apache.org/download.html (which don't use canonical link
tags -- does not seem like a good example to follow!).

NetBeans (still incubating) also has a "melded" site at
https://netbeans.org/ but doesn't seem to consider itself done yet. They
are discussing plans on their lists & wiki to do redirects from netbeans.org
to netbeans.apache.org:
https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
,
https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E.
As of today the domain has been donated to ASF, but the server is still run
by Oracle, so the plan doesn't seem to be finished yet. (WHOIS for
netbeans.org shows ASF as the registrant; netbeans.org resolves to
lb-netbeans-cms-adc.oracle.com.)

The melded sites don't really seem better to me than redirecting all urls
on the domain. I guess it depends on if we want to keep druid.io as the
official domain forever, or if we think druid.apache.org is cooler. I
definitely think druid.apache.org is cooler so my vote is there :). It's
also nice that it supports https. (druid.io does not today, since it's on
GitHub pages, which doesn't support https for custom domains.)

On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
<ch...@snap.com.invalid> wrote:

> Are there other projects who have transitioned an independently successful
> domain name to an apache one?
>
> On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org> wrote:
>
> > Who has control over the druid.io domain? Charles would that be you?
> >
> > We'd need support from them for the DNS redirect.
> >
> > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org> wrote:
> >
> > > We still need to complete the website migration to Apache
> infrastructure.
> > >
> > > I'll propose the following plan:
> > >
> > > Proposed Apache Druid website migration plan
> > > ========================================
> > >
> > > These links have some previous discussion on the website migration:
> > >
> > >
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> > >
> > > From the discussions above, the recommendation is to have 2 separate
> > repos
> > > for the website: one for source and another for built content that will
> > be
> > > served.
> > >
> > > Generating site files
> > > =======================
> > >
> > > The Apache site update process will be similar to our current process.
> > >
> > > Current process:
> > > 1. Push changes to
> > https://github.com/druid-io/druid-io.github.io/tree/src
> > > 2. metamx bot picks up changes, builds, and commits to
> > > https://github.com/druid-io/druid-io.github.io/tree/master
> > > 3. https://github.com/druid-io/druid-io.github.io/tree/master is
> served
> > by
> > > github pages
> > >
> > > Apache process:
> > > 1. Push changes to
> https://github.com/apache/incubator-druid-website-src
> > > 2. Jenkins bot from Apache will build the website from source repo,
> > commit
> > > to https://github.com/apache/incubator-druid-website
> > > 3. Apache Druid website will be served from the content in
> > > https://github.com/apache/incubator-druid-website (asf-site branch)
> > >
> > >
> > > Hosting and SEO
> > > ================
> > >
> > > The Apache site will be hosted at druid.apache.org on Apache
> > > infrastructure:
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> > >
> > > To preserve our search rankings, we can setup 301 redirects from the
> old
> > > druid.io site to the corresponding pages on the druid.apache.org
> site. (
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> > )
> > >
> > > However, Github pages (which currently hosts the druid.io site) does
> not
> > > support 301 redirects, so we propose the following:
> > > - Setup a new Nginx server that will perform 301 redirects to
> > > druid.apache.org for the druid.io. Imply can host this if needed.
> > > - Update the druid.io DNS entry to point to this new Nginx server
> > > - Shut down Github pages hosting for druid.io
> > >
> > > In addition, we can also set canonical tags on our pages:
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> > >
> > >
> > > Action items
> > > ===============
> > > - Setup a Jenkins bot that builds the Apache website content from
> source
> > > - Get the Apache website up
> > > - Setup Nginx redirect server for druid.io
> > > - Shutdown github pages and redirect DNS for druid.io to Nginx
> redirect
> > > server
> > > - Add canonical tags to pages
> > >
> >
>

Re: Proposed website migration plan

Posted by Charles Allen <ch...@snap.com.INVALID>.
Are there other projects who have transitioned an independently successful
domain name to an apache one?

On Tue, Mar 5, 2019 at 2:13 PM David Lim <da...@apache.org> wrote:

> Who has control over the druid.io domain? Charles would that be you?
>
> We'd need support from them for the DNS redirect.
>
> On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org> wrote:
>
> > We still need to complete the website migration to Apache infrastructure.
> >
> > I'll propose the following plan:
> >
> > Proposed Apache Druid website migration plan
> > ========================================
> >
> > These links have some previous discussion on the website migration:
> >
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80-40-253Cdev.druid.apache.org-253E&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=G1dTS7FlYGauxNOaQECZix2YwroWVCqJB-cT0nEeNwM&e=
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_INFRA-2D17340&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=pwg0jE385gqei6EEEbxugKHWll7oyKoCloFc8ByhlUc&e=
> >
> > From the discussions above, the recommendation is to have 2 separate
> repos
> > for the website: one for source and another for built content that will
> be
> > served.
> >
> > Generating site files
> > =======================
> >
> > The Apache site update process will be similar to our current process.
> >
> > Current process:
> > 1. Push changes to
> https://github.com/druid-io/druid-io.github.io/tree/src
> > 2. metamx bot picks up changes, builds, and commits to
> > https://github.com/druid-io/druid-io.github.io/tree/master
> > 3. https://github.com/druid-io/druid-io.github.io/tree/master is served
> by
> > github pages
> >
> > Apache process:
> > 1. Push changes to https://github.com/apache/incubator-druid-website-src
> > 2. Jenkins bot from Apache will build the website from source repo,
> commit
> > to https://github.com/apache/incubator-druid-website
> > 3. Apache Druid website will be served from the content in
> > https://github.com/apache/incubator-druid-website (asf-site branch)
> >
> >
> > Hosting and SEO
> > ================
> >
> > The Apache site will be hosted at druid.apache.org on Apache
> > infrastructure:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_dev_project-2Dsite.html&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=_rHEo_asMXKypaunuBTXFkB6Ni3F6KqbEfkck18L7Ag&e=
> >
> > To preserve our search rankings, we can setup 301 redirects from the old
> > druid.io site to the corresponding pages on the druid.apache.org site. (
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_redirection&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=lUeWU0dT9thy8gp11RO-Vry7zkYl_W4BXz01fyXJO0A&e=
> )
> >
> > However, Github pages (which currently hosts the druid.io site) does not
> > support 301 redirects, so we propose the following:
> > - Setup a new Nginx server that will perform 301 redirects to
> > druid.apache.org for the druid.io. Imply can host this if needed.
> > - Update the druid.io DNS entry to point to this new Nginx server
> > - Shut down Github pages hosting for druid.io
> >
> > In addition, we can also set canonical tags on our pages:
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__moz.com_learn_seo_canonicalization&d=DwIBaQ&c=ncDTmphkJTvjIDPh0hpF_w&r=HrLGT1qWNhseJBMYABL0GFSZESht5gBoLejor3SqMSo&m=uPTu9gAHxe2KnNDGURBYp1G94UBX5LCRMknoapXwTwI&s=T8G2c6d4EbQ_YDLFQXVebcj0UN9FNrbpPY5Xq4LAR8w&e=
> >
> >
> > Action items
> > ===============
> > - Setup a Jenkins bot that builds the Apache website content from source
> > - Get the Apache website up
> > - Setup Nginx redirect server for druid.io
> > - Shutdown github pages and redirect DNS for druid.io to Nginx redirect
> > server
> > - Add canonical tags to pages
> >
>

Re: Proposed website migration plan

Posted by David Lim <da...@apache.org>.
Who has control over the druid.io domain? Charles would that be you?

We'd need support from them for the DNS redirect.

On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jo...@apache.org> wrote:

> We still need to complete the website migration to Apache infrastructure.
>
> I'll propose the following plan:
>
> Proposed Apache Druid website migration plan
> ========================================
>
> These links have some previous discussion on the website migration:
>
>
> https://lists.apache.org/thread.html/7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80@%3Cdev.druid.apache.org%3E
> https://issues.apache.org/jira/browse/INFRA-17340
>
> From the discussions above, the recommendation is to have 2 separate repos
> for the website: one for source and another for built content that will be
> served.
>
> Generating site files
> =======================
>
> The Apache site update process will be similar to our current process.
>
> Current process:
> 1. Push changes to https://github.com/druid-io/druid-io.github.io/tree/src
> 2. metamx bot picks up changes, builds, and commits to
> https://github.com/druid-io/druid-io.github.io/tree/master
> 3. https://github.com/druid-io/druid-io.github.io/tree/master is served by
> github pages
>
> Apache process:
> 1. Push changes to https://github.com/apache/incubator-druid-website-src
> 2. Jenkins bot from Apache will build the website from source repo, commit
> to https://github.com/apache/incubator-druid-website
> 3. Apache Druid website will be served from the content in
> https://github.com/apache/incubator-druid-website (asf-site branch)
>
>
> Hosting and SEO
> ================
>
> The Apache site will be hosted at druid.apache.org on Apache
> infrastructure: http://www.apache.org/dev/project-site.html
>
> To preserve our search rankings, we can setup 301 redirects from the old
> druid.io site to the corresponding pages on the druid.apache.org site. (
> https://moz.com/learn/seo/redirection)
>
> However, Github pages (which currently hosts the druid.io site) does not
> support 301 redirects, so we propose the following:
> - Setup a new Nginx server that will perform 301 redirects to
> druid.apache.org for the druid.io. Imply can host this if needed.
> - Update the druid.io DNS entry to point to this new Nginx server
> - Shut down Github pages hosting for druid.io
>
> In addition, we can also set canonical tags on our pages:
> https://moz.com/learn/seo/canonicalization
>
>
> Action items
> ===============
> - Setup a Jenkins bot that builds the Apache website content from source
> - Get the Apache website up
> - Setup Nginx redirect server for druid.io
> - Shutdown github pages and redirect DNS for druid.io to Nginx redirect
> server
> - Add canonical tags to pages
>