Posted to general@attic.apache.org by sebb <se...@gmail.com> on 2018/02/15 21:22:12 UTC

Redirects for attic project download directories

The original thread got derailed, so here is what I think the issue is.

When a project is moved to the attic, its website is updated to show a
banner that the project has been retired, with a link to the attic.

Also, its release files on www.apache.org/dist are removed, and
replaced with an .htaccess redirect to an Attic page with summary
details of the project.

e.g.
http://www.apache.org/dist/harmony
redirects to
http://attic.apache.org/projects/harmony.html

Whilst this redirect works, and is simple to administer, it means
there are many directories under www.apache.org/dist/ which only
contain an .htaccess file, and scripts may have to be aware of which
directories correspond to active projects.

It would be useful to be able to tidy this up.

However it is important that the redirects are kept as there are
likely to be links to the download location in lots of places. We
should not break URLs unnecessarily.

Note that the redirects work on mirror links as well as on the ASF hosts.
This behaviour should be preserved as historical links will generally
use the dynamic mirror system.
For example the harmony download page [1] points to the mirrors;
externally preserved links are likely to do so as well.

Removing the .htaccess files and their parent directories will break
all the links, so IMO that is not an option.

AFAIK .htaccess files are inherited from the parent directory, so it
should be possible to move the individual redirects to a shared parent
.htaccess file.

This needs to be tested.


[1] http://harmony.apache.org/download.cgi

Re: Redirects for attic project download directories

Posted by sebb <se...@gmail.com>.
On 16 February 2018 at 12:33, Henk P. Penning <pe...@uu.nl> wrote:
> On Thu, 15 Feb 2018, sebb wrote:
>
>> Date: Thu, 15 Feb 2018 22:22:12 +0100
>> From: sebb <se...@gmail.com>
>> To: general@attic.apache.org
>> Subject: Redirects for attic project download directories
>
>
>> AFAIK .htaccess files are inherited from the parent directory, so it
>> should be possible to move the individual redirects to a shared parent
>> .htaccess file.
>>
>> This needs to be tested.
>
>
>   I couldn't make it work ;
> what would (for instance)
>   a 'beehive' Redirect-line look like ?
>
>   In general, on some mirror.org with [apache-dist] in
>
>     http://mirror.org/foo/bar/
>
>   with our/their .htaccess file in
>
>     http://mirror.org/foo/bar/.htaccess
>
>   should redirect
>
>     http://mirror.org/foo/bar/beehive/blib/blob/
>
>   but should not redirect
>
>     http://mirror.org/foo/bar/other/proj/beehive/x/y/

I see what you mean.
It looks like it may not be possible to ensure that beehive is only
matched at the correct depth when using RedirectMatch.
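
To illustrate the depth problem, here is a minimal Python sketch (not Apache
configuration) using the hypothetical /foo/bar/ mirror prefix from the example
above. It assumes that RedirectMatch tests its regex against the full
URL-path; the pattern names are invented for illustration only.

```python
import re

# Hypothetical pattern such as a shared parent .htaccess might use.
# Assumption: the regex is tested against the full URL-path, so the
# rule cannot know the mirror's own prefix (/foo/bar/ here).
UNANCHORED = re.compile(r"/beehive(/.*)?$")

# Anchoring works only if the prefix is known in advance, which holds
# for the ASF hosts (/dist/) but not for arbitrary mirrors.
ANCHORED_FOR_ONE_MIRROR = re.compile(r"^/foo/bar/beehive(/.*)?$")

wanted = "/foo/bar/beehive/blib/blob/"
unwanted = "/foo/bar/other/proj/beehive/x/y/"


def matches(pattern, path):
    """Return True if the pattern would trigger a redirect for path."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    for name, pattern in [("unanchored", UNANCHORED),
                          ("anchored", ANCHORED_FOR_ONE_MIRROR)]:
        print(name, matches(pattern, wanted), matches(pattern, unwanted))
```

The unanchored pattern fires for both paths, and the anchored one is only
correct for a mirror that happens to use exactly that prefix.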

My position is that we should not break the URLs just to tidy the
directory and/or make it easier for scripts.
That is putting the cart before the horse.

So I think the existing .htaccess files should be left in place.

> ------------------------------------------------------------   _
> Henk P. Penning, ICT-beta                 R Uithof MG-403    _/ \_
> Faculty of Science, Utrecht University    T +31 30 253 4106 / \_/ \
> Leuvenlaan 4, 3584CE Utrecht, NL          F +31 30 253 4553 \_/ \_/
> http://www.staff.science.uu.nl/~penni101/ M penning@uu.nl     \_/

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Thu, 15 Feb 2018, sebb wrote:

> Date: Thu, 15 Feb 2018 22:22:12 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Redirects for attic project download directories

> AFAIK .htaccess files are inherited from the parent directory, so it
> should be possible to move the individual redirects to a shared parent
> .htaccess file.
>
> This needs to be tested.

   I couldn't make it work ; what would (for instance)
   a 'beehive' Redirect-line look like ?

   In general, on some mirror.org with [apache-dist] in

     http://mirror.org/foo/bar/

   with our/their .htaccess file in

     http://mirror.org/foo/bar/.htaccess

   should redirect

     http://mirror.org/foo/bar/beehive/blib/blob/

   but should not redirect

     http://mirror.org/foo/bar/other/proj/beehive/x/y/


Re: Redirects for attic project download directories

Posted by Jan Iversen <ja...@apache.org>.


> On 28 Feb 2018, at 19:28, Henk P. Penning <pe...@uu.nl> wrote:
> 
>> On Fri, 23 Feb 2018, Henk P. Penning wrote:
>> 
>> Date: Fri, 23 Feb 2018 18:09:28 +0100
>> From: Henk P. Penning <pe...@uu.nl>
>> To: general@attic.apache.org
>> Subject: Re: Redirects for attic project download directories
> 
>> I think it is now safe to remove all /dist/GHOST/ directories.
> 
> Hi,
> 
>  For the record, because INFRA-16034 requires an 'ack',
>  here's my summary of this thread :
> 
>  GHOST == retired project ;
> 
>  The objection to the rm's of the /dist/GHOST/ directories was the loss
>  of redirects caused by the /dist/GHOST/.htaccess files.
> 
>  -- On www.a.o this is now fixed with rewrite rules in the config
>     for "www.a.o/dist/".
>  -- On GHOST.apache.org, the refs to the mirrors (closer.{cgi,lua})
>     redirect to attic because closer.lua is now "attic aware".
> 
> Jan Iversen,
> 
>  If you agree, please say "ack" on "INFRA-16034" :
> 
>    https://issues.apache.org/jira/browse/INFRA-16034
Done

rgds
jan i
> 
>  Thanks ; groeten,
> 
>  HPP
> 

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Wed, 28 Feb 2018, sebb wrote:

> Date: Wed, 28 Feb 2018 19:48:21 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Re: Redirects for attic project download directories

[ top-post ; further comments in-line ]

Hi,

   Five days ago [23-2-2018] I wrote :

   > I think it is now safe to remove all /dist/GHOST/ directories.

   This was met with "silent consent".

   Regards,

   Henk Penning

> On 28 February 2018 at 18:28, Henk P. Penning <pe...@uu.nl> wrote:
>> On Fri, 23 Feb 2018, Henk P. Penning wrote:
>>
>>> Date: Fri, 23 Feb 2018 18:09:28 +0100
>>> From: Henk P. Penning <pe...@uu.nl>
>>> To: general@attic.apache.org
>>> Subject: Re: Redirects for attic project download directories
>>
>>
>>>  I think it is now safe to remove all /dist/GHOST/ directories.
>>
>>
>> Hi,
>>
>>   For the record, because INFRA-16034 requires an 'ack',
>>   here's my summary of this thread :
>>
>>   GHOST == retired project ;
>>
>>   The objection to the rm's of the /dist/GHOST/ directories was the loss
>>   of redirects caused by the /dist/GHOST/.htaccess files.
>>
>>   -- On www.a.o this is now fixed with rewrite rules in the config
>>      for "www.a.o/dist/".
>>   -- On GHOST.apache.org, the refs to the mirrors (closer.{cgi,lua})
>>      redirect to attic because closer.lua is now "attic aware".
>
> However, the redirects will no longer work for mirrors, because they
> don't see the local redirects.

   Right.

> For example, this does not redirect:
> http://mirror.ox.ac.uk/sites/rsync.apache.org/ace/
>
> Whereas this does:
> https://www.apache.org/dist/ace/

   Right [ace is/was todo ; it doesn't have a .htaccess]
   [note that dist/abdera and dist/wookie are empty :-]

> At present, some attic projects do redirect; that is because of the
> .htaccess files.
> http://mirror.ox.ac.uk/sites/rsync.apache.org/harmony/
>
> It seems to me that this is a big change.

   The change is not relevant. The point is that we (the ASF)
   don't refer to individual items on specific mirrors anymore
   (because closer.lua is now "attic aware") ; refs like :

     http://mirror.ox.ac.uk/sites/rsync.apache.org/harmony/foo/bar

   All refs like that on third-party pages, and indeed all refs to
   /dist/ items, will eventually die ; that is the nature of /dist/.

> And it goes against what you wrote on the following issue:
>
> https://issues.apache.org/jira/browse/INFRA-16122?focusedCommentId=16380672&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16380672
>
> i.e. you still expect mirrors to redirect attic projects.

   I don't ; but redirects should work. For examples, see below.
   Perhaps we should add a test to dist/zzz/ :-).


/dist/thrift/rpm/.htaccess:RedirectMatch permanent (.*)rpm/(.*) http://dl.bintray.com/apache/thrift/rpm/$2
/dist/thrift/debian/.htaccess:RedirectMatch permanent (.*)thrift/debian/(.*) http://dl.bintray.com/apache/thrift/debian/$2
/dist/lucene/mahout/.htaccess:RedirectMatch Permanent .* http://www.apache.org/dyn/closer.cgi/mahout
/dist/lucene/mahout/mahout-collections-1.0/.htaccess:RedirectMatch Permanent .* http://www.apache.org/dyn/closer.cgi/mahout
/dist/lucene/tika/.htaccess:RedirectMatch Permanent .* http://www.apache.org/dyn/closer.cgi/tika
/dist/lucene/nutch/.htaccess:RedirectMatch Permanent .* http://www.apache.org/dyn/closer.cgi/nutch
/dist/apr/.htaccess:RedirectMatch 301 ^(.*)/Announcement\.(.*) $1/Announcement1.2.$2
/dist/apr/.htaccess:RedirectMatch 301 ^(.*)/Announcement-1\.(.*) $1/Announcement1.2.$2
/dist/cassandra/debian/.htaccess:RedirectMatch ^/~eevans/debian(.*) http://wiki.apache.org/cassandra/DebianPackaging
/dist/cassandra/debian/.htaccess:RedirectMatch permanent (.*)cassandra/debian/(.*) http://dl.bintray.com/apache/cassandra/$2
/dist/httpd/.htaccess:RedirectMatch 301 ^(.*)/Announcement\.(.*) $1/Announcement1.3.$2
/dist/httpd/.htaccess:RedirectMatch 301 ^(.*)/Announcement2\.([th].*) $1/Announcement2.0.$2
/dist/httpd/.htaccess:RedirectMatch 301 ^(.*)/Announcement21\.(.*) $1/Announcement2.1.$2
/dist/sling/.htaccess:Redirect Permanent /dist/sling/KEYS https://people.apache.org/keys/group/sling.asc
/dist/aurora/debian/.htaccess:RedirectMatch permanent (.*)aurora/debian/(.*) http://dl.bintray.com/apache/aurora/$2

Re: Redirects for attic project download directories

Posted by sebb <se...@gmail.com>.
On 28 February 2018 at 18:28, Henk P. Penning <pe...@uu.nl> wrote:
> On Fri, 23 Feb 2018, Henk P. Penning wrote:
>
>> Date: Fri, 23 Feb 2018 18:09:28 +0100
>> From: Henk P. Penning <pe...@uu.nl>
>> To: general@attic.apache.org
>> Subject: Re: Redirects for attic project download directories
>
>
>>  I think it is now safe to remove all /dist/GHOST/ directories.
>
>
> Hi,
>
>   For the record, because INFRA-16034 requires an 'ack',
>   here's my summary of this thread :
>
>   GHOST == retired project ;
>
>   The objection to the rm's of the /dist/GHOST/ directories was the loss
>   of redirects caused by the /dist/GHOST/.htaccess files.
>
>   -- On www.a.o this is now fixed with rewrite rules in the config
>      for "www.a.o/dist/".
>   -- On GHOST.apache.org, the refs to the mirrors (closer.{cgi,lua})
>      redirect to attic because closer.lua is now "attic aware".

However, the redirects will no longer work for mirrors, because they
don't see the local redirects.

For example, this does not redirect:
http://mirror.ox.ac.uk/sites/rsync.apache.org/ace/

Whereas this does:
https://www.apache.org/dist/ace/

At present, some attic projects do redirect; that is because of the
.htaccess files.
http://mirror.ox.ac.uk/sites/rsync.apache.org/harmony/

It seems to me that this is a big change.

And it goes against what you wrote on the following issue:

https://issues.apache.org/jira/browse/INFRA-16122?focusedCommentId=16380672&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16380672

i.e. you still expect mirrors to redirect attic projects.

> Jan Iversen,
>
>   If you agree, please say "ack" on "INFRA-16034" :
>
>     https://issues.apache.org/jira/browse/INFRA-16034
>
>   Thanks ; groeten,
>
>
>   HPP
>

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Fri, 23 Feb 2018, Henk P. Penning wrote:

> Date: Fri, 23 Feb 2018 18:09:28 +0100
> From: Henk P. Penning <pe...@uu.nl>
> To: general@attic.apache.org
> Subject: Re: Redirects for attic project download directories

>  I think it is now safe to remove all /dist/GHOST/ directories.

Hi,

   For the record, because INFRA-16034 requires an 'ack',
   here's my summary of this thread :

   GHOST == retired project ;

   The objection to the rm's of the /dist/GHOST/ directories was the loss
   of redirects caused by the /dist/GHOST/.htaccess files.

   -- On www.a.o this is now fixed with rewrite rules in the config
      for "www.a.o/dist/".
   -- On GHOST.apache.org, the refs to the mirrors (closer.{cgi,lua})
      redirect to attic because closer.lua is now "attic aware".

Jan Iversen,

   If you agree, please say "ack" on "INFRA-16034" :

     https://issues.apache.org/jira/browse/INFRA-16034

   Thanks ; groeten,

   HPP


Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Thu, 15 Feb 2018, sebb wrote:

> Date: Thu, 15 Feb 2018 22:22:12 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Redirects for attic project download directories
> 
> The original thread got derailed, so here is what I think the issue is.
>
> When a project is moved to the attic, its website is updated to show a
> banner that the project has been retired, with a link to the attic.
>
> Also, its release files on www.apache.org/dist are removed, and
> replaced with an .htaccess redirect to an Attic page with summary
> details of the project.
>
> e.g.
> http://www.apache.org/dist/harmony
> redirects to
> http://attic.apache.org/projects/harmony.html
>
> Whilst this redirect works, and is simple to administer, it means
> there are many directories under www.apache.org/dist/ which only
> contain an .htaccess file, and scripts may have to be aware of which
> directories correspond to active projects.
>
> It would be useful to be able to tidy this up.

   FYI ; this now works.

   The config of www.apache.org redirects

     www.apache.org/dist/GHOST => attic.apache.org/projects/GHOST.html

   ... if the latter exists. See :

     https://dist.apache.org/repos/dist/release/etch/

   This dir has no '.htaccess' file ; note the leading '.' ;
   infra moved .htaccess => htaccess as a final test.
   Try : https://www.apache.org/dist/etch

   I think it is now safe to remove all /dist/GHOST/ directories.
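
The mapping described above can be sketched in Python as follows. This is an
illustration only, not the actual www.apache.org configuration: the function
name is invented, and the set `attic_pages` is an assumed stand-in for the
"if the latter exists" check on attic.apache.org/projects/GHOST.html.

```python
from typing import Optional, Set

ATTIC_BASE = "http://attic.apache.org/projects/"


def attic_redirect(url_path: str, attic_pages: Set[str]) -> Optional[str]:
    """Return the Attic URL for a /dist/PROJECT/... path, or None.

    attic_pages stands in for the check that projects/PROJECT.html
    exists on the attic site; here it is just a set of project names.
    """
    parts = url_path.strip("/").split("/")
    # Expect at least "dist/<project>"; anything else is left alone.
    if len(parts) < 2 or parts[0] != "dist" or not parts[1]:
        return None
    project = parts[1]
    if project in attic_pages:
        return f"{ATTIC_BASE}{project}.html"
    return None
```

With `attic_pages = {"etch"}`, a request for /dist/etch is redirected, while
/dist/httpd is left alone because no Attic page exists for it.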

   Groeten,

   HPP


Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Tue, 20 Feb 2018, sebb wrote:

> Date: Tue, 20 Feb 2018 11:37:12 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Re: Redirects for attic project download directories
> 
> On 20 February 2018 at 09:21, Henk P. Penning <pe...@uu.nl> wrote:

>>   We would map it into attic-space if we could do it elegantly.
>
> This still preserves the URL as far as the public are concerned.
> Which is my point: we preserve ghost.a.o,

   Of course, but I hope we can add the "retired" top-of-page
   stuff without touching the content of ghost.a.o.

>                                           and at present we preserve
> www.a.o/dist/ghost as a redirect.

   Of course ; but that is fixable with 3 or 4 rewrite rules
   in the config of www.apache.org/dist/ ; that is, without
   the www.a.o/dist/ghost directories.

   These rules will work for GHOST ; as soon as there is
   an "attic.a.o/projects/GHOST.html" page, www.a.o will
   redirect to it.
   [ on the webserver, the page is in file
     /x1/www/attic.apache.org/projects/GHOST.html
   ]

   So, there is no need for dist/attic/GHOST ;
   after attic.a.o/projects/GHOST.html is created,
   infra can simply svn rm dist/release/GHOST ;
   I believe that is the usual work-flow.
   Correct me if I'm wrong :-).

> I think that is worth keeping.

   Of course.

>>   Fixing website GHOST.apache.org is a hassle.
>>   It would be nice if we could present it "as is",
>>   prefixed with the necessary "RETIRED" head.
>
> That's a subject for a separate thread.

   Agree :-)
   Ideas (or working examples) welcome ;
   think "mod_proxy", "mod_sed" etc.

   Groeten,

   HPP


Re: Redirects for attic project download directories

Posted by sebb <se...@gmail.com>.
On 20 February 2018 at 09:21, Henk P. Penning <pe...@uu.nl> wrote:
> On Sun, 18 Feb 2018, sebb wrote:
>
>> Date: Sun, 18 Feb 2018 19:35:40 +0100
>> From: sebb <se...@gmail.com>
>> To: general@attic.apache.org
>> Subject: Re: Redirects for attic project download directories
>
>
>> On 18 February 2018 at 16:10, Henk P. Penning <pe...@uu.nl> wrote:
>
>
>>>   On retirement we remove GHOSTS from committee-info.txt and
>>>   navigation [http://www.apache.org/#projects-list], etc.
>>>   In short, we have pointers to GHOSTs, but not among the living ;
>>>   and that should also be the case for /dist/ (and the mirrors).
>>
>>
>> But we keep ghost.apache.org.
>
>
>   We would map it into attic-space if we could do it elegantly.

This still preserves the URL as far as the public are concerned.
Which is my point: we preserve ghost.a.o, and at present we preserve
www.a.o/dist/ghost as a redirect.
I think that is worth keeping.

IMO we should not drop the redirect just to 'tidy up'.


>   Fixing website GHOST.apache.org is a hassle.
>   It would be nice if we could present it "as is",
>   prefixed with the necessary "RETIRED" head.

That's a subject for a separate thread.

>   Perhaps with something like :
>
>     https://mirror-vm.apache.org/frames.html
>
>   At the moment this doesn't work too well ;
>
>   -- browsers don't like http urls on https page
>   -- sites don't like to be included in <FRAME>'s
>
>   But maybe this can be fixed without touching the
>   /content/ of GHOST.a.o ; with config fixes only.
>
>
>   Groeten,
>
>   HPP
>

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Sun, 18 Feb 2018, sebb wrote:

> Date: Sun, 18 Feb 2018 19:35:40 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Re: Redirects for attic project download directories

> On 18 February 2018 at 16:10, Henk P. Penning <pe...@uu.nl> wrote:

>>   On retirement we remove GHOSTS from committee-info.txt and
>>   navigation [http://www.apache.org/#projects-list], etc.
>>   In short, we have pointers to GHOSTs, but not among the living ;
>>   and that should also be the case for /dist/ (and the mirrors).
>
> But we keep ghost.apache.org.

   We would map it into attic-space if we could do it elegantly.

   Fixing website GHOST.apache.org is a hassle.
   It would be nice if we could present it "as is",
   prefixed with the necessary "RETIRED" head.

   Perhaps with something like :

     https://mirror-vm.apache.org/frames.html

   At the moment this doesn't work too well ;

   -- browsers don't like http urls on https page
   -- sites don't like to be included in <FRAME>'s

   But maybe this can be fixed without touching the
   /content/ of GHOST.a.o ; with config fixes only.

   Groeten,

   HPP


Re: Redirects for attic project download directories

Posted by sebb <se...@gmail.com>.
On 18 February 2018 at 16:10, Henk P. Penning <pe...@uu.nl> wrote:
> On Sat, 17 Feb 2018, sebb wrote:
>
>> Date: Sat, 17 Feb 2018 19:45:56 +0100
>> From: sebb <se...@gmail.com>
>> To: general@attic.apache.org
>> Subject: Re: Redirects for attic project download directories
>>
>> On 16 February 2018 at 16:53, Henk P. Penning <pe...@uu.nl> wrote:
>>>
>>> On Thu, 15 Feb 2018, sebb wrote:
>
>
>>>> It would be useful to be able to tidy this up.
>
>
>>>   Links to mirrors are typically generated by closer.lua ;
>>>   we can make closer.lua attic-aware (says humbedooh :-).
>>>   When closer.lua encounters a target in an atticked project,
>>>   it can redirect to attic.a.o.
>>
>>
>> Good idea as it will stop further generation of useless URLs.
>
>
>   This is now in test on mirror-vm ; try
>
>     https://mirror-vm.apache.org/dyn/dev_closer.lua/
>     https://mirror-vm.apache.org/dyn/dev_closer.lua/beehive/blib/blob
>
>   Aside: dev_closer.lua is also 'dist' and 'archive' aware ; try
>
>     https://mirror-vm.apache.org/dyn/dev_closer.lua/FOO/BAR
>     https://mirror-vm.apache.org/dyn/dev_closer.lua/httpd/apache_1.3.0.tar.Z
>
>>>   Now, suppose we create dist/attic/ghosts/
>>>
>>>     https:// dist.apache.org/repos/dist/release/attic/ghosts/
>>>
>>>   -- on retirement, infra svn moves dist/GHOST/ to dist/attic/ghosts/ ;
>>>      Pmc Attic can cleanup what was formerly dist/GHOST/
>>
>>
>> Unless the archive synch job is changed to ignore files under
>> dist/attic/ghosts this will result in creating copies of the release
>> artifacts on archive.a.o
>
>
>   That's a detail

Yes, I know. But details matter.

> ; the point is that /we/ can cleanup ;
>   less work for infra ; more control for us.
>
>>>   -- closer.lua can check the presence of dist/attic/ghosts/PROJ
>>>   -- the RewriteRules idem
>>>
>>>   I think this would tidy up /dist/ while keeping the proper Redirects.
>>
>>
>> It won't keep the redirects on the 3rd party mirrors.
>> Such URLs may well have been stored elsewhere.
>>
>> For example [1] points to (e.g.)
>>
>> http://mirror.org/apache/harmony/milestones/5.0/M15/apache-harmony-5.0-jre-r991518-windows-x86-snapshot.zip
>
>
>> At present such a URL will redirect back to the attic (try it!)
>> That is the functionality which I think is important to preserve.
>
>
>   When the new closer.lua is in place, [1] will point directly
>   to attic.a.o/projects/harmony.html ;
>   Are there any other examples of important pages ?
>
>> That is not true for 3rd party mirrors ...
>
>
>   Any link to a specific file on a specific mirror will stop working
>   sooner or later ; mirrors disappear and /dist/ changes.
>
>     LIVE-link  : http://some.mirror.org/.../httpd/some-old-version.gz
>     GHOST-link : http://some.mirror.org/.../beehive/some-old-version.gz
>
>   The LIVE-link gives a 404 ; the GHOST-link gives a redirect.
>   Why the difference?

The difference is that the parent directory still exists for active
projects even if a particular version does not.

>   On retirement we remove GHOSTS from committee-info.txt and
>   navigation [http://www.apache.org/#projects-list], etc.
>   In short, we have pointers to GHOSTs, but not among the living ;
>   and that should also be the case for /dist/ (and the mirrors).

But we keep ghost.apache.org.

>>>> [1] http://harmony.apache.org/download.cgi
>
>
>   Groeten,
>
>   HPP
>
>

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Sat, 17 Feb 2018, sebb wrote:

> Date: Sat, 17 Feb 2018 19:45:56 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Re: Redirects for attic project download directories
> 
> On 16 February 2018 at 16:53, Henk P. Penning <pe...@uu.nl> wrote:
>> On Thu, 15 Feb 2018, sebb wrote:

>>> It would be useful to be able to tidy this up.

>>   Links to mirrors are typically generated by closer.lua ;
>>   we can make closer.lua attic-aware (says humbedooh :-).
>>   When closer.lua encounters a target in an atticked project,
>>   it can redirect to attic.a.o.
>
> Good idea as it will stop further generation of useless URLs.

   This is now in test on mirror-vm ; try

     https://mirror-vm.apache.org/dyn/dev_closer.lua/
     https://mirror-vm.apache.org/dyn/dev_closer.lua/beehive/blib/blob

   Aside: dev_closer.lua is also 'dist' and 'archive' aware ; try

     https://mirror-vm.apache.org/dyn/dev_closer.lua/FOO/BAR
     https://mirror-vm.apache.org/dyn/dev_closer.lua/httpd/apache_1.3.0.tar.Z

>>   Now, suppose we create dist/attic/ghosts/
>>
>>     https:// dist.apache.org/repos/dist/release/attic/ghosts/
>>
>>   -- on retirement, infra svn moves dist/GHOST/ to dist/attic/ghosts/ ;
>>      Pmc Attic can cleanup what was formerly dist/GHOST/
>
> Unless the archive synch job is changed to ignore files under
> dist/attic/ghosts this will result in creating copies of the release
> artifacts on archive.a.o

   That's a detail ; the point is that /we/ can cleanup ;
   less work for infra ; more control for us.

>>   -- closer.lua can check the presence of dist/attic/ghosts/PROJ
>>   -- the RewriteRules idem
>>
>>   I think this would tidy up /dist/ while keeping the proper Redirects.
>
> It won't keep the redirects on the 3rd party mirrors.
> Such URLs may well have been stored elsewhere.
>
> For example [1] points to (e.g.)
> http://mirror.org/apache/harmony/milestones/5.0/M15/apache-harmony-5.0-jre-r991518-windows-x86-snapshot.zip

> At present such a URL will redirect back to the attic (try it!)
> That is the functionality which I think is important to preserve.

   When the new closer.lua is in place, [1] will point directly
   to attic.a.o/projects/harmony.html ;
   Are there any other examples of important pages ?

> That is not true for 3rd party mirrors ...

   Any link to a specific file on a specific mirror will stop working
   sooner or later ; mirrors disappear and /dist/ changes.

     LIVE-link  : http://some.mirror.org/.../httpd/some-old-version.gz
     GHOST-link : http://some.mirror.org/.../beehive/some-old-version.gz

   The LIVE-link gives a 404 ; the GHOST-link gives a redirect.
   Why the difference?

   On retirement we remove GHOSTS from committee-info.txt and
   navigation [http://www.apache.org/#projects-list], etc.
   In short, we have pointers to GHOSTs, but not among the living ;
   and that should also be the case for /dist/ (and the mirrors).

>>> [1] http://harmony.apache.org/download.cgi

   Groeten,

   HPP


Re: Redirects for attic project download directories

Posted by sebb <se...@gmail.com>.
On 16 February 2018 at 16:53, Henk P. Penning <pe...@uu.nl> wrote:
> On Thu, 15 Feb 2018, sebb wrote:
>
>> Date: Thu, 15 Feb 2018 22:22:12 +0100
>> From: sebb <se...@gmail.com>
>> To: general@attic.apache.org
>> Subject: Redirects for attic project download directories
>>
>> The original thread got derailed, so here is what I think the issue is.
>>
>> When a project is moved to the attic, its website is updated to show a
>> banner that the project has been retired, with a link to the attic.
>>
>> Also, its release files on www.apache.org/dist are removed, and
>> replaced with an .htaccess redirect to an Attic page with summary
>> details of the project.
>>
>> e.g.
>> http://www.apache.org/dist/harmony
>> redirects to
>> http://attic.apache.org/projects/harmony.html
>>
>> Whilst this redirect works, and is simple to administer, it means
>> there are many directories under www.apache.org/dist/ which only
>> contain an .htaccess file, and scripts may have to be aware of which
>> directories correspond to active projects.
>>
>> It would be useful to be able to tidy this up.
>
>
>   Agree ; that's a good summary.
>
>> However it is important that the redirects are kept as there are
>> likely to be links to the download location in lots of places. We
>> should not break URLs unnecessarily.
>>
>> Note that the redirects also work on mirror links as well as on the ASF
>> hosts.
>> This behaviour should be preserved as historical links will generally
>> use the dynamic mirror system.
>> For example the harmony download page [1] points to the mirrors;
>> externally preserved links are likely to do so as well.
>>
>> Removing the .htaccess files and their parent directories will break
>> all the links, so IMO that is not an option.
>
>
>   [ humbedooh == Daniel Gruno ; infra ; initial(?) author of closer.lua
>   ; closer.lua == the script that projects use on their (download)
>     pages to link to "the mirrors"
>   ; GHOST == a retired project ; ASF GHOSTs live in the Attic :-)
>   ]
>
>   Links to mirrors are typically generated by closer.lua ;
>   we can make closer.lua attic-aware (says humbedooh :-).
>   When closer.lua encounters a target in an atticked project,
>   it can redirect to attic.a.o.

Good idea as it will stop further generation of useless URLs.
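
As a rough sketch of what "attic-aware" could mean, written in Python rather
than Lua and not taken from the real closer.lua: the function name, the
`retired` set and the mirror URL are all assumptions made for illustration.

```python
from typing import Set

ATTIC_BASE = "http://attic.apache.org/projects/"


def resolve_download(path: str, preferred_mirror: str, retired: Set[str]) -> str:
    """Pick the URL a download link for `path` should lead to.

    path is relative to the distribution root, e.g. "beehive/blib/blob".
    """
    project = path.strip("/").split("/")[0]
    if project in retired:
        # Retired project: send the user to the Attic summary page
        # instead of generating a mirror URL that will not resolve.
        return f"{ATTIC_BASE}{project}.html"
    return preferred_mirror.rstrip("/") + "/" + path.lstrip("/")
```

The point of the design is that the decision is made before any mirror URL is
generated, so no new mirror links to retired projects get handed out.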

>   Are there any important links to specific files on specific mirrors ?

I don't understand what you mean by that.

>   That leaves "www.apache.org/dist/GHOST".
>
>   It would be easy (again, says humbedooh) to configure www.apache.org
>   with a few (fixed number of) Rewrite rules like (pseudo-code) :
>
>     # redirect dist/GHOST to attic.a.o/.../GHOST.html
>     <Directory dist>
>     RewriteCondition "file path in an atticked dir"
>     RewriteRule .* "Redirect to attic.a.o"
>     </Directory>

AFAICT that will only work for links that target the ASF hosts directly.
Such redirects are not a problem anyway.

>   Now, suppose we create dist/attic/ghosts/
>
>     https://dist.apache.org/repos/dist/release/attic/ghosts/
>
>   -- on retirement, infra svn moves dist/GHOST/ to dist/attic/ghosts/ ;
>      the Attic PMC can clean up what was formerly dist/GHOST/

Unless the archive sync job is changed to ignore files under
dist/attic/ghosts, this will create copies of the release
artifacts on archive.a.o.

>   -- closer.lua can check the presence of dist/attic/ghosts/PROJ
>   -- the RewriteRules idem
>
>   I think this would tidy up /dist/ while keeping the proper Redirects.

It won't keep the redirects on the 3rd party mirrors.
Such URLs may well have been stored elsewhere.

For example [1] points to (e.g.)
http://mirror.org/apache/harmony/milestones/5.0/M15/apache-harmony-5.0-jre-r991518-windows-x86-snapshot.zip

At present such a URL will redirect back to the attic (try it!)

That is the functionality which I think is important to preserve.

AFAICT it's trivial to preserve the redirects on the ASF hosts.
This is because we know the path structure is /dist/project.

That is not true for 3rd party mirrors which have a variety of path prefixes.
Since the .htaccess file is in the top-level project directory it only
gets invoked for paths that match the attic'ed project.
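
To illustrate the point about path prefixes, here is a rough Python
sketch of a central rule that assumes the ASF layout /dist/project.
It is not how the servers are configured; the function name and the
RETIRED_PROJECTS set are made up for the example.

```python
from urllib.parse import urlsplit

# Attic page pattern as used for harmony in this thread.
ATTIC_PAGE = "http://attic.apache.org/projects/{project}.html"

# Illustrative only; a real rule would need an authoritative list.
RETIRED_PROJECTS = {"harmony"}


def central_redirect(url):
    """Return the Attic page for url, or None if the rule does not match.

    The rule is anchored on the ASF path structure /dist/<project>/...
    """
    parts = urlsplit(url).path.split("/")
    # parts[0] is "" because the path starts with "/"
    if len(parts) >= 3 and parts[1] == "dist" and parts[2] in RETIRED_PROJECTS:
        return ATTIC_PAGE.format(project=parts[2])
    return None
```

A rule like this matches http://www.apache.org/dist/harmony but returns
None for a mirror URL such as http://mirror.org/apache/harmony/...,
because the prefix is /apache rather than /dist. The per-project
.htaccess file does not have this problem since it travels with the
project directory.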

I don't see how dist/attic/ghosts would help with the redirects.

>> [1] http://harmony.apache.org/download.cgi
>
>
>   Regards,
>
>   Henk Penning
>
>
> ------------------------------------------------------------   _
> Henk P. Penning, ICT-beta                 R Uithof MG-403    _/ \_
> Faculty of Science, Utrecht University    T +31 30 253 4106 / \_/ \
> Leuvenlaan 4, 3584CE Utrecht, NL          F +31 30 253 4553 \_/ \_/
> http://www.staff.science.uu.nl/~penni101/ M penning@uu.nl     \_/

Re: Redirects for attic project download directories

Posted by "Henk P. Penning" <pe...@uu.nl>.
On Thu, 15 Feb 2018, sebb wrote:

> Date: Thu, 15 Feb 2018 22:22:12 +0100
> From: sebb <se...@gmail.com>
> To: general@attic.apache.org
> Subject: Redirects for attic project download directories
> 
> The original thread got derailed, so here is what I think the issue is.
>
> When a project is moved to the attic, its website is updated to show a
> banner that the project has been retired, with a link to the attic.
>
> Also, its release files on www.apache.org/dist are removed, and
> replaced with an .htaccess redirect to an Attic page with summary
> details of the project.
>
> e.g.
> http://www.apache.org/dist/harmony
> redirects to
> http://attic.apache.org/projects/harmony.html
>
> Whilst this redirect works, and is simple to administer, it means
> there are many directories under www.apache.org/dist/ which only
> contain an .htaccess file, and scripts may have to be aware of which
> directories correspond to active projects.
>
> It would be useful to be able to tidy this up.

   Agree ; that's a good summary.

> However it is important that the redirects are kept as there are
> likely to be links to the download location in lots of places. We
> should not break URLs unnecessarily.
>
> Note that the redirects also work on mirror links as well as on the ASF hosts.
> This behaviour should be preserved as historical links will generally
> use the dynamic mirror system.
> For example the harmony download page [1] points to the mirrors;
> externally preserved links are likely to do so as well.
>
> Removing the .htaccess files and their parent directories will break
> all the links, so IMO that is not an option.

   [ humbedooh == Daniel Gruno ; infra ; initial(?) author of closer.lua
   ; closer.lua == the script that projects use on their (download)
     pages to link to "the mirrors"
   ; GHOST == a retired project ; ASF GHOSTs live in the Attic :-)
   ]

   Links to mirrors are typically generated by closer.lua ;
   we can make closer.lua attic-aware (says humbedooh :-).
   When closer.lua encounters a target in an atticked project,
   it can redirect to attic.a.o.
   Are there any important links to specific files on specific mirrors ?
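
   As a rough sketch of what "attic-aware" could mean (in Python
   rather than Lua, and not the actual closer.lua code): the names
   resolve_download, RETIRED_PROJECTS and the mirror base URL below
   are assumptions made for the example.

```python
# Attic page pattern as used for harmony in this thread.
ATTIC_PAGE = "http://attic.apache.org/projects/{project}.html"

# Hypothetical list of retired projects; a real script would need an
# authoritative source for this.
RETIRED_PROJECTS = {"harmony"}


def resolve_download(path, mirror_base="http://mirror.example.org/apache/"):
    """Return the URL a mirror-aware download script might redirect to.

    path is relative to dist/, e.g. "harmony/milestones/file.zip".
    Targets inside a retired project go to the Attic page; everything
    else goes to the (placeholder) mirror.
    """
    project = path.strip("/").split("/", 1)[0]
    if project in RETIRED_PROJECTS:
        return ATTIC_PAGE.format(project=project)
    return mirror_base + path.lstrip("/")
```

   With this, a request for any file under harmony/ ends up on the
   Attic page, while requests for active projects still go to a mirror.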

   That leaves "www.apache.org/dist/GHOST".

   It would be easy (again, says humbedooh) to configure www.apache.org
   with a few (fixed number of) Rewrite rules like (pseudo-code) :

     # redirect dist/GHOST to attic.a.o/.../GHOST.html
     <Directory dist>
     RewriteCondition "file path in an atticked dir"
     RewriteRule .* "Redirect to attic.a.o"
     </Directory>

   Now, suppose we create dist/attic/ghosts/

     https://dist.apache.org/repos/dist/release/attic/ghosts/

   -- on retirement, infra svn moves dist/GHOST/ to dist/attic/ghosts/ ;
      the Attic PMC can clean up what was formerly dist/GHOST/
   -- closer.lua can check the presence of dist/attic/ghosts/PROJ
   -- the RewriteRules idem
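
   The presence check in the steps above could look roughly like this
   (a Python sketch for illustration only; is_ghost, ghost_redirect
   and the dist_root argument are hypothetical names, and it assumes
   the dist tree is available on the local filesystem):

```python
from pathlib import Path

# Attic page pattern as used for harmony in this thread.
ATTIC_PAGE = "http://attic.apache.org/projects/{project}.html"


def is_ghost(dist_root, project):
    """True if dist_root/attic/ghosts/<project> exists as a directory."""
    return (Path(dist_root) / "attic" / "ghosts" / project).is_dir()


def ghost_redirect(dist_root, project):
    """Return the Attic page URL for a retired project, else None."""
    if is_ghost(dist_root, project):
        return ATTIC_PAGE.format(project=project)
    return None
```

   The list of ghosts then lives in one place (the directory listing)
   and no per-project file is needed under dist/.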

   I think this would tidy up /dist/ while keeping the proper Redirects.

> [1] http://harmony.apache.org/download.cgi

   Regards,

   Henk Penning

------------------------------------------------------------   _
Henk P. Penning, ICT-beta                 R Uithof MG-403    _/ \_
Faculty of Science, Utrecht University    T +31 30 253 4106 / \_/ \
Leuvenlaan 4, 3584CE Utrecht, NL          F +31 30 253 4553 \_/ \_/
http://www.staff.science.uu.nl/~penni101/ M penning@uu.nl     \_/