Posted to infrastructure-issues@apache.org by "Nicholas Chammas (JIRA)" <ji...@apache.org> on 2015/12/24 06:28:49 UTC

[jira] [Created] (INFRA-10999) Usage guidelines + standards for mirrors

Nicholas Chammas created INFRA-10999:
----------------------------------------

             Summary: Usage guidelines + standards for mirrors
                 Key: INFRA-10999
                 URL: https://issues.apache.org/jira/browse/INFRA-10999
             Project: Infrastructure
          Issue Type: Improvement
          Components: Mirrors
            Reporter: Nicholas Chammas
            Priority: Minor


1. Are there any concrete guidelines on how best to access the Apache mirror network? If so, where are they?

For example, I've gathered from scattered conversations here and there that:

  * The "correct" way to configure automated downloads from the Apache mirror network is to use the `closer.lua` script to automatically select a nearby mirror and download from it. Don't hardcode a specific mirror into any scripts.
  * The `closer.cgi` script should not be used, as it has been superseded by the `closer.lua` script.
  * I can append `?asjson` (or `?as_json`) when querying `closer.lua` to get some detail about the best mirror in JSON, which is useful if I want to parse and use that information in a script.
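
To make the points above concrete, here is a minimal Python sketch of what I understand the recommended flow to be. Several details are assumptions on my part rather than documented behavior: the endpoint URL `https://www.apache.org/dyn/closer.lua`, the `as_json` spelling of the flag, and the `preferred` and `path_info` field names in the JSON response. The function names are mine and purely illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against whatever Infra documents.
CLOSER_URL = "https://www.apache.org/dyn/closer.lua"


def build_query_url(path, base=CLOSER_URL, flag="as_json"):
    """Build the closer.lua query URL for a file path on the mirrors.

    `flag` is the JSON switch; I have seen both `asjson` and `as_json`
    mentioned, so it is left configurable.
    """
    return f"{base}/{path.lstrip('/')}?{urlencode({flag: 1})}"


def download_url_from_response(payload):
    """Combine the suggested mirror and file path into a download URL.

    Assumes the JSON has a `preferred` mirror base URL and a `path_info`
    relative path; both field names are assumptions.
    """
    info = json.loads(payload)
    preferred = info["preferred"].rstrip("/")
    path_info = info["path_info"].lstrip("/")
    return f"{preferred}/{path_info}"


def resolve_mirror_url(path, fetch=None):
    """Ask closer.lua for a mirror and return the full download URL.

    `fetch` takes a URL and returns the response body as text; it can be
    swapped out so the parsing logic is testable without network access.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=30) as resp:
                return resp.read().decode("utf-8")
    return download_url_from_response(fetch(build_query_url(path)))
```

A script would call something like `resolve_mirror_url("hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz")` and download from the returned URL, so no specific mirror is ever hardcoded.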

It was extremely time-consuming to piece together all this basic information about how to be a good user of the Apache mirror network. It shouldn't be this hard to do the right thing.

Is there no central, maintained documentation I can reference to get this kind of information?

2. Does the Apache Software Foundation have any partnerships with a CDN like Fastly or with a large company like Amazon to make it much easier and faster for users to download Apache software?

The Python community, for example, is offered hosting for Python packages by Fastly. I wonder if Apache has ever considered seeking a similar kind of partnership (or sponsorship) to complement its current mirror network.

3. I see that mirrors are checked for availability, but are there any rough requirements for mirror performance? Something like, "You should not serve files slower than this"? I don't mean anything crazy; just a lower limit on performance that should be easy to meet.

For example, the mirror located at 104.45.233.178 works fine. It's up and it serves files. However, it consistently takes 20-30 minutes to serve `hadoop-2.7.1.tar.gz`, a 200 MB file. Is that OK? Other mirrors (like http://mirrors.gigenet.com/) serve the same file in 5 minutes.
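
To put those timings in terms of throughput, here is a rough back-of-the-envelope calculation. It assumes the file is exactly 200 MB and that 1 MB is 1000 KB, so the figures are only approximate; the helper name is illustrative.

```python
def throughput_kb_per_s(size_mb, minutes):
    """Average transfer rate in KB/s for a file of size_mb MB
    downloaded in the given number of minutes (1 MB taken as 1000 KB)."""
    return size_mb * 1000 / (minutes * 60)


# Slow mirror: 200 MB in 20 to 30 minutes.
slow_best = throughput_kb_per_s(200, 20)   # roughly 167 KB/s
slow_worst = throughput_kb_per_s(200, 30)  # roughly 111 KB/s

# Faster mirror: 200 MB in 5 minutes.
fast = throughput_kb_per_s(200, 5)         # roughly 667 KB/s

print(f"slow mirror: {slow_worst:.0f}-{slow_best:.0f} KB/s")
print(f"fast mirror: {fast:.0f} KB/s")
```

By this estimate the slow mirror delivers somewhere around 110-170 KB/s, which is 4 to 6 times slower than the faster one.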

Because of the huge variance in mirror performance, I'm considering adding logic to my application to query `closer.lua` but then check an "Apache mirror blacklist" to make sure I am not directed to one of these extremely slow mirrors.
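
The kind of logic I have in mind would look roughly like the Python sketch below. It is only an illustration: the blacklist contents, the function names, and the idea of having a list of candidate mirror URLs to fall back on are all hypothetical, and how such a candidate list would be obtained is left open.

```python
from urllib.parse import urlparse

# Hypothetical blacklist of mirror hosts known to be very slow.
BLACKLISTED_HOSTS = {"104.45.233.178"}


def is_blacklisted(url, blacklist=BLACKLISTED_HOSTS):
    """Return True if the URL's host is on the blacklist."""
    host = urlparse(url).hostname
    return host is not None and host.lower() in blacklist


def pick_mirror(candidates, blacklist=BLACKLISTED_HOSTS):
    """Return the first candidate mirror URL whose host is not
    blacklisted, or None if every candidate is blacklisted."""
    for url in candidates:
        if not is_blacklisted(url, blacklist):
            return url
    return None


# Example: the suggested mirror is blacklisted, so fall back to the next one.
candidates = ["http://104.45.233.178/", "http://mirrors.gigenet.com/"]
print(pick_mirror(candidates))
```

Maintaining such a list in every application seems like the wrong place to solve the problem, which is why I am asking about performance requirements for mirrors.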


