Posted to dev@whirr.apache.org by Andrei Savu <sa...@gmail.com> on 2012/02/22 14:24:23 UTC

Downloading binaries

Hi All,

I have just done a bit of testing from eu-west-1 - downloading the Hadoop
.tar.gz (93 MB) from:
http://archive.apache.org/dist/hadoop/core/hadoop-0.20.205.0/

It failed 10 times in a row! Any suggestions for a better location? Should
we start thinking about hosting the required artefacts in a different
place? (S3 + CloudFront)

This is the root cause of the many transient failures we are seeing when
deploying clusters.

I would even recommend that our users build a cache of the artefacts they
are using.
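
For example, a rough sketch of seeding such a cache with s3cmd (the bucket
name here is just a placeholder):

$ TARBALL=hadoop-0.20.205.0.tar.gz
$ curl -f -s -S -L -O \
    http://archive.apache.org/dist/hadoop/core/hadoop-0.20.205.0/$TARBALL
$ s3cmd put --acl-public $TARBALL s3://my-whirr-cache/hadoop/$TARBALL

Nodes could then fetch from
http://my-whirr-cache.s3.amazonaws.com/hadoop/$TARBALL (or through a
CloudFront distribution in front of the bucket) instead of hammering
archive.apache.org.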

-- Andrei Savu

Re: Downloading binaries

Posted by Andrei Savu <sa...@gmail.com>.
Great! Anyone willing to turn that into a function that we can add to
whirr-core?
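
Something along these lines, perhaps? A rough, untested sketch that just
wraps Adrian's one-liner and falls back to archive.apache.org when no
mirror can be parsed (older releases are only on the archive anyway):

function closest_apache_mirror() {
  local path=$1  # e.g. hadoop/core/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz
  local mirror
  mirror=$(curl -q -s -S -L http://www.apache.org/dyn/closer.cgi |
    sed -n '/^<p><a href="http/s/.*"\(.*\)".*/\1/gp' |
    head -n 1 | sed 's,/*$,,')
  # Fall back to the archive if mirror selection failed
  [ -n "$mirror" ] || mirror="http://archive.apache.org/dist"
  echo "$mirror/$path"
}

$ curl -f -s -S -L -O $(closest_apache_mirror hadoop/core/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz)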

On Thu, Feb 23, 2012 at 9:49 AM, Adrian Cole <ad...@jclouds.org> wrote:

> here's a start:
>
> $ curl -q -s -S -L http://www.apache.org/dyn/closer.cgi | sed -n '/^<p><a href="http/s/.*"\(.*\)".*/\1/gp'
> http://apache.mivzakim.net//
>
>
> On Thu, Feb 23, 2012 at 11:33 AM, Adrian Cole <ad...@jclouds.org> wrote:
>
> > awk-fu or sed-fu should be able to handle this :P
> >
> >
> > On Thu, Feb 23, 2012 at 11:15 AM, Andrei Savu <sa...@gmail.com> wrote:
> >
> >> On Thu, Feb 23, 2012 at 8:23 AM, Adrian Cole <fe...@gmail.com> wrote:
> >>
> >> > I like the idea of auto-selecting the closest mirror. Perhaps a shell
> >> > function for this?
> >> >
> >>
> >> Is your bash-fu that strong? I was thinking about writing a Python script
> >> (Python is available on most distros).
> >>
> >
> >
>

Re: Downloading binaries

Posted by Adrian Cole <ad...@jclouds.org>.
here's a start:

$ curl -q -s -S -L http://www.apache.org/dyn/closer.cgi | sed -n '/^<p><a href="http/s/.*"\(.*\)".*/\1/gp'
http://apache.mivzakim.net//


On Thu, Feb 23, 2012 at 11:33 AM, Adrian Cole <ad...@jclouds.org> wrote:

> awk-fu or sed-fu should be able to handle this :P
>
>
> On Thu, Feb 23, 2012 at 11:15 AM, Andrei Savu <sa...@gmail.com> wrote:
>
>> On Thu, Feb 23, 2012 at 8:23 AM, Adrian Cole <fe...@gmail.com> wrote:
>>
>> > I like the idea of auto-selecting the closest mirror. Perhaps a shell
>> > function for this?
>> >
>>
>> Is your bash-fu that strong? I was thinking about writing a Python script
>> (Python is available on most distros).
>>
>
>

Re: Downloading binaries

Posted by Adrian Cole <ad...@jclouds.org>.
awk-fu or sed-fu should be able to handle this :P

On Thu, Feb 23, 2012 at 11:15 AM, Andrei Savu <sa...@gmail.com> wrote:

> On Thu, Feb 23, 2012 at 8:23 AM, Adrian Cole <fe...@gmail.com> wrote:
>
> > I like the idea of auto-selecting the closest mirror. Perhaps a shell
> > function for this?
> >
>
> Is your bash-fu that strong? I was thinking about writing a Python script
> (Python is available on most distros).
>

Re: Downloading binaries

Posted by Andrei Savu <sa...@gmail.com>.
On Thu, Feb 23, 2012 at 8:23 AM, Adrian Cole <fe...@gmail.com> wrote:

> I like the idea of auto-selecting the closest mirror. Perhaps a shell
> function for this?
>

Is your bash-fu that strong? I was thinking about writing a Python script
(Python is available on most distros).

Re: Downloading binaries

Posted by Adrian Cole <fe...@gmail.com>.
I like the idea of auto-selecting the closest mirror. Perhaps a shell
function for this?

-A
On Feb 23, 2012 7:36 AM, "Andrei Savu" <sa...@gmail.com> wrote:

> On Thu, Feb 23, 2012 at 5:22 AM, Tom White <to...@cloudera.com> wrote:
>
> > I wonder if we could use
> > https://issues.apache.org/jira/browse/BIGTOP-399 or something similar
> > to select a nearby Apache mirror.
> >
>
> Sounds good. The only problem I see is that older releases are removed from
> the mirrors.
>

Re: Downloading binaries

Posted by Andrei Savu <sa...@gmail.com>.
On Thu, Feb 23, 2012 at 5:22 AM, Tom White <to...@cloudera.com> wrote:

> I wonder if we could use
> https://issues.apache.org/jira/browse/BIGTOP-399 or something similar
> to select a nearby Apache mirror.
>

Sounds good. The only problem I see is that older releases are removed from
the mirrors.

Re: Downloading binaries

Posted by Tom White <to...@cloudera.com>.
I wonder if we could use
https://issues.apache.org/jira/browse/BIGTOP-399 or something similar
to select a nearby Apache mirror.

You can always manually select a mirror and override the download
location in the Whirr properties, but it would be nice to make this
more automated.
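
For example (assuming I have the property name right), a line like

whirr.hadoop.tarball.url=http://apache.mivzakim.net/hadoop/core/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz

in the cluster properties file would pull the tarball from a hand-picked
mirror instead of archive.apache.org.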

Tom

On Wed, Feb 22, 2012 at 5:32 AM, Adrian Cole <ad...@jclouds.org> wrote:
> +1. This archive.apache.org issue also plagued me just now while testing
> Cassandra on OpenJDK.
>
> Seems a blobcache option could be handy. Otherwise, someone needs to foot
> the bill for CDN access (we would need a volunteer ready to pay some
> $$$/month, especially as other projects discover these links).
>
> -A
>
> On Wed, Feb 22, 2012 at 3:24 PM, Andrei Savu <sa...@gmail.com> wrote:
>
>> Hi All,
>>
>> I have just done a bit of testing from eu-west-1 - downloading the Hadoop
>> .tar.gz (93 MB) from:
>> http://archive.apache.org/dist/hadoop/core/hadoop-0.20.205.0/
>>
>> It failed 10 times in a row! Any suggestions for a better location? Should
>> we start thinking about hosting the required artefacts in a different
>> place? (S3 + CloudFront)
>>
>> This is the root cause of the many transient failures we are seeing when
>> deploying clusters.
>>
>> I would even recommend that our users build a cache of the artefacts they
>> are using.
>>
>> -- Andrei Savu
>>

Re: Downloading binaries

Posted by Adrian Cole <ad...@jclouds.org>.
+1. This archive.apache.org issue also plagued me just now while testing
Cassandra on OpenJDK.

Seems a blobcache option could be handy. Otherwise, someone needs to foot
the bill for CDN access (we would need a volunteer ready to pay some
$$$/month, especially as other projects discover these links).
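
Until something like that lands, a fetch-with-fallback wrapper is a cheap
workaround. A rough sketch (the cache bucket URL is hypothetical; any
mirror or cache could be added to the list):

function fetch_artifact() {
  local path=$1
  local url
  for url in \
      "http://my-whirr-cache.s3.amazonaws.com/$path" \
      "http://archive.apache.org/dist/$path"; do
    # -f makes curl fail on HTTP errors so we move on to the next location
    curl -f -s -S -L -O "$url" && return 0
  done
  return 1
}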

-A

On Wed, Feb 22, 2012 at 3:24 PM, Andrei Savu <sa...@gmail.com> wrote:

> Hi All,
>
> I have just done a bit of testing from eu-west-1 - downloading the Hadoop
> .tar.gz (93 MB) from:
> http://archive.apache.org/dist/hadoop/core/hadoop-0.20.205.0/
>
> It failed 10 times in a row! Any suggestions for a better location? Should
> we start thinking about hosting the required artefacts in a different
> place? (S3 + CloudFront)
>
> This is the root cause of the many transient failures we are seeing when
> deploying clusters.
>
> I would even recommend that our users build a cache of the artefacts they
> are using.
>
> -- Andrei Savu
>