Posted to user@whirr.apache.org by Paolo Castagna <ca...@googlemail.com> on 2011/10/26 10:56:04 UTC

Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Hi,
I am using Whirr version 0.6.0-incubating.

This is my recipe for a small Hadoop cluster:

-----
whirr.cluster-name=hadoop
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker, 6 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=m1.large
whirr.image-id=eu-west-1/ami-ee0e3c9a
whirr.location-id=eu-west-1
whirr.private-key-file=${sys:user.home}/.ssh/whirr
whirr.public-key-file=${whirr.private-key-file}.pub
whirr.hadoop.version=0.20.204.0
whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
-----

When I start up the cluster everything seems fine, but I see this
message very early on:

----
Dying because - net.schmizz.sshj.transport.TransportException: Broken
transport; encountered EOF
Dying because - net.schmizz.sshj.transport.TransportException: Broken
transport; encountered EOF
<<kex done>> woke to: net.schmizz.sshj.transport.TransportException:
Broken transport; encountered EOF
<< (ubuntu@46.137.70.148:22) error acquiring
SSHClient(ubuntu@46.137.70.148:22): Broken transport; encountered EOF
net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
	at net.schmizz.sshj.transport.Reader.run(Reader.java:70)
----

I do not understand the problem (or whether this is indeed a problem).

... and I am missing compression. Could that be the reason? :-(

Which AMI would you recommend for Amazon EC2 eu-west-1?

Thank you very much in advance for your help (and thanks for Whirr! ;-))
Paolo

Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Paolo Castagna <ca...@googlemail.com>.
On 26 October 2011 11:22, Andrei Savu <sa...@gmail.com> wrote:
>
> On Wed, Oct 26, 2011 at 1:18 PM, Paolo Castagna
> <ca...@googlemail.com> wrote:
>>
>> Do you use an AMI available in the eu-west-1 region for your integration
>> tests?
>
> Right now I don't know which AMI ID jclouds selects automatically. You
> should be safe if you use an Ubuntu 10.04 64-bit AMI from
> here: http://cloud.ubuntu.com/ami/

Hi Andrei,
thank you.

I was thinking that if you could "recommend" a specific AMI, or a set
of AMIs which you test against, it might be a good thing for
users (like me) who do not want to hit new problems just because they
are using a "wrong" AMI which does not play nicely with Whirr.

OK, so what I was using is just that (i.e. Ubuntu 10.04 64-bit):
whirr.image-id=eu-west-1/ami-ee0e3c9a

;-)

Thanks,
Paolo

Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Andrei Savu <sa...@gmail.com>.
On Wed, Oct 26, 2011 at 1:18 PM, Paolo Castagna <
castagna.lists@googlemail.com> wrote:

> Do you use an AMI available in the eu-west-1 region for your integration
> tests?


Right now I don't know which AMI ID jclouds selects automatically. You
should be safe if you use an Ubuntu 10.04 64-bit AMI from here:
http://cloud.ubuntu.com/ami/

Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Paolo Castagna <ca...@googlemail.com>.
On 26 October 2011 11:06, Andrei Savu <sa...@gmail.com> wrote:
>
> On Wed, Oct 26, 2011 at 1:01 PM, Paolo Castagna
> <ca...@googlemail.com> wrote:
>>
>> Should I remove whirr.image-id=eu-west-1/ami-ee0e3c9a from my recipe?
>
> IMHO it's better to specify an AMI so that you can avoid surprises.
> We are using this approach for integration testing so that we don't have to
> customise the .properties file when switching from aws-ec2 to cloudservers.

Do you use an AMI available in the eu-west-1 region for your integration tests?
If so, could you tell me which AMI you are using, so I can use the same one? :-)

Paolo

Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Alex Heneveld <Al...@CloudsoftCorp.com>.
Paolo-

I've seen the same errors you are seeing with that distro on Amazon, and
they are benign.  These Ubuntu images have a race condition when creating
ssh server keys, so sshd sometimes aborts connections at startup.
jclouds/sshj complains about this, but it rights itself within a few seconds.

The machines listed at [1] seem to be the ones jclouds currently
returns: recent official Ubuntu images contributed by Canonical.  It looks
like this is what you're hard-coding, so I'd stick with what you're already
doing and disregard error messages from sshj that occur soon after
machine startup.
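
To illustrate why errors shortly after boot can be ignored, here is a minimal Python sketch of the "retry until sshd settles" idea. It is not what jclouds or sshj actually do internally; the function name wait_for_ssh and its parameters are hypothetical, and it only checks for the SSH banner rather than completing a key exchange.

```python
import socket
import time


def wait_for_ssh(host, port=22, attempts=5, delay=2.0,
                 connect=socket.create_connection):
    """Retry a TCP connection until the server sends an SSH banner.

    Returns the number of the attempt that succeeded and re-raises the
    last error if every attempt fails. All names here are illustrative.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            with connect((host, port), timeout=5) as sock:
                banner = sock.recv(64)
            # An empty read means the server closed the connection (EOF),
            # which is the situation reported in the log above.
            if banner.startswith(b"SSH-"):
                return attempt
            last_error = ConnectionError("connection closed before SSH banner")
        except OSError as exc:
            last_error = exc
        if attempt < attempts:
            time.sleep(delay)
    raise last_error
```

With a short delay and a handful of attempts, a server that drops the first connections while it is still generating its keys would be reported as reachable on a later attempt.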

(That said, we're not entirely certain the images are okay; we have 
tentative reports of other ssh problems with the 32-bit machines, but 
haven't been able to isolate them.  If there are problems they are 
highly intermittent -- but if you have any other issues please let us know.)

Best
Alex


[1]  http://uec-images.ubuntu.com/query/lucid/server/released.current.txt




Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Andrei Savu <sa...@gmail.com>.
On Wed, Oct 26, 2011 at 1:01 PM, Paolo Castagna <
castagna.lists@googlemail.com> wrote:

> Should I remove whirr.image-id=eu-west-1/ami-ee0e3c9a from my recipe?
>

IMHO it's better to specify an AMI so that you can avoid surprises.
We are using this approach for integration testing so that we don't have to
customise the .properties file when switching from aws-ec2 to cloudservers.

Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Paolo Castagna <ca...@googlemail.com>.
On 26 October 2011 10:26, Andrei Savu <sa...@gmail.com> wrote:
>
> On Wed, Oct 26, 2011 at 12:21 PM, Paolo Castagna
> <ca...@googlemail.com> wrote:
>>
>> Hi Andrei,
>> thanks for your quick reply.
>>
>> On 26 October 2011 10:16, Andrei Savu <sa...@gmail.com> wrote:
>> > I don't have a good answer. Sometimes ssh connections are dropped but if
>> > the cluster is able to start everything should be fine. 0.7.0 will ship
>> > with jclouds 1.2.1 which has improvements around handling ssh connections.
>>
>> Is there a Whirr SNAPSHOT/nightly build I can use?
>
> No. We only make the releases available as binary artefacts.
>
>>
>> Do you think I would be better off using a SNAPSHOT instead of the
>> latest stable release?
>
> Sometimes. Even if we are trying to be as careful as possible the trunk can
> be unstable.
>
>>
>> > Another thing we've noticed while testing is that sometimes AMIs change
>> > without notice and things start to break.
>>
>> Which AMIs do you use to test Whirr? ;-)
>
> We use jclouds to automatically find an AMI running Ubuntu 10.04. This
> mechanism is cool because it works in any region and on any cloud and can
> also select 32-bit or 64-bit OSes as needed.

How do I do that?

Should I remove whirr.image-id=eu-west-1/ami-ee0e3c9a from my recipe?

Paolo


Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Andrei Savu <sa...@gmail.com>.
On Wed, Oct 26, 2011 at 12:21 PM, Paolo Castagna <
castagna.lists@googlemail.com> wrote:

> Hi Andrei,
> thanks for your quick reply.
>
> On 26 October 2011 10:16, Andrei Savu <sa...@gmail.com> wrote:
> > I don't have a good answer. Sometimes ssh connections are dropped but if
> > the cluster is able to start everything should be fine. 0.7.0 will ship
> > with jclouds 1.2.1 which has improvements around handling ssh connections.
>
> Is there a Whirr SNAPSHOT/nightly build I can use?
>

No. We only make the releases available as binary artefacts.


> Do you think I would be better off using a SNAPSHOT instead of the
> latest stable release?
>

Sometimes. Even though we try to be as careful as possible, trunk can
be unstable.


>
> > Another thing we've noticed while testing is that sometimes AMIs change
> > without notice and things start to break.
>
> Which AMIs do you use to test Whirr? ;-)
>

We use jclouds to automatically find an AMI running Ubuntu 10.04. This
mechanism is cool because it works in any region and on any cloud and can
also select 32-bit or 64-bit OSes as needed.



Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Andrei,
thanks for your quick reply.

On 26 October 2011 10:16, Andrei Savu <sa...@gmail.com> wrote:
> I don't have a good answer. Sometimes ssh connections are dropped but if the
> cluster is able to start everything should be fine. 0.7.0 will ship with
> jclouds 1.2.1 which has improvements around handling ssh connections.

Is there a Whirr SNAPSHOT/nightly build I can use?
Do you think I would be better off using a SNAPSHOT instead of the
latest stable release?

> Another thing we've noticed while testing is that sometimes AMIs change
> without notice and things start to break.

Which AMIs do you use to test Whirr? ;-)
I'll use the same... if available in the eu-west-1 region.

Thanks,
Paolo


Re: Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF

Posted by Andrei Savu <sa...@gmail.com>.
I don't have a good answer. Sometimes ssh connections are dropped, but if the
cluster is able to start, everything should be fine. 0.7.0 will ship with
jclouds 1.2.1, which has improvements around handling ssh connections.

Another thing we've noticed while testing is that sometimes AMIs change
without notice and things start to break.

I suggest that you create some sort of smoke test for the cluster and if it
fails you can rebuild it from scratch.

Also take a look at the configuration guide page because there are some
parameters you can tweak:
http://whirr.apache.org/docs/0.6.0/configuration-guide.html

(e.g. probably a cluster with only 80% of the region servers is still good
enough)
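
As a rough idea of what such a smoke test could look like, here is a minimal Python sketch that only checks whether a few TCP ports answer. The helper names port_open and smoke_test are hypothetical, and so is any host name you pass in. Ports 50070 and 50030 are assumed here as the default NameNode and JobTracker web UI ports in Hadoop 0.20.x; adjust them if your recipe changes them.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def smoke_test(checks, probe=port_open):
    """Probe every (host, port) pair and return the pairs that failed."""
    return [(host, port) for host, port in checks if not probe(host, port)]


# Assumed default web UI ports for Hadoop 0.20.x (not taken from this thread).
NAMENODE_WEB_UI_PORT = 50070
JOBTRACKER_WEB_UI_PORT = 50030
```

Calling smoke_test([(master, NAMENODE_WEB_UI_PORT), (master, JOBTRACKER_WEB_UI_PORT)]) with the public address of the master node returns an empty list when both ports answer. A wrapper script could destroy and relaunch the cluster when the list is not empty. An open port is only a shallow signal, so a fuller test would probably also run a small job.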

Cheers,

-- Andrei Savu / andreisavu.ro
