Posted to user@zookeeper.apache.org by Suijian Zhou <su...@gmail.com> on 2014/04/09 00:05:09 UTC

Unable to read additional data from server sessionid...

Hi,
  I have a problem with ZooKeeper: after the session is established, the
client loses the connection in ~1 minute, although I see the timeout is set
to 600000 ms, i.e. 10 minutes. What are the possible reasons?

14/04/08 16:55:22 INFO mapred.JobClient: Running job: job_201404081444_0018
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Socket connection established to compute-0-13.local/10.1.255.241:22181, initiating session
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Session establishment complete on server compute-0-13.local/10.1.255.241:22181, sessionid = 0x14543567f5e0009, negotiated timeout = 600000
......
......
14/04/08 16:57:02 INFO job.JobProgressTracker: Data from 8 workers - Compute superstep 2: 0 out of 4847571 vertices computed; 0 out of 64 partitions computed; min free memory on worker 2 - 216.01MB, average 287.75MB
14/04/08 16:57:07 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14543567f5e0009, likely server has closed socket, closing socket connection and attempting reconnect
14/04/08 16:57:09 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
14/04/08 16:57:09 WARN zookeeper.ClientCnxn: Session 0x14543567f5e0009 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

  Best Regards,
  Suijian

Re: Unable to read additional data from server sessionid...

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
Hi Suijian,

The client log says that the client failed to read from the underlying
TCP socket. Maybe there was a network problem, or maybe the ZooKeeper
server the client was connected to died. It's difficult to say for
sure what happened without the server log though.
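
As a rough check on the timing, the sketch below computes how long the
connection lasted according to the two timestamps in the client log and
compares it with the negotiated timeout. The class and method names (LogGap,
between) are made up for illustration, and the timestamp pattern is an
assumption read off the log lines (yy/MM/dd HH:mm:ss).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class LogGap {
    // Assumed timestamp layout of the client log, e.g. "14/04/08 16:55:22".
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");

    // Elapsed time between two log timestamps.
    public static Duration between(String first, String second) {
        return Duration.between(LocalDateTime.parse(first, FMT),
                                LocalDateTime.parse(second, FMT));
    }

    public static void main(String[] args) {
        // "Session establishment complete" and "Unable to read additional data".
        Duration gap = between("14/04/08 16:55:22", "14/04/08 16:57:07");
        // "negotiated timeout = 600000" from the log, in milliseconds.
        Duration negotiated = Duration.ofMillis(600000);

        System.out.println("connection lasted: " + gap.getSeconds() + " s");
        System.out.println("negotiated timeout: " + negotiated.getSeconds() + " s");
        System.out.println("dropped before timeout: " + (gap.compareTo(negotiated) < 0));
    }
}
```

The connection lasted 105 seconds against a 600 second timeout, so the drop
is not the session timing out. It is consistent with the socket being closed
from the other side, which is what the "likely server has closed socket"
message reports.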

On Wed, Apr 9, 2014 at 3:04 PM, Suijian Zhou <su...@gmail.com> wrote:
> Hi, Michi,
>   I could not find more logs about it, because ZooKeeper came bundled with
> another graph processing system (I did not install ZooKeeper separately),
> and its ZooKeeper log files are all empty. Do you know any possible
> reasons for this kind of error? The server itself has been running well
> the whole time, so why did the ZooKeeper session lose its connection so
> fast? Thanks!
>
>   Best Regards,
>   Suijian

Re: Unable to read additional data from server sessionid...

Posted by Suijian Zhou <su...@gmail.com>.
Hi, Michi,
  I could not find more logs about it, because ZooKeeper came bundled with
another graph processing system (I did not install ZooKeeper separately),
and its ZooKeeper log files are all empty. Do you know any possible
reasons for this kind of error? The server itself has been running well
the whole time, so why did the ZooKeeper session lose its connection so
fast? Thanks!

  Best Regards,
  Suijian
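
With the server log empty, one client-side check is whether the ZooKeeper
process is still listening at all. The "Connection refused" at 16:57:09
usually means nothing was accepting connections on port 22181 at that
moment. The sketch below sends ZooKeeper's "ruok" four-letter command, which
a healthy server answers with "imok". The class name ZkProbe and the 3 second
timeout are made up for illustration, and the host and port are taken from
the log above. It assumes four-letter commands are enabled on the server
(newer ZooKeeper releases require them to be whitelisted).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkProbe {
    // Sends a four-letter command and returns whatever the server replies.
    public static String fourLetterWord(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            sock.setSoTimeout(timeoutMs);
            sock.getOutputStream().write(cmd.getBytes(StandardCharsets.US_ASCII));
            sock.getOutputStream().flush();
            // Half-close so the server sees the end of the request.
            sock.shutdownOutput();

            // Read until the server closes its side of the connection.
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            InputStream in = sock.getInputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII).trim();
        }
    }

    public static void main(String[] args) {
        // Defaults come from the client log; override on the command line.
        String host = args.length > 0 ? args[0] : "compute-0-13.local";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 22181;
        try {
            System.out.println(fourLetterWord(host, port, "ruok", 3000));
        } catch (IOException e) {
            System.out.println("no answer from " + host + ":" + port + " (" + e + ")");
        }
    }
}
```

Running this in a loop while the job executes would show roughly when the
server stops answering, which narrows down whether the process died or the
network path broke.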


2014-04-08 17:09 GMT-05:00 Michi Mutsuzaki <mi...@cs.stanford.edu>:

> Hi Suijian,
>
> Do you have the server-side log file?

Re: Unable to read additional data from server sessionid...

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
Hi Suijian,

Do you have the server-side log file?
