Posted to user@accumulo.apache.org by "Ott, Charles H." <CH...@saic.com> on 2012/10/17 22:58:29 UTC

Thread "shell" Stuck on IO

I am using a VMware ESXi 4.1 server with Cloudbase (Accumulo) on RHEL5.

I cannot start with a fresh install because I am required to use the
preconfigured image on the VM (business rules out of my hands).

Unfortunately, support for this preconfigured instance is not
available, and I am tasked with getting it working anyway.

 

I am able to log into the shell and view the tables. However, if I
attempt to create a table or perform a scan, the shell prints a newline
and then hangs until it finally reports the following warning:

WARN thread "shell" stuck on IO to ssdev:9999:9999 (0) for at least 120044 ms.

 

I did also discover that 9999 is the property: master.port.client in my
conf/accumulo-site.xml

 

There is also an event log that was added to the VM with web based UI
reporting:



Unable to recover 192.168.0.130:11224/b4da830b-8ecb-4868-a480-35a39f4af17a(java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out)
        java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
                at cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.java:75)
                at cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.startCopy(CoordinateRecoveryTask.java:109)
                at cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.access$400(CoordinateRecoveryTask.java:93)
                at cloudbase.server.master.CoordinateRecoveryTask.recover(CoordinateRecoveryTask.java:279)
                at cloudbase.server.master.Master$TabletGroupWatcher.run(Master.java:1155)
        Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
                at cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:428)
                at cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:415)
                at cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:392)
                at cloudbase.core.util.ThriftUtil.getClient(ThriftUtil.java:58)
                at cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.java:73)
                ... 4 more
        Caused by: java.net.ConnectException: Connection timed out
                at sun.nio.ch.Net.connect(Native Method)
                at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:500)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:81)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:65)
                at cloudbase.core.util.TTimeoutTransport.create(TTimeoutTransport.java:23)
                at cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:426)
                ... 8 more
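
The trace above ends in a connect timeout to the address named in the "Unable to recover" message. One way to confirm that address is unreachable is sketched below. It assumes bash, the coreutils "timeout" command, and that the message sits on a single line of the log. The function names extract_logger_addr and probe_tcp are hypothetical helpers, not part of Cloudbase or Accumulo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print the host:port that
# follows "Unable to recover" (the address the master tried to reach).
extract_logger_addr() {
  sed -n 's|^Unable to recover \([0-9.]*:[0-9]*\)/.*|\1|p'
}

# Hypothetical helper: attempt a TCP connect with a short timeout, so a dead
# address fails in seconds. Relies on bash's /dev/tcp redirection.
probe_tcp() {
  local host="$1" port="$2" secs="${3:-5}"
  if timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port"
  fi
}

# Example (not run here): probe_tcp 192.168.0.130 11224
```

If the probe reports the address as unreachable while the host's current IP is different, the stale address is likely stored somewhere other than the local configuration files.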

 

 

I have seen posts relating this to the walogs folder not being
available, but I have checked that and the .lock file is being created
automatically.

Running "netstat | grep 9999" as root shows nothing using port 9999
before I log into the shell, so I don't think there is a port conflict
either.
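
A caveat on the check above: plain netstat omits listening sockets by default, so an empty result does not show whether the master is listening on 9999. A minimal sketch of a listener check follows, assuming the net-tools column layout printed by "netstat -tln". The function name listeners_on_port is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter "netstat -tln" output (on stdin) down to the
# LISTEN rows whose local address ends in ":<port>".
listeners_on_port() {
  local port="$1"
  awk -v p=":$port" '
    $NF == "LISTEN" && substr($4, length($4) - length(p) + 1) == p
  '
}

# Example (not run here): netstat -tln | listeners_on_port 9999
```

No output from this filter would suggest that nothing is listening on the port, which is a different condition from a port conflict.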

 

Any thoughts on the matter would be greatly appreciated. 


RE: Thread "shell" Stuck on IO

Posted by "Ott, Charles H." <CH...@saic.com>.
Just did exactly that. But didn't realize I needed to kill a process
first...

 

# cloudbase.sh init
18 15:01:36,694 [util.Initialize] INFO : Hadoop Filesystem is hdfs://xxxxx:xxxx
18 15:01:36,704 [util.Initialize] INFO : Cloudbase data dir is /cloudbase
18 15:01:36,704 [util.Initialize] INFO : Zookeeper server is xxxxx:xxxx
Instance name : XXX
Enter initial password for root: **************
Confirm initial password for root: **************
18 15:02:00,585 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18 15:02:00,834 [security.ZKAuthenticator] INFO : Initialized root user with username: root at the request of user !SYSTEM

# cbshell
Enter current password for 'root'@'XXX': **************
18 15:02:17,468 [impl.ServerClient] WARN : Failed to find an available server in the list of servers: []
18 15:02:17,468 [shell.Shell] ERROR: cloudbase.core.client.CBException: org.apache.thrift.transport.TTransportException: Failed to connect to a server

# cbshell
Enter current password for 'root'@'XXX': **************
18 15:02:33,387 [impl.ServerClient] WARN : Failed to find an available server in the list of servers: []
18 15:02:33,387 [shell.Shell] ERROR: cloudbase.core.client.CBException: org.apache.thrift.transport.TTransportException: Failed to connect to a server

# reboot

 

^trying a reboot...

 

From: user-return-1500-CHARLES.H.OTT=saic.com@accumulo.apache.org
[mailto:user-return-1500-CHARLES.H.OTT=saic.com@accumulo.apache.org] On
Behalf Of Eric Newton
Sent: Thursday, October 18, 2012 11:04 AM
To: user@accumulo.apache.org
Subject: Re: Thread "shell" Stuck on IO

 

$ pkill -f accumulo.start

$ hadoop fs -rmr /accumulo

$ ./bin/accumulo init

 

-Eric

 

On Thu, Oct 18, 2012 at 10:46 AM, Ott, Charles H.
<CH...@saic.com> wrote:

I apologize for not giving more information from the start.

 

I am running a single instance on a single virtual server.  Zookeeper
shows a single server ssdev:2181 in 'standalone' mode.

 

This is a development system and there are no tables at this time.  The
IP conflict issue was noticed when I tried to create a table for the
first time and the shell started to hang.

 

I have tried restarting the system but have been seeing the message:
"Recovery of 192.168.0.130:11224:[some UUID] failed." And the shell
still hangs when performing a scan or createtable.

 

I will look into "re-initializing" the server.

 

From: user-return-1496-CHARLES.H.OTT=saic.com@accumulo.apache.org
[mailto:user-return-1496-CHARLES.H.OTT=saic.com@accumulo.apache.org] On
Behalf Of Eric Newton
Sent: Thursday, October 18, 2012 7:41 AM


To: user@accumulo.apache.org
Subject: Re: Thread "shell" Stuck on IO

 

The reference to 192.168.0.130 is in zookeeper or the metadata table.

 

Unfortunately, this is a known problem with 1.3 and 1.4.  You can't
change your IP addresses.  You can incrementally shut down servers and
change the IP address one at a time, but not all at once.

 

If this is a dev system and you don't need the data, the fastest thing
to do is to reset the system and re-load your test data.

 

If you can't reload your data, you will have to move your data in hdfs,
re-initialize and bulk-import the existing tables.

 

-Eric

 

On Wed, Oct 17, 2012 at 5:40 PM, Ott, Charles H.
<CH...@saic.com> wrote:

I believe you have already helped me get on the right track...

First, 192.168.0.130 is the IP that the VM came with preconfigured.
I changed the IP for this new environment in RHEL5 and "most" everything
seems to be running... however, the fact that it is reporting
192.168.0.130 tells me that somewhere in the logger configuration it's
still using the old IP?

All of the properties files I have looked at specify the hostname, not
the IP. I checked the hosts file and the hostname resolves to the proper
IP, so that shouldn't be an issue.
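
The hosts-file check described above can be scripted roughly as follows. This is a sketch that only reads a hosts-format file; it does not consult DNS or any other resolver source. The function name hosts_lookup is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first address that a hosts-format file maps
# to the given name (matching the canonical name or any alias).
hosts_lookup() {
  local name="$1" file="${2:-/etc/hosts}"
  awk -v n="$name" '
    { sub(/#.*/, "") }   # drop comments; awk re-splits the fields afterwards
    { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
  ' "$file"
}

# Example (not run here): hosts_lookup ssdev
# On systems with glibc, "getent hosts ssdev" shows what the resolver returns.
```

A correct answer here only rules out the hosts file. It says nothing about addresses that a running service recorded earlier.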

When I try to start the logger with:

# ./cloudbase.sh logger

 I see:
Failed to initialize log service args=[]
        java.io.IOException: Failed to acquire lock file
                at cloudbase.server.logger.LogService.<init>(LogService.java:122)
                at cloudbase.server.logger.LogService.main(LogService.java:83)
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                at java.lang.reflect.Method.invoke(Method.java:597)
                at cloudbase.start.Main$1.run(Main.java:73)
                at java.lang.Thread.run(Thread.java:662)


-----Original Message-----
From: user-return-1492-CHARLES.H.OTT=saic.com@accumulo.apache.org
[mailto:user-return-1492-CHARLES.H.OTT=saic.com@accumulo.apache.org] On
Behalf Of Keith Turner
Sent: Wednesday, October 17, 2012 5:09 PM
To: user@accumulo.apache.org
Subject: Re: Thread "shell" Stuck on IO

Is the logger at 192.168.0.130 running?  The stack trace indicates
that the master was attempting to contact the logger at 192.168.0.130 to
initiate log recovery.


RE: Thread "shell" Stuck on IO

Posted by "Ott, Charles H." <CH...@saic.com>.
A reboot ended up fixing it after deleting the dfs and re-initializing
post IP change.

 

Thanks!

 

 


Re: Thread "shell" Stuck on IO

Posted by Eric Newton <er...@gmail.com>.
$ pkill -f accumulo.start
$ hadoop fs -rmr /accumulo
$ ./bin/accumulo init

-Eric
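
The three commands above are destructive: they delete all table data in the instance. Below is a sketch of a dry-run wrapper that prints the commands before anything is executed. The function name reset_instance and its arguments are hypothetical. On the Cloudbase install discussed in this thread, the data directory reported by init was /cloudbase and the init command was "cloudbase.sh init", so the commands would need adjusting there.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replay the reset steps from this thread.
#   $1 = HDFS data directory to remove (e.g. /accumulo)
#   $2 = runner; defaults to "echo" (dry run). Pass "env" to really execute.
reset_instance() {
  local data_dir="$1" run="${2:-echo}"
  "$run" pkill -f accumulo.start     # stop any running server processes
  "$run" hadoop fs -rmr "$data_dir"  # remove the instance data from HDFS
  "$run" ./bin/accumulo init         # re-initialize the instance
}

# Dry run: prints the three commands without executing them.
reset_instance /accumulo
```

Reviewing the dry-run output first gives a chance to confirm the data directory before anything is removed.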


RE: Thread "shell" Stuck on IO

Posted by "Ott, Charles H." <CH...@saic.com>.
I apologize for not giving more information from the start.

 

I am running a single instance on a single virtual server.  Zookeeper
shows a single server ssdev:2181 in 'standalone' mode.

 

This is a development system and there are no tables at this time.  The
IP conflict issue was noticed when I tried to create a table for the
first time and the shell started to hang.

 

I have tried restarting the system but have been seeing the message:
"Recovery of 192.168.0.130:11224:[some UUID] failed." And the shell
still hangs when performing a scan or createtable.

 

I will look into "re-initializing" the server.

 


On Wed, Oct 17, 2012 at 4:58 PM, Ott, Charles H.
<CH...@saic.com> wrote:
> I am using a VMware ESXi 4.1 server  with Cloudbase(Accumulo)  on
RHEL5.
>
> I cannot start with a fresh install because I am somewhat required to
> use the preconfigured image on the vm. (business rules out of my
> hands)
>
> Unfortunately the support for this preconfigured instance is not

> available and I am tasked with getting it working anyway...

>
>
>
> I am able to log into the shell and view the tables, however if  I
> attempt to create a table or perform a scan, a line return is shown
> and then it just hangs there until finally throwing the following
error:
>
> WARN thread "shell" stuck on IO to ssdev:9999:9999 (0) for at least
> 120044 ms.
>
>
>
> I did also discover that 9999 is the property: master.port.client in
> my conf/accumulo-site.xml
>
>
>
> There is also an event log that was added to the VM with web based UI
> reporting:
>
> Unable to recover
>
192.168.0.130:11224/b4da830b-8ecb-4868-a480-35a39f4af17a(java.io.IOExcep
tion:
> org.apache.thrift.transport.TTransportException:
java.net.ConnectException:
> Connection timed out)
>
>          java.io.IOException:
> org.apache.thrift.transport.TTransportException:
java.net.ConnectException:
> Connection timed out
>
>                  at
> cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.jav
> a:75)
>
>                  at
> cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.startCopy(C
> oordinateRecoveryTask.java:109)
>
>                  at
> cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.access$400(
> CoordinateRecoveryTask.java:93)
>
>                  at
> cloudbase.server.master.CoordinateRecoveryTask.recover(CoordinateRecov
> eryTask.java:279)
>
>                  at
> cloudbase.server.master.Master$TabletGroupWatcher.run(Master.java:1155
> )
>
>          Caused by: org.apache.thrift.transport.TTransportException:
> java.net.ConnectException: Connection timed out
>
>                  at
> cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(Thri
> ftTransportPool.java:428)
>
>                  at
> cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTran
> sportPool.java:415)
>
>                  at
> cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTran
> sportPool.java:392)
>
>                  at
> cloudbase.core.util.ThriftUtil.getClient(ThriftUtil.java:58)
>
>                  at
> cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.jav
> a:73)
>
>                  ... 4 more
>
>          Caused by: java.net.ConnectException: Connection timed out
>
>                  at sun.nio.ch.Net.connect(Native Method)
>
>                  at
> sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:500)
>
>                  at
> sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:81)
>
>                  at
> sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:65)
>
>                  at
> cloudbase.core.util.TTimeoutTransport.create(TTimeoutTransport.java:23
> )
>
>                  at
> cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(Thri
> ftTransportPool.java:426)
>
>                  ... 8 more
>
>
>
>
>
> I have seen posts relating this to the walogs folder not being
> available, but I have checked that and the .lock file is being created
automatically.
>
> A #netstat | grep 9999 shows no processes using 9999 before logging

> into the shell... so Im not sure there is a port conflict either.

>
>
>
> Any thoughts on the matter would be greatly appreciated.

 


Re: Thread "shell" Stuck on IO

Posted by Eric Newton <er...@gmail.com>.
The reference to 192.168.0.130 is stored in ZooKeeper or the metadata table.

Unfortunately, this is a known problem with 1.3 and 1.4: you can't change
all of your IP addresses at once.  You can shut down servers incrementally
and change the IP addresses one at a time, but not all at once.

If this is a dev system and you don't need the data, the fastest thing to
do is to reset the system and re-load your test data.

If you can't reload your data, you will have to move your data in HDFS,
re-initialize, and bulk-import the existing tables.

-Eric



RE: Thread "shell" Stuck on IO

Posted by "Ott, Charles H." <CH...@saic.com>.
I believe you have already helped me get on the right track...

First, 192.168.0.130 is the IP that the VM came preconfigured with.
I changed the IP for this new environment in RHEL5, and "most" everything
seems to be running. However, the fact that it is reporting
192.168.0.130 tells me that somewhere in the logger configuration it's
still using the old IP?

All of the properties files I have looked at specify the hostname, not
the IP. I checked the hosts file, and the hostname resolves to the proper
IP, so that shouldn't be an issue.
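
The hostname check described above can also be done programmatically. The following is only a minimal sketch in Python: the helper name `resolves_to` and the address `10.0.0.5` are made up for illustration (the thread never states the new IP), while `ssdev` is the hostname from the shell warning.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Check whether hostname resolves to expected_ip.

    Returns (matches, actual_ip). actual_ip is None when the lookup fails.
    The resolver argument exists only so the check can be exercised
    without touching the real hosts file or DNS.
    """
    try:
        actual_ip = resolver(hostname)
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False, None
    return actual_ip == expected_ip, actual_ip


if __name__ == "__main__":
    # "10.0.0.5" is a placeholder for the VM's new address (an assumption).
    ok, actual = resolves_to("ssdev", "10.0.0.5")
    print("ssdev resolves to", actual, "(expected)" if ok else "(NOT expected)")
```

A matching result only shows that this machine's resolver is correct. It says nothing about addresses that a running service may have recorded elsewhere.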

When I try to start the logger with:

# ./cloudbase.sh logger

I see:
Failed to initialize log service args=[]
	java.io.IOException: Failed to acquire lock file
		at cloudbase.server.logger.LogService.<init>(LogService.java:122)
		at cloudbase.server.logger.LogService.main(LogService.java:83)
		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
		at java.lang.reflect.Method.invoke(Method.java:597)
		at cloudbase.start.Main$1.run(Main.java:73)
		at java.lang.Thread.run(Thread.java:662)
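
The thread does not establish why the lock could not be acquired. As general background only, the Python sketch below shows how an exclusive, non-blocking file lock behaves when something else already holds it. It is not Cloudbase's code. The helper name `try_lock` and the temporary `.lock` path are assumptions for illustration, and it uses POSIX `flock`, which may differ from what the logger actually uses.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on path.

    Returns the open file descriptor on success (the lock is held until
    it is closed), or None if another holder already has the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock; give up instead of waiting.
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), ".lock")  # illustrative path
    first = try_lock(lock_path)
    second = try_lock(lock_path)  # fails while `first` is still open
    print("first holder got lock:", first is not None)
    print("second holder got lock:", second is not None)
    os.close(first)
```

If this mechanism applies, checking whether a logger process is already running before starting a second one would be a reasonable first step.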


Re: Thread "shell" Stuck on IO

Posted by Keith Turner <ke...@deenlo.com>.
Is the logger at 192.168.0.130 running? The stack trace indicates
that the master was attempting to contact the logger at 192.168.0.130
to initiate log recovery.
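
One quick way to answer that question from the master's host is to attempt a TCP connection to the address in the event log, 192.168.0.130:11224. This is a minimal Python sketch, not an Accumulo tool. The helper name `can_connect` and the 5-second timeout are assumptions for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Address and port taken from the "Unable to recover" message.
    print("logger reachable:", can_connect("192.168.0.130", 11224))
```

A timeout here, rather than an immediate refusal, would be consistent with the "Connection timed out" in the stack trace, which is typical when nothing answers at that address at all.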
