Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2014/09/19 01:43:34 UTC

[jira] [Created] (ACCUMULO-3148) TabletServer didn't get Session expired in HalfDeadTServerIT

Josh Elser created ACCUMULO-3148:
------------------------------------

             Summary: TabletServer didn't get Session expired in HalfDeadTServerIT
                 Key: ACCUMULO-3148
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3148
             Project: Accumulo
          Issue Type: Bug
          Components: test
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 1.6.1, 1.7.0


Been seeing spurious failures with HalfDeadTServerIT where the tserver doesn't get the ZK session expiration:

{noformat}
2014-09-15 09:39:59,201 [tserver.TabletServer] DEBUG: ScanSess tid 172.31.33.94:35957 !0 0 entries in 0.07 secs, nbTimes = [63 63 63.00 1] 
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
sleeping
2014-09-15 09:40:20,088 [tserver.TabletServer] FATAL: Lost tablet server lock (reason = LOCK_DELETED), exiting.
2014-09-15 09:40:20,088 [zookeeper.ZooCache] WARN : Zookeeper error, will retry
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/d0b9b8e7-9869-4b00-9ae7-317f5231f2c1/tables/1/conf/table.iterator.minc.vers.opt.maxVersions
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
	at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:261)
	at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:153)
	at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:277)
	at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:224)
	at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.get(ZooCachePropertyAccessor.java:114)
	at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.getProperties(ZooCachePropertyAccessor.java:144)
	at org.apache.accumulo.server.conf.TableConfiguration.getProperties(TableConfiguration.java:108)
	at org.apache.accumulo.core.conf.AccumuloConfiguration.iterator(AccumuloConfiguration.java:69)
	at org.apache.accumulo.core.conf.ConfigSanityCheck.validate(ConfigSanityCheck.java:40)
	at org.apache.accumulo.server.conf.ServerConfigurationFactory.getTableConfiguration(ServerConfigurationFactory.java:155)
	at org.apache.accumulo.server.conf.ServerConfiguration.getTableConfiguration(ServerConfiguration.java:69)
	at org.apache.accumulo.tserver.TabletServer.getTableConfiguration(TabletServer.java:3983)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1277)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1256)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1112)
	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1089)
	at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2935)
	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
	at java.lang.Thread.run(Thread.java:745)
2014-09-15 09:40:20,090 [tserver.TabletServer] WARN : Check for long GC pauses not called in a timely fashion. Expected every 5.0 seconds but was 16.3 seconds since last check
2014-09-15 09:40:20,477 [datanode.DataNode] ERROR: 127.0.0.1:57185:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:42146 dst: /127.0.0.1:57185
java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:467)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:771)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:718)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

It looks like the tserver killed itself after the connection loss, but before it reconnected and got the session expiration.
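
To make the suspected ordering concrete, here is a minimal sketch of the race as a toy model. It does not use the Accumulo or ZooKeeper APIs; the class, enum, and method names are all hypothetical, and the assumption that lock deletion and session expiration are the two events that stop the server is a simplification of what the log suggests.

```java
import java.util.List;
import java.util.Optional;

/** Toy model of the suspected race. All names here are hypothetical. */
final class ShutdownRace {

    /** Events the tablet server might observe, in this simplified model. */
    enum Event {
        CONNECTION_LOSS(false),  // retried, so not fatal by itself
        LOCK_DELETED(true),      // assumed fatal: the server exits
        SESSION_EXPIRED(true);   // assumed fatal: the outcome the test waits for

        final boolean fatal;

        Event(boolean fatal) {
            this.fatal = fatal;
        }
    }

    /** Returns the first fatal event in observation order, if there is one. */
    static Optional<Event> firstFatal(List<Event> observed) {
        for (Event e : observed) {
            if (e.fatal) {
                return Optional.of(e);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Ordering the test appears to expect: reconnect, then see the expiration.
        List<Event> expected = List.of(Event.CONNECTION_LOSS, Event.SESSION_EXPIRED);
        // Ordering suggested by the log above: the lock loss is noticed first,
        // so the server exits before it can observe the expiration.
        List<Event> failing = List.of(Event.CONNECTION_LOSS, Event.LOCK_DELETED);

        System.out.println(firstFatal(expected)); // prints Optional[SESSION_EXPIRED]
        System.out.println(firstFatal(failing));  // prints Optional[LOCK_DELETED]
    }
}
```

In this model the server exits on whichever fatal event it sees first, so a test that only accepts the session-expiration outcome fails whenever the lock loss wins.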



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)