You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@accumulo.apache.org by "Keith Turner (Updated) (JIRA)" <ji...@apache.org> on 2012/01/19 18:53:40 UTC

[jira] [Updated] (ACCUMULO-331) Scan of metadata table failed because tablet closed

     [ https://issues.apache.org/jira/browse/ACCUMULO-331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-331:
----------------------------------

    Description: 
The security random walk test failed to drop a table.

{noformat}
19 05:56:04,122 [randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node Security.xml
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
        at org.apache.accumulo.server.test.randomwalk.Framework.run(Framework.java:61)
        at org.apache.accumulo.server.test.randomwalk.Framework.main(Framework.java:114)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.accumulo.start.Main$1.run(Main.java:89)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.Exception: Error running node security.TableOp
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
        ... 8 more
Caused by: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:293)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:261)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.delete(TableOperationsImpl.java:508)
        at org.apache.accumulo.server.test.randomwalk.security.DropTable.dropTable(DropTable.java:66)
        at org.apache.accumulo.server.test.randomwalk.security.TableOp.visit(TableOp.java:204)
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
        ... 9 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
        at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
        at $Proxy1.waitForTableOperation(Unknown Source)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:233)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:275)
        ... 14 more
{noformat}

Looking in the master logs see the following indicating a scan failed while trying to delete the table.

{noformat}
19 05:56:04,098 [thrift.MasterClientService$Processor] ERROR: Internal error processing waitForTableOperation
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:188)
        at org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:97)
        at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:94)
        at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:55)
        at org.apache.accumulo.server.master.tableOps.TraceRepo.isReady(TraceRepo.java:50)
        at org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:62)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:579)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335)
        at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        ... 1 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing startMultiScan
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:298)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
        at $Proxy4.startMultiScan(Unknown Source)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539)
        ... 5 more
{noformat}


Looking on the tablet server where the scan failed, see the following

{noformat}

19 05:56:04,062 [tabletserver.Tablet] DEBUG: initiateClose(saveState=true queueMinC=false disableWrites=false) !0;~;!0<
19 05:56:04,062 [tabletserver.Tablet] DEBUG: completeClose(saveState=true completeClose=true) !0;~;!0<
19 05:56:04,063 [problems.ProblemReports] DEBUG: Filing problem report !0 FILE_READ /accumulo/tables/!0/table_info/F0002xng.rf

19 05:56:04,072 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< closed
19 05:56:04,072 [tabletserver.TabletServer] DEBUG: Unassigning !0;~;!0<@(null,192.168.117.13:9997[134d7425fc5940d],null)
19 05:56:04,073 [tabletserver.TabletServer] WARN : lookup failed for tablet !0;~;!0<
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020)
        at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94)
        at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89)
        at org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710)
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988)
        at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

19 05:56:04,076 [tabletserver.TabletServer] INFO : unloaded !0;~;!0<
19 05:56:04,076 [tabletserver.TabletServer] INFO : told to unload tablet that was not being served !0;~;!0<
19 05:56:04,084 [thrift.TabletClientService$Processor] ERROR: Internal error processing startMultiScan
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1321)
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.startMultiScan(TabletServer.java:1272)
        at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at cloudtrace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:58)
        at $Proxy2.startMultiScan(Unknown Source)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startMultiScan.process(TabletClientService.java:2164)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor.process(TabletClientService.java:2037)
        at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:151)
        at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:631)
        at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:199)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
        at org.apache.accumulo.server.tabletserver.TabletServer$ScanTask.get(TabletServer.java:661)
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1308)
        ... 15 more
Caused by: java.lang.RuntimeException: java.io.EOFException
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:999)
        at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        ... 1 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020)
        at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94)
        at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89)
        at org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710)
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988)
        ... 4 more
19 05:56:04,120 [tabletserver.TabletServer] INFO : Loading tablet !0;~;!0<

19 05:56:04,128 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< opened
{noformat}

The metadata tablet were being frequently loaded and unloaded as descried in ACCUMULO-329 when this happened.  

The file in question was major compacted away on another tablet server. Shortly after this.

{noformat}
19 05:57:54,187 [tabletserver.Tablet] DEBUG: Starting MajC !0;~;!0< [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf_tmp

19 05:57:54,309 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< MajC [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf
{noformat}

{noformat}
19 06:10:29,546 [gc.SimpleGarbageCollector] DEBUG: Deleting /accumulo/tables/!0/table_info/F0002xng.rf
{noformat}


The file was garbage collected later.  The GC and Majc show that the file existed while the EOF happened.  So this is probably either a bug in RFile fast skip or a bug related to the tablet closing. Probably related to the close w/ the close temporal relationship.

  was:
The security random walk test failed to drop a table.

{noformat}
19 05:56:04,122 [randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node Security.xml
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
        at org.apache.accumulo.server.test.randomwalk.Framework.run(Framework.java:61)
        at org.apache.accumulo.server.test.randomwalk.Framework.main(Framework.java:114)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.accumulo.start.Main$1.run(Main.java:89)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.Exception: Error running node security.TableOp
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
        ... 8 more
Caused by: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:293)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:261)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.delete(TableOperationsImpl.java:508)
        at org.apache.accumulo.server.test.randomwalk.security.DropTable.dropTable(DropTable.java:66)
        at org.apache.accumulo.server.test.randomwalk.security.TableOp.visit(TableOp.java:204)
        at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
        ... 9 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
        at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
        at $Proxy1.waitForTableOperation(Unknown Source)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:233)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:275)
        ... 14 more
{noformat}

Looking in the master logs see the following indicating a scan failed while trying to delete the table.

{noformat}
19 05:56:04,098 [thrift.MasterClientService$Processor] ERROR: Internal error processing waitForTableOperation
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:188)
        at org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:97)
        at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:94)
        at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:55)
        at org.apache.accumulo.server.master.tableOps.TraceRepo.isReady(TraceRepo.java:50)
        at org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:62)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:579)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335)
        at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        ... 1 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing startMultiScan
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:298)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
        at $Proxy4.startMultiScan(Unknown Source)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539)
        ... 5 more
{noformat}


Looking on the tablet server where the scan failed, see the following

{noformat}

19 05:56:04,062 [tabletserver.Tablet] DEBUG: initiateClose(saveState=true queueMinC=false disableWrites=false) !0;~;!0<
19 05:56:04,062 [tabletserver.Tablet] DEBUG: completeClose(saveState=true completeClose=true) !0;~;!0<
19 05:56:04,063 [problems.ProblemReports] DEBUG: Filing problem report !0 FILE_READ /accumulo/tables/!0/table_info/F0002xng.rf

19 05:56:04,072 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< closed
19 05:56:04,072 [tabletserver.TabletServer] DEBUG: Unassigning !0;~;!0<@(null,192.168.117.13:9997[134d7425fc5940d],null)
19 05:56:04,073 [tabletserver.TabletServer] WARN : lookup failed for tablet !0;~;!0<
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677)
        at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020)
        at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94)
        at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89)
        at org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240)
        at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
        at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607)
        at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710)
        at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988)
        at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

19 05:56:04,120 [tabletserver.TabletServer] INFO : Loading tablet !0;~;!0<

19 05:56:04,128 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< opened
{noformat}

The metadata tablet were being frequently loaded and unloaded as descried in ACCUMULO-329 when this happened.  

The file in question was major compacted away on another tablet server. Shortly after this.

{noformat}
19 05:57:54,187 [tabletserver.Tablet] DEBUG: Starting MajC !0;~;!0< [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf_tmp

19 05:57:54,309 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< MajC [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf
{noformat}

{noformat}
19 06:10:29,546 [gc.SimpleGarbageCollector] DEBUG: Deleting /accumulo/tables/!0/table_info/F0002xng.rf
{noformat}


The file was garbage collected later.  The GC and Majc show that the file existed while the EOF happened.  So this is probably either a bug in RFile fast skip or a bug related to the tablet closing. Probably related to the close w/ the close temporal relationship.

    
> Scan of metadata table failed because tablet closed
> ---------------------------------------------------
>
>                 Key: ACCUMULO-331
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-331
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: Running random walk test on 10 node cluster
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.4.0
>
>
> The security random walk test failed to drop a table.
> {noformat}
> 19 05:56:04,122 [randomwalk.Framework] ERROR: Error during random walk
> java.lang.Exception: Error running node Security.xml
>         at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
>         at org.apache.accumulo.server.test.randomwalk.Framework.run(Framework.java:61)
>         at org.apache.accumulo.server.test.randomwalk.Framework.main(Framework.java:114)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.accumulo.start.Main$1.run(Main.java:89)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.Exception: Error running node security.TableOp
>         at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253)
>         at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
>         ... 8 more
> Caused by: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:293)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:261)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.delete(TableOperationsImpl.java:508)
>         at org.apache.accumulo.server.test.randomwalk.security.DropTable.dropTable(DropTable.java:66)
>         at org.apache.accumulo.server.test.randomwalk.security.TableOp.visit(TableOp.java:204)
>         at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249)
>         ... 9 more
> Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
>         at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>         at org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
>         at org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
>         at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
>         at $Proxy1.waitForTableOperation(Unknown Source)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:233)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:275)
>         ... 14 more
> {noformat}
> Looking in the master logs see the following indicating a scan failed while trying to delete the table.
> {noformat}
> 19 05:56:04,098 [thrift.MasterClientService$Processor] ERROR: Internal error processing waitForTableOperation
> java.lang.RuntimeException: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:188)
>         at org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:97)
>         at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:94)
>         at org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:55)
>         at org.apache.accumulo.server.master.tableOps.TraceRepo.isReady(TraceRepo.java:50)
>         at org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:62)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server xxx.xxx.xxx.xxx:9997
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:579)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335)
>         at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         ... 1 more
> Caused by: org.apache.thrift.TApplicationException: Internal error processing startMultiScan
>         at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:298)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
>         at $Proxy4.startMultiScan(Unknown Source)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539)
>         ... 5 more
> {noformat}
> Looking on the tablet server where the scan failed, see the following
> {noformat}
> 19 05:56:04,062 [tabletserver.Tablet] DEBUG: initiateClose(saveState=true queueMinC=false disableWrites=false) !0;~;!0<
> 19 05:56:04,062 [tabletserver.Tablet] DEBUG: completeClose(saveState=true completeClose=true) !0;~;!0<
> 19 05:56:04,063 [problems.ProblemReports] DEBUG: Filing problem report !0 FILE_READ /accumulo/tables/!0/table_info/F0002xng.rf
> 19 05:56:04,072 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< closed
> 19 05:56:04,072 [tabletserver.TabletServer] DEBUG: Unassigning !0;~;!0<@(null,192.168.117.13:9997[134d7425fc5940d],null)
> 19 05:56:04,073 [tabletserver.TabletServer] WARN : lookup failed for tablet !0;~;!0<
> java.io.EOFException
>         at java.io.DataInputStream.readByte(DataInputStream.java:250)
>         at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249)
>         at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677)
>         at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
>         at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125)
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020)
>         at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94)
>         at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89)
>         at org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168)
>         at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607)
>         at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988)
>         at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 19 05:56:04,076 [tabletserver.TabletServer] INFO : unloaded !0;~;!0<
> 19 05:56:04,076 [tabletserver.TabletServer] INFO : told to unload tablet that was not being served !0;~;!0<
> 19 05:56:04,084 [thrift.TabletClientService$Processor] ERROR: Internal error processing startMultiScan
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1321)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.startMultiScan(TabletServer.java:1272)
>         at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at cloudtrace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:58)
>         at $Proxy2.startMultiScan(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startMultiScan.process(TabletClientService.java:2164)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor.process(TabletClientService.java:2037)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:151)
>         at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:631)
>         at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:199)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
>         at org.apache.accumulo.server.tabletserver.TabletServer$ScanTask.get(TabletServer.java:661)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1308)
>         ... 15 more
> Caused by: java.lang.RuntimeException: java.io.EOFException
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:999)
>         at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         ... 1 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readByte(DataInputStream.java:250)
>         at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249)
>         at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677)
>         at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
>         at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125)
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020)
>         at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94)
>         at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89)
>         at org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240)
>         at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92)
>         at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
>         at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168)
>         at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607)
>         at org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710)
>         at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988)
>         ... 4 more
> 19 05:56:04,120 [tabletserver.TabletServer] INFO : Loading tablet !0;~;!0<
> 19 05:56:04,128 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< opened
> {noformat}
> The metadata tablet were being frequently loaded and unloaded as descried in ACCUMULO-329 when this happened.  
> The file in question was major compacted away on another tablet server. Shortly after this.
> {noformat}
> 19 05:57:54,187 [tabletserver.Tablet] DEBUG: Starting MajC !0;~;!0< [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf_tmp
> 19 05:57:54,309 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< MajC [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] --> /table_info/A0002y0e.rf
> {noformat}
> {noformat}
> 19 06:10:29,546 [gc.SimpleGarbageCollector] DEBUG: Deleting /accumulo/tables/!0/table_info/F0002xng.rf
> {noformat}
> The file was garbage collected later.  The GC and Majc show that the file existed while the EOF happened.  So this is probably either a bug in RFile fast skip or a bug related to the tablet closing. Probably related to the close w/ the close temporal relationship.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira