Posted to notifications@accumulo.apache.org by "Eric Newton (JIRA)" <ji...@apache.org> on 2014/01/02 15:17:52 UTC

[jira] [Updated] (ACCUMULO-1940) Data file in !METADATA differs from in memory data

     [ https://issues.apache.org/jira/browse/ACCUMULO-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton updated ACCUMULO-1940:
----------------------------------

    Labels: 16_qa_bug  (was: )

> Data file in !METADATA differs from in memory data
> --------------------------------------------------
>
>                 Key: ACCUMULO-1940
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1940
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.5.0
>            Reporter: Josh Elser
>            Assignee: Eric Newton
>              Labels: 16_qa_bug
>             Fix For: 1.4.5, 1.5.1, 1.6.0
>
>
> Found during CI run with agitation.
> Got the first two error messages below 5 times (presumably inside a retry-on-failure block):
> {noformat}
> Failed to do close consistency check for tablet c;79d0ab;7870a
> 	java.lang.RuntimeException: Data file in !METADATA differ from in memory data c;79d0ab;7870a  {/t-0005h1j/A0005n8k.rf=797350457 19198312, /t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, /t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u0v.rf=23410023 582233, /t-0005h1j/F0005u2p.rf=21958551 547159, /t-0005h1j/F0005u3g.rf=14395121 358893}  {/t-0005h1j/A0005n8k.rf=797350457 19198312, /t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, /t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u2p.rf=21958551 547159, /t-0005h1j/F0005u3g.rf=14395121 358893}
> 		at org.apache.accumulo.server.tabletserver.Tablet.closeConsistencyCheck(Tablet.java:2847)
> 		at org.apache.accumulo.server.tabletserver.Tablet.completeClose(Tablet.java:2780)
> 		at org.apache.accumulo.server.tabletserver.Tablet.close(Tablet.java:2658)
> 		at org.apache.accumulo.server.tabletserver.TabletServer$UnloadTabletHandler.run(TabletServer.java:2357)
> 		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 		at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 		at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 		at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Then we logged that the consistency check had failed:
> {noformat}
> Consistency check fails, retrying java.lang.RuntimeException: Failed to do close consistency check for tablet c;79d0ab;7870a
> {noformat}
> In the end, we gave up and closed the tablet anyway:
> {noformat}
> Tablet closed consistency check has failed for c;79d0ab;7870a giving up and closing
> {noformat}
> Before all of this happened, we tried to bring this tablet online on a new tserver after a failure. During the minor compaction (minc) that is part of the recovery process, we failed to get the lease on the .rf_tmp file we tried to create. This failed a couple of times, but we eventually got the tmp file we needed, the recovery process completed, and we could bring the tablet online. The difference between the in-memory version and the !METADATA version was the one flushed rfile that we created during this recovery process.
> The problem eventually fixed itself because the tablet was migrated to a different server, where we just took what was (correctly) in the !METADATA table.
> It is still unknown how the flushed RFile went missing from the DatafileManager's copy.
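
The two file maps printed in the exception are long enough that the mismatch is hard to spot by eye. The following is a minimal sketch for diffing them, not Accumulo code: the class DatafileDiff and its methods parse and onlyIn are hypothetical names introduced here. It assumes the first map in the log line is the !METADATA view and the second is the in-memory view, which matches the description above but is not confirmed by the log text itself.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Accumulo: diffs two file maps as logged.
public class DatafileDiff {

    /** Parses "{path=size entries, path=size entries}" as printed in the log line. */
    public static Map<String, String> parse(String logged) {
        Map<String, String> files = new TreeMap<>();
        String body = logged.trim();
        body = body.substring(1, body.length() - 1).trim(); // strip the braces
        if (body.isEmpty()) {
            return files;
        }
        for (String entry : body.split(",\\s*")) {
            int eq = entry.indexOf('=');
            files.put(entry.substring(0, eq), entry.substring(eq + 1));
        }
        return files;
    }

    /** Returns the file names present in first but absent from second. */
    public static Set<String> onlyIn(Map<String, String> first, Map<String, String> second) {
        Set<String> diff = new TreeSet<>(first.keySet());
        diff.removeAll(second.keySet());
        return diff;
    }

    public static void main(String[] args) {
        // Both maps copied from the exception message above.
        // Assumption: first map = !METADATA view, second map = in-memory view.
        Map<String, String> first = parse("{/t-0005h1j/A0005n8k.rf=797350457 19198312, "
            + "/t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, "
            + "/t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u0v.rf=23410023 582233, "
            + "/t-0005h1j/F0005u2p.rf=21958551 547159, /t-0005h1j/F0005u3g.rf=14395121 358893}");
        Map<String, String> second = parse("{/t-0005h1j/A0005n8k.rf=797350457 19198312, "
            + "/t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, "
            + "/t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u2p.rf=21958551 547159, "
            + "/t-0005h1j/F0005u3g.rf=14395121 358893}");

        System.out.println(onlyIn(first, second)); // prints [/t-0005h1j/F0005u0v.rf]
        System.out.println(onlyIn(second, first)); // prints []
    }
}
```

Run against the logged maps, the only difference is /t-0005h1j/F0005u0v.rf, an F-prefixed (flush) file that is present in the first map and absent from the second.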


