Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org> on 2011/10/18 17:05:10 UTC

[jira] [Commented] (DERBY-5406) Intermittent failures in CompressTableTest and TruncateTableTest

    [ https://issues.apache.org/jira/browse/DERBY-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129781#comment-13129781 ] 

Knut Anders Hatlen commented on DERBY-5406:
-------------------------------------------

Of the two stack traces mentioned above, I see (2) more frequently than (1). (I also sometimes see other stack traces, and I suspect there may be multiple holes.)

Stack trace (2) is in fact the same problem that caused the NullPointerException fixed in DERBY-4275. The fix made it throw a StandardException instead, so that the retry logic would come into play. In some cases it actually does recover from that error, but apparently not always. Here's what I think is happening in FromBaseTable.bindNonVTITables() when this error occurs:

1) The statement is in the process of being recompiled, and it builds the table descriptor at line 2190:

		TableDescriptor tableDescriptor = bindTableDescriptor();

2) The statement's dependency on the table is registered at line 2341:

			/* This represents a table - query is dependent on the TableDescriptor */
			compilerContext.createDependency(tableDescriptor);

3) At line 2351, it discovers that the conglomerate referred to by the table descriptor no longer exists, and raises an exception:

            // Bail out if the descriptor couldn't be found. The conglomerate
            // probably doesn't exist anymore.
            if (baseConglomerateDescriptor == null) {
                throw StandardException.newException(
                        SQLState.STORE_CONGLOMERATE_DOES_NOT_EXIST,
                        new Long(tableDescriptor.getHeapConglomerateId()));
            }

Now, the conglomerate disappeared some time after the table descriptor was built, because of a compress or truncate operation. If the dependency on the table was registered before the conglomerate was removed, the compress/truncate operation will have invalidated the statement, so the retry logic knows it should try again.

If the compress/truncate operation happened after the table descriptor was built, but before the dependency was registered, the statement will not be invalidated. In that case, the retry logic does not know that an invalidation has occurred, and it won't retry the compilation.
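
To make the two orderings concrete, here is a toy single-threaded model of the window. It is only a sketch: the class and method names (InvalidationWindowSketch, Table, Statement, retryWouldHappen) are made up for illustration and are not Derby classes, and the "compress" is injected at a fixed point instead of running in a separate thread.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the window described above; none of these names are Derby APIs. */
public class InvalidationWindowSketch {

    /** Stand-in for a table whose conglomerate is replaced by compress/truncate. */
    static class Table {
        long conglomerateId = 1;
        final Set<Statement> dependents = new HashSet<>();

        /** Replaces the conglomerate and invalidates every registered dependent. */
        void compress() {
            conglomerateId++;
            for (Statement s : dependents) {
                s.valid = false;
            }
        }
    }

    /** Stand-in for a statement being (re)compiled. */
    static class Statement {
        boolean valid = true;
    }

    /**
     * Runs one compilation attempt with the compress injected either before or
     * after the dependency is registered. Returns whether the retry logic would
     * see an invalidated statement when the bind fails.
     */
    public static boolean retryWouldHappen(boolean compressBeforeRegistration) {
        Table table = new Table();
        Statement stmt = new Statement();

        long seenConglomerate = table.conglomerateId;  // step 1: descriptor is built

        if (compressBeforeRegistration) {
            table.compress();                          // lands inside the window
        }
        table.dependents.add(stmt);                    // step 2: dependency is registered
        if (!compressBeforeRegistration) {
            table.compress();                          // lands after registration
        }

        // step 3: the conglomerate the descriptor refers to is gone
        boolean conglomerateMissing = seenConglomerate != table.conglomerateId;

        // The retry only happens if the failure coincides with an invalidation.
        return conglomerateMissing && !stmt.valid;
    }

    public static void main(String[] args) {
        System.out.println("compress after registration, retry: " + retryWouldHappen(false));
        System.out.println("compress inside the window, retry:  " + retryWouldHappen(true));
    }
}
```

In this model the bind fails in both orderings, but only the first one leaves the statement marked invalid, which is the signal the retry logic depends on.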

So it looks like we need to either find a way to close the window between the calls to bindTableDescriptor() and createDependency(), or make the statement invalidate itself before it throws the exception when this happens.
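
A rough illustration of the second option, again as a toy model and not a patch: every name here (SelfInvalidationSketch, MissingConglomerateException, the valid flag, the three-attempt loop) is hypothetical, and Derby's actual retry logic is only approximated. The model assumes a single concurrent compress that hits the first compilation attempt inside the window.

```java
/** Toy model of "invalidate itself before throwing"; none of these names are Derby APIs. */
public class SelfInvalidationSketch {

    static class MissingConglomerateException extends Exception {
        MissingConglomerateException(long id) {
            super("conglomerate " + id + " does not exist");
        }
    }

    /** Stand-in for a statement; 'valid' is the flag the retry loop inspects. */
    static class Statement {
        boolean valid = true;
    }

    private long conglomerateId = 1;
    private boolean compressPending = true;  // one compress, hitting the first attempt
    private final boolean selfInvalidate;

    public SelfInvalidationSketch(boolean selfInvalidate) {
        this.selfInvalidate = selfInvalidate;
    }

    private void bind(Statement stmt) throws MissingConglomerateException {
        long seen = conglomerateId;          // descriptor is built
        if (compressPending) {               // compress lands before the dependency exists,
            conglomerateId++;                // so nobody invalidates the statement
            compressPending = false;
        }
        if (seen != conglomerateId) {
            if (selfInvalidate) {
                stmt.valid = false;          // proposed change: mark ourselves invalid first
            }
            throw new MissingConglomerateException(seen);
        }
    }

    /** Returns true if compilation eventually succeeds. */
    public boolean compile() {
        Statement stmt = new Statement();
        for (int attempt = 0; attempt < 3; attempt++) {
            stmt.valid = true;
            try {
                bind(stmt);
                return true;
            } catch (MissingConglomerateException e) {
                if (stmt.valid) {
                    return false;            // no invalidation seen: the error reaches the user
                }
                // invalidated: fall through and recompile
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("without self-invalidation: " + new SelfInvalidationSketch(false).compile());
        System.out.println("with self-invalidation:    " + new SelfInvalidationSketch(true).compile());
    }
}
```

Without the self-invalidation the first failure is reported as-is; with it, the retry loop gets a second attempt, which sees the new conglomerate and succeeds.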
                
> Intermittent failures in CompressTableTest and TruncateTableTest
> ----------------------------------------------------------------
>
>                 Key: DERBY-5406
>                 URL: https://issues.apache.org/jira/browse/DERBY-5406
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.0, 10.9.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: d5406-1a-detect-invalidation-during-compilation.diff, d5406-1b.diff
>
>
> The test cases CompressTableTest.testConcurrentInvalidation() and TruncateTableTest.testConcurrentInvalidation() fail intermittently with errors such as:
> ERROR XSAI2: The conglomerate (2,720) requested does not exist.
> The problem has been analyzed in the comments on DERBY-4275, and a patch attached to that issue (invalidation-during-compilation.diff) fixes the underlying race condition. However, that patch only works correctly together with the fix for DERBY-5161, which was backed out because it caused the regression DERBY-5280.
> We will therefore need to find a way to fix DERBY-5161 without reintroducing DERBY-5280 in order to resolve this issue.
