You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2011/03/08 18:47:01 UTC

[jira] Created: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
---------------------------------------------------------------------------------------------------

                 Key: DERBY-5108
                 URL: https://issues.apache.org/jira/browse/DERBY-5108
             Project: Derby
          Issue Type: Bug
          Components: Test
    Affects Versions: 10.8.0.0
         Environment: Windows platforms.
            Reporter: Kristian Waagan


The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.

For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
(truncated paths)
testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
<assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
<assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
<assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
used 205814 ms F.

Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5108:
-----------------------------------

    Attachment: derby-5108-3a-istat_log_abort.diff

Attaching patch 3a, which adjusts a log message if a scan is aborted.

Committed to trunk with revision 1133317.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-5108:
----------------------------------

    Attachment: javacore.20110309.125807.4048.0001.txt

snapshot of system looping while trying to delete last file in the seg0 directory.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005562#comment-13005562 ] 

Kristian Waagan commented on DERBY-5108:
----------------------------------------

Thanks for looking into this, Mike.
If you enable istat logging and tracing, you should be able to see which exception/error is raised when the istat daemon is shut down (i.e. is it always being raised in the same place?)

Kathey, even though I agree this could be extremely problematic, there are some points to consider:
 o the istat daemon must be doing work as the database is shut down
 o the bug is timing dependent (it doesn't fail every time on the regression test machines, nor on my Windows Vista quad core)
 o the user must try to delete the database files
 o affects Windows platforms only

With the points above, this issue is still not a blocker for me, especially since the feature can be disabled. I still think this issue is the most important istat bug to fix, so it would be great if it made it into 10.8.

It is unclear to me at this point what happens resource-wise if you just skip the delete-step, reboot the database and repeat the sequence (without killing the JVM).
One thing that will definitely make it a blocker is if the database can't be booted and/or be written into after the bug occurs (again, without killing the JVM, as would typically by the case in appserver environments).

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007247#comment-13007247 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

> if we go a reasonable amount of time without seeing it in nightly's or reported by anyone

Didn't you say you still saw the problem in one out of five runs with the fix?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Myrna van Lunteren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Myrna van Lunteren closed DERBY-5108.
-------------------------------------

    Resolution: Fixed

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.8.1.6, 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan closed DERBY-5108.
----------------------------------


Closing issue.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.8.1.5, 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006492#comment-13006492 ] 

Knut Anders Hatlen edited comment on DERBY-5108 at 3/14/11 4:28 PM:
--------------------------------------------------------------------

Thanks for testing the patch, Lily!

> Does this approach fulfill the top down shutdown supported by the architecture?

Yes, I think so. The istat service is shut down from the data dictionary, which is top-down. That was done before as well, but the actual worker thread would be left running until it finished by itself. With the patch, stop() won't complete until the worker thread is done, so when the higher levels proceed in the shutdown code, we know that the istat thread won't be causing any bad interactions.

      was (Author: knutanders):
    Thanks for testing the patch, Lily!

> Does this approach fulfill the top down shutdown supported by the architecture?

Yes, I think so. The istat service is shut down from the data dictionary, which is top-down. That was done before as well, but the actual worker thread would be left running until it finished by itself. With the patch, stop() won't complete until the worker thread is done, so when the higher leveles proceed in the shutdown code, we know that the istat thread won't be causing any bad interactions.
  
> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045411#comment-13045411 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

I was just wondering about the code ca line 339-342 in TransactionResourceImpl.. could that bypass it,? Note that
this is part of the old interrupt machinery to close down threads when shutting down the engine. But presumably by then you have already shut down the thread. Depends what happens first, the shutdown of the statistics  thread or the call to notifyAllActiveThreads, but I see in DataBaseContextImpl#cleanupOnError you have inserted the logic *before* the call to notifyAllActiveThreads so you should be golden.



> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006010#comment-13006010 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

Engine shutdown interrupts all threads that are active inside Derby code. I'm wondering if that's the interrupt the istat thread is experiencing. DERBY-4920 was similar, only then it was a checkpoint that was interrupted by the engine shutdown and reopened the channel. I don't know if we want to handle interrupts differently for the istat daemon than we do for other threads. I'd rather see that the reopening logic in store knew that it shouldn't attempt to reopen channels if the engine is about to shut down. I thought it already had some logic to take care of that, but I'm not sure.

Waiting for the istat thread before completing shutdown sounds OK. I think the reason why it doesn't do that now, was a concern that it may take a very long time if it's processing a large index. But shutdown also performs a checkpoint, which may take a long time if the page cache is big. And if the istat thread is also interrupted by the engine shutdown, it should stop rather quickly (if we only can make it stop reopening the channel).

I'm wondering, though, if this problem is istat-specific, or if we would see the same problem if we called SYSCS_UPDATE_STATISTICS on a large index just before shutting down the database?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Assigned: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali reassigned DERBY-5108:
-------------------------------------

    Assignee:     (was: Mike Matrigali)

I am not actively working on this any more.  I think it should be left open until after istat is re-enabled or the istat tests are changed to run even if istat is disabled by default.   Then should be closed if we go a reasonable amount of time without seeing it in nightly's or reported by anyone.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046878#comment-13046878 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

Re cleanUpOnError: I believe the Throwable can be an SQLException, but essentially anything. The InterruptedExceptions
are handled, I think, before we get this far, although I think I saw a couple of unhandled wait's the other day which is on my list of things to check. 

But there could be other unhandled Exceptions and Errors as well, cf. the call to DCI#cleanUpOnError from ContextManager#cleanupOnError.called from TRI#handleException called from EmbedConnection#handleException which is called from many API methods in catch-all (Throwable) constructs.

EmbedConnection#handleException is also called from ConnectionChild#handleException which is used from ResultSet API methods etc...

Note the call to wrap "other" exceptions in the call TRI#warpInSQLException towards the end of TRI#handleException.


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006891#comment-13006891 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

There may be other issues than the istat daemon coming into play here. I see that RawStore.stop() doesn't wait for the checkpoint to complete if there already is a checkpoint in progress, so there may be similar problems in that code. A related problem with checkpoints reopening containers during shutdown was fixed in DERBY-4920.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-5108:
--------------------------------------

    Attachment: waitOnShutdown.diff

I don't think we need to use interrupts to shut down the istat thread more quickly. IndexStatisticsDaemonImpl.stop() sets a flag daemonDisabled that we can check every now and then while scanning the indexes. Currently, the flag is checked right before the scan starts, but I was thinking we could check that flag for example each time we've fetched a new chunk of rows from store. The attached patch does just that, and also makes stop() wait for the istat thread to complete if there is an active thread.

Does that help on the machine where you'll able to reproduce the problem, Mike?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004812#comment-13004812 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

This issue seems to be reproducible in my environment.  3 out of 3 times so far just running the AutomaticIndexStatisticsTest alone - I will try more later.  As a test I bumped up the retry to 100 and went to lunch and came back to the test still stuck trying to delete the same file.  So it seems likely a bug in the code rather than not enough wait time in the test.  I am going to attach a java core I took while it was looping.  I see a derby.rawStoreDaemon but have not debugged enough to know if this is expected or not in this case.  I don't know if the harness when running just this test case might still create the default db and leave it up while then creating and starting a separate one for the single use database case that this test uses.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004683#comment-13004683 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

more detail on the 3/7 weme occurrence:
http://people.apache.org/~myrnavl/derby_test_results/main/windows/testlog/weme6.2/1079089-suites.All_diff.txt
There was 1 failure:
1) testShutdownWhileScanningThenDelete(org.apache.derbyTesting.functionTests.tests.store.AutomaticIndexStatisticsTest)junit.framework.AssertionFailedError: Failed to delete 3 files (root=C:\jartest\JarResults.2011-03-07\weme6.2_suites.All\system\singleUse\copyShutdown: C:\jartest\JarResults.2011-03-07\weme6.2_suites.All\system\singleUse\copyShutdown\seg0\c481.dat (isDir=false, canRead=true, canWrite=true, size=22351872), C:\jartest\JarResults.2011-03-07\weme6.2_suites.All\system\singleUse\copyShutdown\seg0 (isDir=true, canRead=true, canWrite=true, size=0), C:\jartest\JarResults.2011-03-07\weme6.2_suites.All\system\singleUse\copyShutdown (isDir=true, canRead=true, canWrite=true, size=0)
	at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertDirectoryDeleted(BaseJDBCTestCase.java:1526)
	at org.apache.derbyTesting.functionTests.tests.store.AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete(AutomaticIndexStatisticsTest.java:188)
	at java.lang.reflect.AccessibleObject.invokeV(AccessibleObject.java:195)
	at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:112)
	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
	at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
	at junit.extensions.TestSetup.run(TestSetup.java:23)
	at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
	at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
	at junit.extensions.TestSetup.run(TestSetup.java:23)



> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Assigned: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali reassigned DERBY-5108:
-------------------------------------

    Assignee: Mike Matrigali

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-5108:
---------------------------------

    Issue & fix info: [Patch Available]

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Rick Hillegas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Hillegas updated DERBY-5108:
---------------------------------

    Urgency: Normal  (was: Blocker)

Reducing the urgency of this issue. Now that istat is not turned on by default for 10.8.1, this issue is not a blocker for the release.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-5108:
---------------------------------

    Attachment: derby-5108-2.diff

Attaching a small fix to handle the case of the thread stopping the daemon being interrupted while doing the IndexStatisticsDaemonImpl#join introduced in this issue: note the interrupt and retry the join. I think this is more in keeping with what we do elsewhere now.

Running regressions.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006542#comment-13006542 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

thanks for the work knut.  Your patch looks good to me so far, I am running my single test now.

Even though update statistics in a user thread exhibits the same symptom I would argue that it is much more serious in a default on istat release.  
A user shutting down
while actively running can work around the issue.  But the background process is not in their control (other than disabling it entirely).  So doing more
work to behave nicely during shutdown in istat important in my opinion.  We should probably log a separate JIRA for your update statistics test case.  I
assume other work by user threads probably has a similar problem.  It would be good to know if this is a regression caused by the improved interrupt
handling.

You bring up some interesting thoughts on interrupts and shutdown.  I think we should start a thread and discuss that separately from this JIRA, deciding first
 what is the right thing to do then how to do it.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5108:
-----------------------------------

    Issue & fix info:   (was: [Patch Available])
       Fix Version/s: 10.9.0.0

Thanks for looking at the patch, Dag.

I think your reasoning is sound.
Do you happen to know if other exceptions than StandardExceptions can be passed in to cleanupOnError? What happens with InterruptedExceptions?
I assume the normal case is that all exceptions are either StandardException or other exceptions wrapped in a StandardException such that we can ask for the severity. In any case, if another kind of exception is passed into DatabaseContextImpl.cleanupOnError the code to shut down the istat daemon won't be run.

The test case testShutdownWhileScanningThenDelete ran 236 times without a failure on a Windows machine with the patch applied (I then aborted the loop).

Committed patch 1b to trunk with revision 1133304.
I'll monitor the automated tests (Windows platforms) to see if the issue shows itself again with this fix.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Myrna van Lunteren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Myrna van Lunteren updated DERBY-5108:
--------------------------------------

    Component/s:     (was: Test)
                 Services

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.8.1.6, 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043895#comment-13043895 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

Thanks, Kristian. 

I see that you put the call to disableIndexStatsRefresher at one point in  EmbedConnection's constructor correspoding to a requested shutdown. Database shutdown is also initiated from other locations, e.g from TransactionResourceImpl#handleException and others (many to do with replication handling). Is there no need to similarly stop the stat daemon when the db is shut down from these other locations?


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007279#comment-13007279 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

yesterday I tried exactly 5 runs using the exact submitted patch and it failed once.  I then modified the patch, mostly adding comments and slightly different break structure.  I then ran it exactly 40 times and it did not reproduce.  I then ran full set of tests in a codeline with istat still enabled as default and they all passed and checked it in - but had to update my codeline to do the submit.  I have not run it since.  It is a little harder to run since istat does not run by default - which was why I was looking for a checked in solution to running the tests no matter what the default happened to be.  I just have not gotten around to hacking a codeline or test run to do it since.  I wish the disabling had waited and happened solely in the branch if necessary, so that we could be getting testing on istat in the trunk now.  



> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004107#comment-13004107 ] 

Kathey Marsden edited comment on DERBY-5108 at 3/8/11 6:12 PM:
---------------------------------------------------------------

I saw this on weme 6.2  10.8.0.1 alpha - (1079089) 3/7/2011
 

      was (Author: kmarsden):
    I saw this on weme 6.2 0.8.0.1 alpha - (1079089) 3/7/2011
 
  
> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005680#comment-13005680 ] 

Lily Wei commented on DERBY-5108:
---------------------------------

+1 to making this a blocker for this application shuts down the database then delete the database is problematic on Windows.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006941#comment-13006941 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

I added some println statements to TopService.shutdown() to see in which order the modules were taken down. I found that BaseDataFileFactoryJ4 was stopped before DataDictionaryImpl (which also stops the istat daemon). BaseDataFileFactoryJ4 shuts down the page cache and the container cache, so ideally no I/O should happen after that point. Since we don't stop the istat thread until after the caches have been shut down, that may explain why we still see that files are being reopened. I think it's the container cache that closes the files.

BaseMonitor sends an interrupt to all active threads before invoking TopService.shutdown(), so maybe what's happening is something like this:

1) BaseMonitor interrupts the threads and goes on shutting down the modules.

2) The istat thread notices the interrupt inside the store code and tries reopening the container that was closed by the interrupt. But before it has completed that task, ...

3) The data file factory is shut down, and all data files are closed when the container cache shuts down.

4) The istat thread finishes the reopening of the index file.

5) DataDictionaryImpl stops the istat thread, but too late.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004107#comment-13004107 ] 

Kathey Marsden commented on DERBY-5108:
---------------------------------------

I saw this on weme 6.2 0.8.0.1 alpha - (1079089) 3/7/2011
 

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5108:
-----------------------------------

    Attachment: derby-5108-2a-early_istats_shutdown_broad.diff

Hi Dag,

There is a need to stop the istat daemon in other cases too to guarantee that this issue will never happen. There are several criteria that must occur for the issue to happen, and a few more to be able to observe it. Nevertheless, I think the takeaway is that Derby will leak file handles. How/if this affects normal Derby operation is unclear to me, and a JVM restart will clean this up.

Attaching patch 2a.
To make the fix broader, I moved the logic to stop the daemon to DatabaseContextImpl.cleanupOnError.
Do you reckon this is sufficient, or are you aware of other location that will bypass that code?

The regression tests ran without failures with patch 2a, but I have not yet tested this on Windows (I'll try to start a test loop later today when I can access a Windows machine).

Patch ready for review. 

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5108:
-----------------------------------

    Fix Version/s: 10.8.1.5

Merged missing fixes (2a and 3a) to 10.8 with revision 1140583.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.8.1.5, 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010745#comment-13010745 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

Committed patch as svn 1085027.

Thanks for looking at this, Knut. I thought it should restore the flag, I'll investigate it.


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006567#comment-13006567 ] 

Mike Matrigali edited comment on DERBY-5108 at 3/14/11 6:58 PM:
----------------------------------------------------------------

In my environment just got 4 for 4 errors without the patch and 4 for 4 sucesses with the patch.  I will submit a fix once I can get a clean full test run, probably late night pacific time.

4 out of 4 worked and 5th failed.  So fix is better but something still going on, will continue working ...

      was (Author: mikem):
    In my environment just got 4 for 4 errors without the patch and 4 for 4 sucesses with the patch.  I will submit a fix once I can get a clean full test run, probably late night pacific time.
  
> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan resolved DERBY-5108.
------------------------------------

    Resolution: Fixed

Thanks, Dag.

Although there seems to be situations where the code shutting down the istat daemon as part of the cleanup is bypassed, I plan to leave the fix as it is for now.
As far as I understand, non-StandardException exceptions are supposed to be wrapped (although I haven't verified if this happens before or during/after the cleanup), and unless the exception brings down the database/system there the daemon isn't supposed to be shut down. I think it should be fairly easy to change this at a later time if we find the current fix inadequate.

I haven't seen any more occurrences of the bug so far, but I'll continue to monitor the test results for a while longer before closing. I'd also like to see if the error shows up in the 10.8 branch testing.

Will backport to 10.8 later.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005899#comment-13005899 ] 

Mike Matrigali edited comment on DERBY-5108 at 3/12/11 12:14 AM:
-----------------------------------------------------------------

The more I look at this issue I think the problem is that the istat daemon should shutdown and not return until it has completed this 
shutdown when indexRefresher.stop(); is called from the DataDictionary's stop call().  For a clean shutdown of the system the store
needs all it's clients shutdown first and then it can cleanly shutdown, and force the database files and transaction logs insuring 
a clean shutdown with no recovery work necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I don't think can be solved.  We might fix a specific one shown
up by this test but the system is just not designed to handle clean shutdown while stuff is still running without first waiting for the running
stuff to stop somehow.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking when writing the code. Seems there are several error-handling issues to iron out though...
>A few specific comments:
>o I decided to not make Derby wait for the background thread to finish on shutdown, as it might potentially be scanning a very large table.
>o Logging is rather verbose now during testing, but I agree it should be less verbose (or maybe turned off completely) when released.
>o I'm logging a lot of exceptions to aid testing/debugging. These should also go away, or be enabled by a property if the user wishes to do so. 

I now think that it was wrong to not wait for the background thread.  This would match the behavior of the rawStoreDaemon thread which is "owned"
by the raw store module - the module stops the daemon and the daemon waits around for work to stop/complete before returning from the stop, and
then the raw store continues with it's data and transaction file cleanup prior to stopping.   I agree it would be a nice optimization to somehow stop the background thread in
the middle of a big scan, and it seems like with the better interrupt support this should be much easier than was the case before 10.8.   I would like
some feedback before proceding from those more knowledgeable about the istat work.

I do think that the work rick did for DERBY-5037 is still valuable as it will handle much better the non-clean shutdowns that Derby can experience.  ButFor
a non-clean shutdown we might have to just live with a file left open until the thread or jvm exits.   But for a requested orderly shutdown of the system I 
think we should go with the top down shutdown supported by the architecture rather than try to fix errors encountered when top level modules are still
running while lower level modules are trying to shut down.

      was (Author: mikem):
    The more I look at this issue I think the problem is that the istat daemon should shutdown and not return until it has completed this 
shutdown when indexRefresher.stop(); is called from the DataDictionary's stop call().  For a clean shutdown of the system the store
needs all it's clients shutdown first and then it can cleanly shutdown, and force the database files and transaction logs insuring 
a clean shutdown with no recovery work necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I don't think can be solved.  We might fix a specific one shown
up by this test but the system is just not designed to handle clean shutdown while stuff is still running without first waiting for the running
stuff to stop somehow.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking when writing the code. Seems there are several error-handling issues to iron out though...
>A few specific comments:
>o I decided to not make Derby wait for the background thread to finish on shutdown, as it might potentially be scanning a very large table.
>o Logging is rather verbose now during testing, but I agree it should be less verbose (or maybe turned off completely) when released.
>o I'm logging a lot of exceptions to aid testing/debugging. These should also go away, or be enabled by a property if the user wishes to do so. 

I now think that it was wrong to not wait for the background thread.  I agree it would be a nice optimization to somehow stop the background thread in
the middle of a big scan, and it seems like with the better interrupt support this should be much easier than was the case before 10.8.   I would like
some feedback before proceding from those more knowledgeable about the istat work.

I do think that the work rick did for DERBY-5037 is still valuable as it will handle much better the non-clean shutdowns that Derby can experience.  ButFor
a non-clean shutdown we might have to just live with a file left open until the thread or jvm exits.   But for a requested orderly shutdown of the system I 
think we should go with the top down shutdown supported by the architecture rather than try to fix errors encountered when top level modules are still
running while lower level modules are trying to shut down.
  
> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010815#comment-13010815 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

During shutdown there is a corner case where the lcc (which contains the interrupt status) is deleted before we reach the place where the flag is attempted restored. I'll file a separate issue for this.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006567#comment-13006567 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

In my environment just got 4 for 4 errors without the patch and 4 for 4 sucesses with the patch.  I will submit a fix once I can get a clean full test run, probably late night pacific time.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006492#comment-13006492 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

Thanks for testing the patch, Lily!

> Does this approach fulfill the top down shutdown supported by the architecture?

Yes, I think so. The istat service is shut down from the data dictionary, which is top-down. That was done before as well, but the actual worker thread would be left running until it finished by itself. With the patch, stop() won't complete until the worker thread is done, so when the higher leveles proceed in the shutdown code, we know that the istat thread won't be causing any bad interactions.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Rick Hillegas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Hillegas updated DERBY-5108:
---------------------------------

    Urgency: Blocker

Changing the urgency to Blocker. That makes this issue a showstopper for 10.8.1. If I understand Mike's analysis, we could see problems with DDL on the table in the following scenario:

o Running on Windows
o The application shuts down the database but the application continues to do other work.
o The application then reboots the database for more work.


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005402#comment-13005402 ] 

Kathey Marsden commented on DERBY-5108:
---------------------------------------

+1 to making this a blocker.  I have seen cases in the past where I have instructed users to wait for a bit after shutdown before deleting the database to avoid deletion errors.    If that won't work reliably no matter how long they wait it would be extremely problematic.


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010701#comment-13010701 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

Thanks for the followup patch, Dag. I think it makes sense to change the interrupt handling to match what we do in the rest of the engine.

I applied the patch and added a call to Thread.interrupt() to make join() always throw an exception. I verified that it saved the interrupt status and continued waiting until the thread had completed. However, when I checked the interrupt flag after DriverManager.getConnection("...;shutdown=true") had completed, the interrupt flag was not set in the thread. Perhaps the shutdown code doesn't have the mechanism in place to restore the interrupt flag?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007504#comment-13007504 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

Thanks, Mike. Sounds like it fixed most cases then. I still suspect the shutdown order is problematic, so I'd like to keep the bug open at least until we've investigated and found out if it actually is a problem.

As to running AutomaticIndexStatisticsTest on head of trunk when istat is disabled, I found that I could do that by adding -Dderby.storage.indexStats.auto=true to the java command when invoking the test runner. I would think that it should also work on suites.All.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Myrna van Lunteren (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004116#comment-13004116 ] 

Myrna van Lunteren commented on DERBY-5108:
-------------------------------------------

I saw this with ibm 1.6 - 10.8.0.1 alpha (1078654), on windows XP.


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan reassigned DERBY-5108:
--------------------------------------

    Assignee: Kristian Waagan

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005876#comment-13005876 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

It looks like one of the last things that happens in the istat thread prior to store peforming it's shutdown and closing all of it's open files is that 
it recovers an open from an interrupt.  I'll include the stacks below that show this.  What should we do if the istat thread get's an interrupt.  I am leaning
toward it should check for interrupts at critical places and then give up on the work it is doing.  Maybe there is a place it could check whether it should be
shutting down that needs to be checked more often?  

The store shutdown code only closes files that are not currently "kept" (store's term for a file that is open by a store client), so not surprising that file is
not closed if it is executing while the istat thread is busy reading from the file).  Here is the stack of the store closing it's open files:
    at org.apache.derby.impl.store.raw.data.RAFContainer.closeContainer(RAFContainer.java:216)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.closeContainer(RAFContainer4.java:227)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.clearIdentity(FileContainer.java:476)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.removeEntry(ConcurrentCache.java:167)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.ageOut(ConcurrentCache.java:583)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.shutdown(ConcurrentCache.java:598)^M
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.stop(BaseDataFileFactory.java:519)^M
    at org.apache.derby.impl.services.monitor.TopService.stop(TopService.java:443)^M
    at org.apache.derby.impl.services.monitor.TopService.shutdown(TopService.java:394)^M
    at org.apache.derby.impl.services.monitor.BaseMonitor.shutdown(BaseMonitor.java:229)^M

Here is some debug showing the stacks where the istat daemon is recovering from an interrupt: on the file of interest c481.dat - conglom/container  id  1153
Thu Mar 10 16:22:59 PST 2011 Thread[index-stat-thread,5,main] {istat,trace@637806084} worker thread started (xid=24016) [q/p/s=1/0/1,err:k/u/c=0/0/0,rej:f/d/
/0/0]^M
Thu Mar 10 16:22:59 PST 2011 Thread[index-stat-thread,5,main] {istat,trace@637806084}     processing "APP"."BIG_TABLE" ^M
DEBUG IndexStatisticsDaemonImpl OUTPUT: opened conglomerateNumber[1] = 1153^M
DEBUG RAFContainer OUTPUT: about to close file C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
Exception trace: ^M
java.lang.Throwable^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.closeContainer(RAFContainer.java:216)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.closeContainer(RAFContainer4.java:227)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.recoverContainerAfterInterrupt(RAFContainer4.java:860)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:368)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:246)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:673)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:193)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2530)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2580)^M
    at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.getRightSibling(ControlRow.java:531)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.positionAtNextPage(BTreeScan.java:493)^M
    at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:464)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(BTreeScan.java:1681)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl$KeyComparator.fetchRows(IndexStatisticsDaemonImpl.java:1174)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(IndexStatisticsDaemonImpl.java:470)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.generateStatistics(IndexStatisticsDaemonImpl.java:327)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.processingLoop(IndexStatisticsDaemonImpl.java:777)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.run(IndexStatisticsDaemonImpl.java:687)^M
    at java.lang.Thread.run(Thread.java:736)^M
DEBUG RAFContainer OUTPUT: closed file C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
DEBUG RAFContainer OUTPUT: opening file: C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
Exception trace: ^M
java.lang.Throwable^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.run(RAFContainer.java:1587)^M
    at java.security.AccessController.doPrivileged(AccessController.java:251)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.openContainerMinion(RAFContainer.java:939)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.reopenContainer(RAFContainer.java:914)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.recoverContainerAfterInterrupt(RAFContainer4.java:861)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:368)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:246)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:673)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:193)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2530)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2580)^M
    at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.getRightSibling(ControlRow.java:531)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.positionAtNextPage(BTreeScan.java:493)^M
    at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:464)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(BTreeScan.java:1681)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl$KeyComparator.fetchRows(IndexStatisticsDaemonImpl.java:1174)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(IndexStatisticsDaemonImpl.java:470)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.generateStatistics(IndexStatisticsDaemonImpl.java:327)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.processingLoop(IndexStatisticsDaemonImpl.java:777)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.run(IndexStatisticsDaemonImpl.java:687)^M
    at java.lang.Thread.run(Thread.java:736)^M
----------------------------------------------------------------^M
----------------------------------------------------------------^M
Thu Mar 10 16:22:59 PST 2011:
Shutting down instance 35924132-012e-a249-3f92-ffffe527380d on database directory C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown with class loade
sun.misc.Launcher$AppClassLoader@49be49be ^M
shutting down:
isCorrupt = false
pageCache = org.apache.derby.impl.services.cache.ConcurrentCache@1c2a1c2a
containerCache = org.apache.derby.impl.services.cache.ConcurrentCache@20fa20fa^M

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010082#comment-13010082 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

Regression ran ok with my patch.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039375#comment-13039375 ] 

Kristian Waagan commented on DERBY-5108:
----------------------------------------

Another occurrence seen on May 8th 2011: http://people.apache.org/~myrnavl/derby_test_results/main/windows/testSummary-1100426.html

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045408#comment-13045408 ] 

Dag H. Wanvik commented on DERBY-5108:
--------------------------------------

I think the move should cover it. +1


> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5108:
-----------------------------------

    Attachment: derby-5108-1a-early_istats_shutdown.diff

Attaching patch 1a, which will make Derby shut down the istat daemon before throwing the shutdown exception.
I also tweaked the log entry to tell if the scan was aborted.

I have tested the patch on a Windows machine and I have run the regression tests, but I had problems reproducing the error on my machine. Nonetheless, I never saw the error with the patch applied.
If someone wants to test the patch on Windows before I commit, please let me know and I'll hold back.

Patch ready for review.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006212#comment-13006212 ] 

Knut Anders Hatlen commented on DERBY-5108:
-------------------------------------------

I invoked an engine shutdown while a call to SYSCS_UTIL.SYSCS_UPDATE_STATISTICS was in progress, and then I observed that the process still had an open file handle to the index file after the shutdown had completed. I had disabled istat when I ran this experiment.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012238#comment-13012238 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

I tried out the latest in trunk as of 3/26/2011 and ran the AutomaticIndexStatisticsTest suite 64 times and got 61 good runs and 3 failures all in the shutdown fixture as follows:
There was 1 failure:
1) testShutdownWhileScanningThenDelete(org.apache.derbyTesting.functionTests.tes
ts.store.AutomaticIndexStatisticsTest)junit.framework.AssertionFailedError: Fail
ed to delete 3 files (root=C:\derby\s2\newout\system\singleUse\copyShutdown: C:\
derby\s2\newout\system\singleUse\copyShutdown\seg0\c481.dat (isDir=false, canRea
d=true, canWrite=true, size=22351872), C:\derby\s2\newout\system\singleUse\copyS
hutdown\seg0 (isDir=true, canRead=true, canWrite=true, size=0), C:\derby\s2\newo
ut\system\singleUse\copyShutdown (isDir=true, canRead=true, canWrite=true, size=
0)
    at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertDirectoryDeleted(Bas
eJDBCTestCase.java:1526)
    at org.apache.derbyTesting.functionTests.tests.store.AutomaticIndexStatistic
sTest.testShutdownWhileScanningThenDelete(AutomaticIndexStatisticsTest.java:188)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java
:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorI
mpl.java:37)
    at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:112)
    at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
    at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
    at junit.extensions.TestSetup.run(TestSetup.java:23)
    at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
    at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
    at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
    at junit.extensions.TestSetup.run(TestSetup.java:23)

There didn't seem to be anything interesting in derby.log.  I think we should enable the "extra" logging by default in
this test while there is still and issue. I would be happy to run in my environment and reproduce if there is any 
additional logging that we think will track down the issue (even if it something we don't want checked into the
actual codeline).  Tonight I will at least run the test with the extra logging enabled from the command line.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Priority: Blocker
>         Attachments: derby-5108-2.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Reopened] (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Myrna van Lunteren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Myrna van Lunteren reopened DERBY-5108:
---------------------------------------


changing issue to change component for 10.8.2 release notes.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.1.2
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Blocker
>             Fix For: 10.8.1.6, 10.9.0.0
>
>         Attachments: derby-5108-1a-early_istats_shutdown.diff, derby-5108-2.diff, derby-5108-2a-early_istats_shutdown_broad.diff, derby-5108-3a-istat_log_abort.diff, javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005899#comment-13005899 ] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

The more I look at this issue I think the problem is that the istat daemon should shutdown and not return until it has completed this 
shutdown when indexRefresher.stop(); is called from the DataDictionary's stop call().  For a clean shutdown of the system the store
needs all it's clients shutdown first and then it can cleanly shutdown, and force the database files and transaction logs insuring 
a clean shutdown with no recovery work necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I don't think can be solved.  We might fix a specific one shown
up by this test but the system is just not designed to handle clean shutdown while stuff is still running without first waiting for the running
stuff to stop somehow.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking when writing the code. Seems there are several error-handling issues to iron out though...
>A few specific comments:
>o I decided to not make Derby wait for the background thread to finish on shutdown, as it might potentially be scanning a very large table.
>o Logging is rather verbose now during testing, but I agree it should be less verbose (or maybe turned off completely) when released.
>o I'm logging a lot of exceptions to aid testing/debugging. These should also go away, or be enabled by a property if the user wishes to do so. 

I now think that it was wrong to not wait for the background thread.  I agree it would be a nice optimization to somehow stop the background thread in
the middle of a big scan, and it seems like with the better interrupt support this should be much easier than was the case before 10.8.   I would like
some feedback before proceding from those more knowledgeable about the istat work.

I do think that the work rick did for DERBY-5037 is still valuable as it will handle much better the non-clean shutdowns that Derby can experience.  ButFor
a non-clean shutdown we might have to just live with a file left open until the thread or jvm exits.   But for a requested orderly shutdown of the system I 
think we should go with the top down shutdown supported by the architecture rather than try to fix errors encountered when top level modules are still
running while lower level modules are trying to shut down.

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-5108:
----------------------------------

              Priority: Blocker  (was: Major)
    Bug behavior facts: [Regression, Regression Test Failure]  (was: [Regression Test Failure])

I am actively working on this one.  Right now I am concentrating on SANE, classes, windows failure of just this one fixture.  For me on my laptop this fails every time.
Unless anyone protests I am going to check in a change that disables just this one fixture while I am working on it.  I have verified that the file that is the problem is the
data file that istat is scanning when the shutdown happens.  It stuck around for over an hour so I am going to assume the resource is stuck open at least until the thread
that started the server goes away and in worst case until the jvm comes down.  I 

I am marking this a regression, because a user previous to 10.8 that shuts down when his threads are doing nothing won't see this.  But in 10.8 istat might be running
and he will run into it.  I could see this leading to problems with subsquent ddl which has to do deletes or renames of the associated table.

My current guess is that DERBY-5037 fixed the symptom but shutdown is still failing and leaving a real resource problem.  I still need to do some more debugging to 
verify what is going on.  In my case I am consistently getting the ASSERT.  But I believe we see the issue in nightly runs against SANE=false so it is not specific to stuff
we do in ASSERT logic.  I am hoping that fixing the SANE path will apply to the SANE=false path also.  

Logically it seems like what should happen on "clean" shutdown is first user threads should be shutdown, then istat, and finally store.  I wonder if in 10.8 we can take advantage of the interrupt work to quickly and safely shutdown the user and istat threads?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006478#comment-13006478 ] 

Lily Wei commented on DERBY-5108:
---------------------------------

Thanks Knut for the patch. I applied to my windows 7 machine. I used to able to reproduce the delete problem quickly. After running AutomaticIndexStatisticsTest 5 times, the delete problem did not show up. With waitOnShutdown.diff patch, checking the daemonDisabled flag each time we fetch a new chuck of rows from store and making stop() wait for istat thread to complete if there is an active thread definitely help. Does this approach fulfill the top down shutdown supported by the architecture?

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt, waitOnShutdown.diff
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira