Posted to derby-dev@db.apache.org by "Kathey Marsden (JIRA)" <ji...@apache.org> on 2009/05/21 01:37:45 UTC

[jira] Created: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
---------------------------------------------------------------------------------------------------------

                 Key: DERBY-4239
                 URL: https://issues.apache.org/jira/browse/DERBY-4239
             Project: Derby
          Issue Type: Bug
          Components: Store
    Affects Versions: 10.5.1.1
         Environment: z/OS z10 processor. 
java version "1.6.0"
Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
J9VM - 20090215_029883_bHdSMr
JIT  - r9_20090213_2028
GC   - 20090213_AA)
JCL  - 20090218_01
also 
java version "1.6.0"
Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
J9VM - 20081009_024288_bHdSMr
JIT  - r9_20080721_1330ifx2
GC   - 20080724_AA)
JCL  - 20080808_02

            Reporter: Kathey Marsden
            Priority: Critical


I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The failure occurs in oc_rec3 when it tries to connect to the database, but the actual problem seems to have occurred in the prior test, oc_rec2.  The problem is intermittent, happening in approximately 1 of 4 runs.  I extracted the case from the harness and will attach the reproduction, which is run with the script repro.ksh.  The script loops up to 50 times until it gets the failure, which looks like this:

ERROR XSLA7: Cannot redo operation null in the log.
	at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
	at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
	at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
	at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:311)
	at java.sql.DriverManager.getConnection(DriverManager.java:268)
	at CheckTables.main(CheckTables.java:8)
Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
<snip lots of 000's>

I ran it with 10.3 and it completed all 50 iterations, so whether this is a JVM or a Derby issue, it seems to be new since 10.3 (I haven't tried with 10.4).  Oddly, I have run the tests many times before on this machine using the 10.5.1.1 release and the same JVM and have never seen this failure, so I am looking into whether something changed on the machine or environment.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-4239) corruption with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Summary: corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits  (was: corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.)

Changing title to be more descriptive and reflect that the problem is not z/OS specific.

> corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.1.2, 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------


I am not sure what the expected behavior of the ReproCorruptionBackgroundCheckpoint() test is.  Reading
the current test, it seems to start a thread that loops forever doing checkpoints one after another.  The
stack that you posted doesn't look like anything is hanging; it looks like there is a single thread actively doing
a checkpoint.

It seems like the test should stop the user checkpoint thread after the main test has finished.
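
One way the suggestion above could look is sketched below. This is only an illustration, not the code in the attached repro: the class name, the stop flag and the placeholder doCheckpoint() are all assumptions, and the real test would presumably issue its checkpoint through JDBC (for example by calling SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE()) where the placeholder increments a counter.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a background checkpoint loop that the main test can stop.
public class StoppableCheckpointLoop {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private final AtomicInteger checkpointsDone = new AtomicInteger();
    private final Thread worker;

    public StoppableCheckpointLoop() {
        // Loop until the main test asks us to stop, instead of looping forever.
        worker = new Thread(() -> {
            while (!stopRequested.get()) {
                doCheckpoint();
            }
        }, "checkpoint-loop");
    }

    // Placeholder for the real work (assumption: the test would run a
    // checkpoint against the database here).
    private void doCheckpoint() {
        checkpointsDone.incrementAndGet();
        Thread.yield();
    }

    public void start() {
        worker.start();
    }

    // Called by the main test once it has finished: request a stop and
    // wait for the worker thread to exit. Returns the number of checkpoints run.
    public int stopAndWait() {
        stopRequested.set(true);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return checkpointsDone.get();
    }

    public boolean isRunning() {
        return worker.isAlive();
    }
}
```

The main test would call start() before its own work and stopAndWait() in a finally block, so the JVM is not kept busy by a checkpoint thread after the test body has completed.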



[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: DERBY-4239_3.diff

DERBY-4239_3.diff is the patch I intend to commit.  It passes the complete set of nightly tests.

After looking at the backup code, it seemed like backup really wanted the same behavior
that compress was looking for.  I also changed the behavior of the system procedure checkpoint
to match the backup and compress checkpoint.

I moved the waiting code into the subroutine so that it could distinguish between a checkpoint returning
false because another checkpoint was in progress and a couple of other possible conditions.
Without this change the system could get into a state where it looped forever trying to get a checkpoint
(one case was trying to force a clean shutdown after we had already closed down the logging
system).
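
The difference between "another checkpoint is in progress" and "a checkpoint can never happen" can be sketched roughly as follows. This is not the code in the patch: the class, the Result enum and the method names are hypothetical, and the closed-log case stands in for the "couple of other possible conditions" mentioned above.

```java
// Hypothetical sketch: wait inside the checkpoint routine, and report why a
// checkpoint did not happen so that callers never need to retry in a loop.
public class CheckpointStatusSketch {

    public enum Result { DONE, LOG_CLOSED, INTERRUPTED }

    private boolean inCheckpoint = false;
    private boolean logClosed = false;
    private int completed = 0;

    public Result checkpointWithWait() {
        synchronized (this) {
            // Another checkpoint is running: wait for it instead of returning false.
            while (inCheckpoint && !logClosed) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Result.INTERRUPTED;
                }
            }
            // The logging system is shut down: waiting or retrying cannot help.
            if (logClosed) {
                return Result.LOG_CLOSED;
            }
            inCheckpoint = true;
        }
        try {
            writeCheckpointRecord();
        } finally {
            synchronized (this) {
                inCheckpoint = false;
                notifyAll();   // wake any caller waiting for this checkpoint
            }
        }
        return Result.DONE;
    }

    // Placeholder for the real checkpoint work.
    private synchronized void writeCheckpointRecord() {
        completed++;
    }

    public synchronized void closeLog() {
        logClosed = true;
        notifyAll();
    }

    public synchronized int completedCheckpoints() {
        return completed;
    }
}
```

A caller that only received a boolean and retried on false would spin forever once the log had been closed; with the waiting done inside the routine, the LOG_CLOSED result lets it give up.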





[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: derby_dumponly.zip

derby_dumponly.log is a derby.log from booting the database that won't boot, with the following debug flag set.  This forces the boot to just print all the log records rather than
process them:
derby.debug.true=DumpLogOnly,LogTrace

Using this I can see what was going on in the db after the problem log record
causing the crash.  What I see is that toward the end there is a log record for
the compress of container 1073:
DEBUG LogTrace OUTPUT: scanned 64851 : Page Operation: Page(0,Container(0, 1073)) pageVersion 107 :  CompressSpaceOperation: newHighestPage = 13;num_pages_truncated = 31 to Page(0,Container(0, 1073)) instant = (6,3249863) logEnd = (6,3249908) logIn at 25 available 4

If I look backward in the log from this point I find the most recent operation on 1073:
DEBUG LogTrace OUTPUT: scanned 64820 : Page Operation: Page(29,Container(0, 1024)) pageVersion 66 : Purge : 1 slots starting at 12 (recordId=18) instant = (6,3210739) logEnd = (6,3210791) logIn at 23 available 13

And if I again start at that compress log record and search backward for a
checkpoint record I find:
DEBUG LogTrace OUTPUT: scanned 61634 : Checkpoint :     redoLWM (4,943)
        undoLWM (4,943)
**************************
org.apache.derby.impl.store.raw.xact.TransactionTable@16ca16ca
Transaction Table: size = 1 largestUpdateXactId = 61634
Xid=61634 gid=null firstLog=(4,943) lastLog=null transactionStatus=0 myxact=null update=true recovery=true prepare=false needExclusion=true
--------------------------- instant = (4,981) logEnd = (4,1056) logIn at 55 available 4

And the last checkpoint in the log is:
DEBUG LogTrace OUTPUT: scanned 64191 : Checkpoint :     redoLWM (6,24)
        undoLWM (5,61)
**************************
org.apache.derby.impl.store.raw.xact.TransactionTable@64a864a8
Transaction Table: size = 2 largestUpdateXactId = 64852
Xid=64191 gid=null firstLog=(6,3250100) lastLog=null transactionStatus=0 myxact=null update=true recovery=true prepare=false needExclusion=true
Xid=64852 gid=null firstLog=(6,3249980) lastLog=(6,3250018) transactionStatus=0 myxact=null update=true recovery=true prepare=false needExclusion=true
--------------------------- instant = (6,3250138) logEnd = (6,3250245) logIn at 87 available 4

The problem with this is that for redo recovery of a compress space record to work
properly, there must be a checkpoint whose redo LWM (low water mark) is after every operation on the container that precedes the compress operation.
Compress calls checkpoint to make this happen.  The reason is that redo
recovery wants to replay every log record it has, making pages march in order from
version n to version n+1, ...   But the compress space operation shrinks the file
on disk, losing version n, so version n+1 can't be redone.

In this case the last checkpoint has a redo LWM of (6,24), and the last operation is
after that, at (6,3210739).
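
The condition described above can be written down as a small check. This is a minimal sketch and not Derby code: the class and method names are hypothetical, and it assumes a log instant printed as (a,b) is ordered first by a (the log file number) and then by b (the position in that file).

```java
// Hypothetical sketch of the redo-recovery condition for a compress operation.
public class RedoLwmCheck {

    /** A log instant as printed in the trace, e.g. (6,3210739). */
    public static final class LogInstant implements Comparable<LogInstant> {
        final long fileNumber;
        final long position;

        public LogInstant(long fileNumber, long position) {
            this.fileNumber = fileNumber;
            this.position = position;
        }

        // Assumption: instants order by log file number, then by position.
        @Override
        public int compareTo(LogInstant other) {
            int byFile = Long.compare(fileNumber, other.fileNumber);
            return byFile != 0 ? byFile : Long.compare(position, other.position);
        }
    }

    /**
     * Redo of a compress is only safe if the checkpoint's redo LWM is after
     * the last operation logged on the container before the compress, so
     * that recovery never replays a change to a page the compress removed.
     */
    public static boolean compressIsRedoSafe(LogInstant redoLwm,
                                             LogInstant lastOpBeforeCompress) {
        return redoLwm.compareTo(lastOpBeforeCompress) > 0;
    }
}
```

With the values from this log, compressIsRedoSafe(new LogInstant(6, 24), new LogInstant(6, 3210739)) is false, which matches the failure: recovery starts redo at (6,24) and tries to replay operations on pages that the later compress removed from the file.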



[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: derby-4239_1.diff

Preliminary patch for this issue.  I have not run the full test suite yet, but I
would like feedback from anyone who can reproduce the original error, since I
have not reproduced it myself.

This patch only includes code changes, no new tests.

The fix adds interfaces that let compress table tell the underlying store that
it needs a new checkpoint, and that it must wait until that checkpoint has made
it into the log before proceeding.  The wait matters because the compress
operation shrinks the file, destroying pages that might otherwise participate
in redo recovery.
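
As a rough illustration of the behavior described above (not the code in the patch), here is a minimal sketch. It assumes a hypothetical CheckpointGate class, a hypothetical waitForNew flag, and a placeholder writeCheckpointRecord() method; none of these names come from Derby. A caller that passes waitForNew=true blocks while another checkpoint is running and then performs its own, so any file truncation that follows is covered by a checkpoint that started after the request.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a store's checkpoint entry point; not Derby code. */
final class CheckpointGate {
    private boolean inCheckpoint = false;               // guarded by "this"
    private final AtomicInteger completed = new AtomicInteger();

    /**
     * @param waitForNew if true and a checkpoint is already running, block until
     *                   it finishes and then run a fresh one; if false, return
     *                   at once and rely on the checkpoint already in progress.
     * @return true if this call performed its own checkpoint
     */
    boolean checkpoint(boolean waitForNew) {
        synchronized (this) {
            while (inCheckpoint) {
                if (!waitForNew) {
                    return false;
                }
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            inCheckpoint = true;
        }
        try {
            writeCheckpointRecord();
            return true;
        } finally {
            synchronized (this) {
                inCheckpoint = false;
                notifyAll();                            // wake any waiting callers
            }
        }
    }

    /** Placeholder for flushing dirty pages and logging the checkpoint record. */
    private void writeCheckpointRecord() {
        completed.incrementAndGet();
    }

    int completedCheckpoints() {
        return completed.get();
    }

    /** A caller that shrinks a file insists on a checkpoint of its own first. */
    static void compressTable(CheckpointGate gate, Runnable truncateFile) {
        gate.checkpoint(true);
        truncateFile.run();                             // only now discard trailing pages
    }

    public static void main(String[] args) {
        CheckpointGate gate = new CheckpointGate();
        compressTable(gate, () -> System.out.println("truncating after checkpoint"));
        System.out.println("checkpoints completed: " + gate.completedCheckpoints());
    }
}
```

With waitForNew=false the call returns immediately when a checkpoint is already in progress, which is the situation the sketch treats as unsafe for an operation that is about to destroy pages.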

I have only altered the behavior for the compress operation and left all other
checkpoint() calling paths the same.  However, some comments I read while
looking at the code make me concerned that the backup code and the
backup-for-encryption code may also have problems with an ongoing checkpoint.
I would rather address those problems, if they exist, in a separate issue.

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713314#action_12713314 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

Thanks Mike for the quick fix.  With the patch, I got 200 clean runs on z/OS and also 200 clean runs on Windows (IBM 1.6) with the Windows repro.

I took a quick look at the patch and have no useful technical comments, but I noticed that it mixes spaces and tabs.  There is also an extra @param wait in the javadoc (LogFactory:107).

I have a question, though.  Under what conditions would we not want to queue the checkpoint requests and force a new checkpoint?

When we look at the other cases, in addition to backup, I think it would be good to look at
CALL SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE() to make sure it has no issues.  It seems like it should force a new checkpoint as well.
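
For anyone who wants to exercise that path by hand, here is a minimal JDBC sketch that issues the checkpoint procedure followed by an in-place compress.  The class name, the command-line arguments, and the choice to enable all three compress options are assumptions made for illustration; this is not the attached repro, and running main requires a Derby driver on the classpath.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Illustrative helper only; the class and method names are hypothetical. */
final class CheckpointThenCompress {
    static final String CHECKPOINT_SQL =
        "CALL SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE()";
    static final String COMPRESS_SQL =
        "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE(?, ?, ?, ?, ?)";

    /** Ask the engine for a checkpoint on the given connection. */
    static void checkpoint(Connection conn) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CHECKPOINT_SQL)) {
            cs.execute();
        }
    }

    /** In-place compress with purge, defragment and truncate all enabled. */
    static void inplaceCompress(Connection conn, String schema, String table)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(COMPRESS_SQL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            cs.setShort(3, (short) 1);   // purge rows
            cs.setShort(4, (short) 1);   // defragment rows
            cs.setShort(5, (short) 1);   // truncate end of file
            cs.execute();
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: CheckpointThenCompress <jdbc-url> <schema> <table>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            checkpoint(conn);
            inplaceCompress(conn, args[1], args[2]);
        }
    }
}
```

Issuing the two calls back to back from a single connection will not by itself overlap a compress with a background checkpoint; it only shows how each procedure is invoked.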







[jira] Updated: (DERBY-4239) Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE is called during checkpoint

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Description: 
corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits

I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The failure comes in oc_rec3 trying to connect to the database, but the actual problem seems to have occurred with the prior test oc_rec2.  The problem is somewhat intermittent, happening approximately 1/4 times.  I extracted the case from the harness and will attach the reproduction and run the script repro.ksh.  The script will loop up to 50 times until it gets the failure which looks like.

ERROR XSLA7: Cannot redo operation null in the log.
	at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
	at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
	at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
	at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:311)
	at java.sql.DriverManager.getConnection(DriverManager.java:268)
	at CheckTables.main(CheckTables.java:8)
Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
<snip lots of 000's>

I ran it with 10.3 and it completed all 50 iterations, so whether JVM or Derby issue it seems new since 10.3. (I haven't tried with 10.4).  Oddly I have run tests many times before on this machine using in the 10.5.1.1 release and the same jvm and have never seen this failure, so am looking into whether maybe something changed on the machine or environment.



        Summary: Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE is called during checkpoint   (was: corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits)

> Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE is called during checkpoint 
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.2.0, 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714941#action_12714941 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

I did another thread dump and confirmed that, as you suspected, it was not hung but still looping.
The thread doing the checkpoint is set as a daemon thread, so it should have terminated when System.exit() was called. I am not quite sure why it didn't in this particular run, but it doesn't look like a Derby issue, so I won't pursue it for now.
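
As background on the daemon-thread point, here is a small sketch in which a hypothetical looping worker stands in for the checkpoint thread (the class and thread names are invented for illustration).  A daemon thread does not keep the JVM alive once the last non-daemon thread ends.  System.exit() goes further and halts the JVM after any shutdown hooks have run, whether or not the remaining threads are daemons.

```java
/** Illustrative only: a looping daemon thread standing in for a background worker. */
final class DaemonLoopDemo {
    /** Start a daemon thread that loops until it is interrupted. */
    static Thread startLooper() {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10);                     // pretend to do periodic work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();   // restore the flag and leave the loop
                }
            }
        }, "hypothetical-checkpoint-daemon");
        t.setDaemon(true);                                // must be set before start()
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Thread t = startLooper();
        System.out.println("daemon=" + t.isDaemon() + " alive=" + t.isAlive());
        // main returns here; the JVM can exit even though the looper never finishes.
    }
}
```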



> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Assigned: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali reassigned DERBY-4239:
-------------------------------------

    Assignee: Mike Matrigali

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------


The normal case where we ask for a checkpoint, which is triggered by default when we think
we have logged approximately 10 MB of log, is a case where we don't want to start a new one.
The only reason we are doing a checkpoint in this case is to minimize recovery time if we happen to crash.  If there is already a checkpoint in progress, then that is good enough.  There
is no correctness reason that requires a checkpoint to start NOW and to wait for it to finish.
Checkpoints can really slow down the overall throughput of the system, especially if the user
has increased the cache size, so we don't want to do additional ones if they are
unnecessary.

I am not sure what backup needs.

In the case of the user-callable routine, we don't really say much about what it does:
"The SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE system procedure checkpoints the database by flushing all cached data to disk."  But I would lean toward changing its behavior to
also do another checkpoint.

I am tempted to change the patch to eliminate the wait parameter, and instead all code that
currently calls wait will always force a new checkpoint and wait for it if it finds a checkpoint in
progress.  If I make this change I will make sure the "normal" checkpoint does not call this path.  Any opinions?  It would be nice if we could generate bug scripts that show the
specific bugs that are fixed by adding the additional checkpoints, but this is hard, as
evidenced by the fact that we still don't have a perfect repro for the compress bug.
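
The two behaviors discussed above (a "normal" checkpoint request that is skipped when one is already in progress, and a forced request that waits for the running checkpoint and then starts a new one) can be sketched roughly as follows. This is only an illustration of the idea, not Derby's code and not the attached patch: the class CheckpointCoordinator and its methods requestIfIdle, forceNew and completedCount are hypothetical names invented for this sketch.

```java
// Hypothetical sketch; none of these names exist in Derby.
final class CheckpointCoordinator {
    private final Runnable checkpointWork; // stands in for the real checkpoint
    private boolean inProgress = false;
    private int completed = 0;

    CheckpointCoordinator(Runnable checkpointWork) {
        this.checkpointWork = checkpointWork;
    }

    // "Normal" request (e.g. enough log has been written): if a checkpoint
    // is already running, that one is good enough, so do nothing.
    // Returns true only if this call actually ran a checkpoint.
    boolean requestIfIdle() {
        synchronized (this) {
            if (inProgress) {
                return false;
            }
            inProgress = true;
        }
        runAndRelease();
        return true;
    }

    // Forced request: wait for any running checkpoint to finish,
    // then always run a new one and return only when it is done.
    void forceNew() throws InterruptedException {
        synchronized (this) {
            while (inProgress) {
                wait();
            }
            inProgress = true;
        }
        runAndRelease();
    }

    // Run the work outside the monitor so other callers can observe
    // inProgress, then mark completion and wake any waiting forceNew().
    private void runAndRelease() {
        try {
            checkpointWork.run();
        } finally {
            synchronized (this) {
                inProgress = false;
                completed++;
                notifyAll();
            }
        }
    }

    synchronized int completedCount() {
        return completed;
    }
}
```

With this shape the background path never stacks up extra checkpoints, while callers that need a checkpoint that started after their own changes use forceNew() instead.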


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712245#action_12712245 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

Just a few more notes.  
- I reproduced the issue on IBM 1.5 again on z/OS.
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build pmz31devifx-20090408 (SR9-2 ))
IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 z/OS s390-31 j9vmmz3123ifx-20090324 (JIT enabled)
J9VM - 20090319_32038_bHdSMr
JIT  - 20081112_1511ifx1_r8
GC   - 200811_07)
JCL  - 20090408

- I put a wait for post commit after the delete and again after the compress before exiting the jvm and it still reproduced, so the issue does not seem related to any contention between the delete postcommit operations and the compress or any problem related to not completing the postcommit before exiting the JVM.

- It does not seem to reproduce with a clean shutdown, so it seems specific to recovery and the log files that were written.  Mike thinks it is perhaps some timing issue related to when the checkpoint record gets laid down and what log records come after the checkpoint record.

- Mike said the log records appear to be well formed.  There is not just the random corruption that I would have expected if this were a JVM bug.

- I am running with 1.4.2 and haven't seen it yet after 34 runs. (It of course always seems to pop just after I hit send saying it hasn't happened.)  I know there are significant changes in the JVM from 1.4.2 to 1.5, and I think the I/O behavior of Derby under 1.4.2 also differs from 1.5 because of the incorporation of nio, so this doesn't really help us determine whether it is a JVM or Derby issue.


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, identifyBadContainer.ksh, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment:     (was: badlogsizes.txt)

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: derby.log, derby.log, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Resolved: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali resolved DERBY-4239.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 10.6.0.0
                   10.5.1.2
                   10.4.3.0
                   10.3.4.0
                   10.2.3.0
                   10.1.4.0

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.1.2, 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Affects Version/s: 10.6.0.0
                       10.2.2.1
                       10.1.3.3
                       10.3.2.1
                       10.4.2.0

Changing the affects version to include past versions.  I haven't actually seen it with 10.2, but did with 10.1.3.3 and see no reason why 10.2 wouldn't be affected.


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: reproDerby4239.zip
                derby.log
                wombat_with_keeplog.zip

Attaching:
wombat_with_keeplog.zip - the database after a failing run with derby.storage.keepTransactionLog=true
derby.log - the derby.log from a failing run
reproDerby4239.zip - Java files and repro.ksh

To reproduce on z/OS, unzip reproDerby4239.zip and run:
javac -g *.java
repro.ksh

I tried on Windows and Myrna tried on Linux; neither of us was able to reproduce it on those platforms.
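
The issue description says the script loops up to 50 times until it hits the failure. The contents of repro.ksh are not shown in this thread, so the following is only a sketch of that loop shape in Java, not the script itself. The class name ReproLoopSketch, the method firstFailure, and the stand-in predicate are all hypothetical; in a real harness the predicate would run one repro iteration and report whether the XSLA7 error appeared.

```java
import java.util.function.IntPredicate;

public class ReproLoopSketch {
    /**
     * Runs failsOn.test(i) for i = 1..maxIterations and returns the first
     * iteration that reports a failure, or -1 if every iteration passed.
     */
    public static int firstFailure(int maxIterations, IntPredicate failsOn) {
        for (int i = 1; i <= maxIterations; i++) {
            if (failsOn.test(i)) {
                return i; // stop at the first failure so its state can be inspected
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for one repro run: pretend iteration 4 fails.
        int hit = firstFailure(50, i -> i == 4);
        System.out.println(hit == -1
                ? "no failure in 50 iterations"
                : "failed on iteration " + hit);
    }
}
```

Stopping at the first failing iteration matters for an intermittent problem like this one, because the database and derby.log from that run are the evidence worth keeping.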



> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: derby.log, reproDerby4239.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: DERBY-4239_2.diff

The first patch hung the tests. I needed to move the retry of the checkpoint out of the synchronized block. It is still not ready for commit; I am rerunning all tests.
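
As a rough illustration of why the placement of a retry can matter, here is a minimal sketch. It is not the code from DERBY-4239_2.diff, and the class, field, and method names are hypothetical. The sketch holds the monitor only long enough to claim or release a "checkpoint in progress" flag, and does the waiting and retrying with the monitor released, so that other threads needing the same monitor can make progress between attempts.

```java
public class CheckpointRetrySketch {
    private final Object monitor = new Object();
    private boolean inProgress = false; // guarded by monitor
    private int completed = 0;          // guarded by monitor

    /** Claims the checkpoint slot; only this short state change holds the monitor. */
    private boolean tryBegin() {
        synchronized (monitor) {
            if (inProgress) {
                return false;
            }
            inProgress = true;
            return true;
        }
    }

    /** Releases the checkpoint slot and records the completed checkpoint. */
    private void end() {
        synchronized (monitor) {
            inProgress = false;
            completed++;
        }
    }

    /**
     * Tries to run a checkpoint, retrying up to maxAttempts times.
     * Returns the attempt number that succeeded, or -1 if none did.
     */
    public int checkpoint(int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryBegin()) {
                try {
                    // The checkpoint work itself would run here, outside the monitor.
                } finally {
                    end();
                }
                return attempt;
            }
            try {
                Thread.sleep(10); // back off between attempts with the monitor released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public int completedCount() {
        synchronized (monitor) {
            return completed;
        }
    }
}
```

If the retry loop instead ran inside a synchronized block on the same monitor, any thread that had to enter that monitor to finish its own work would be blocked for the duration of the retries, which is one way a test run can hang.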

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713069#action_12713069 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

Thanks, Mike, for looking at this. If we still see the EOFException with my background checkpoint repro after your fix, I will file a separate bug.


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-4239:
---------------------------------

    Component/s: Test

> corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store, Test
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.1.2, 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-4239:
---------------------------------

    Component/s:     (was: Test)

> corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs during checkpoint, then jvm exits
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.1.2, 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Derby Categories: [Data corruption, Regression Test Failure]

I have not been able to reproduce the problem in 100 iterations with -Xint, so at first glance it appears to be a JIT issue.  It does reproduce with -Xjit:optLevel=noOpt,count=0, which removes most JIT optimizations.  I was incorrect that it does not reproduce with 10.3: I was able to pop the issue with 10.3.3.1 - (765035).  I had done the original run with a slightly earlier sane build.  I am not sure yet whether it only reproduces with insane builds.

Typically with JIT problems you can generate a log of all the compiled methods and their optimization levels, then 1) feed that log back into the next run to get a consistent reproduction, and 2) do a binary search, with iterative runs using half the log file each time, to narrow down the failing method.
Unfortunately, neither of these methods works in this case, suggesting some timing or order-of-compilation issue.
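
For reference, the binary search described above can be sketched roughly as follows. This is only an illustration: the method names and the failsWith check are hypothetical stand-ins for "re-run the reproduction with only these methods JIT-compiled", and the search is only meaningful if exactly one method is at fault and the outcome is deterministic for a given subset, which is the assumption that does not seem to hold in this case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Sketch: bisect a list of JIT-compiled methods to find the one that triggers a failure. */
class JitMethodBisect {

    /**
     * Narrows the list down to a single suspect by halving it repeatedly.
     * Assumes exactly one culprit and a deterministic failsWith check.
     */
    static String findCulprit(List<String> methods, Predicate<List<String>> failsWith) {
        List<String> suspects = new ArrayList<>(methods);
        while (suspects.size() > 1) {
            int mid = suspects.size() / 2;
            List<String> firstHalf = new ArrayList<>(suspects.subList(0, mid));
            List<String> secondHalf = new ArrayList<>(suspects.subList(mid, suspects.size()));
            // If the failure shows up with only the first half compiled, the
            // culprit is in that half; otherwise it must be in the second half.
            suspects = failsWith.test(firstHalf) ? firstHalf : secondHalf;
        }
        return suspects.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical method names, not taken from a real compilation log.
        List<String> compiled = Arrays.asList(
                "Example.methodA", "Example.methodB", "Example.methodC",
                "Example.methodD", "Example.methodE");
        // Hypothetical check: pretend the failure needs Example.methodD to be compiled.
        Predicate<List<String>> failsWith = subset -> subset.contains("Example.methodD");
        System.out.println("suspect: " + findCulprit(compiled, failsWith));
    }
}
```

When the failure depends on timing or on the order of compilation, failsWith gives different answers for the same subset and the search can converge on the wrong method.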

I would like some tips on how to identify the issue earlier, preferably at the point where the bad log record is written to disk.  Is there any way to do this?
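
Short of catching the bad log record itself, one cheap check between iterations might be to scan a container file for pages that are entirely zero, like the page in the hex dump quoted in the issue description. The sketch below is only an illustration and rests on assumptions: that page N starts at offset N * pageSize, that the page size is the 4096-byte default unless told otherwise, and that the container from the error maps to a file under seg0 named after its hex id (for example, Container(0, 1073) would be c431.dat). An all-zero page is not proof of corruption, since it could simply be allocated but not yet used, so this only flags candidates.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

/** Sketch: list the pages of a container file that contain nothing but zero bytes. */
class ZeroPageScanner {

    /** Returns the numbers of all whole pages in the file that are entirely zero. */
    static List<Long> findZeroPages(String path, int pageSize) throws IOException {
        List<Long> zeroPages = new ArrayList<>();
        byte[] page = new byte[pageSize];
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            // Only whole pages are examined; a trailing partial page is ignored.
            long pageCount = raf.length() / pageSize;
            for (long p = 0; p < pageCount; p++) {
                raf.seek(p * pageSize);   // assumption: page p starts at p * pageSize
                raf.readFully(page);
                if (isAllZero(page)) {
                    zeroPages.add(p);
                }
            }
        }
        return zeroPages;
    }

    static boolean isAllZero(byte[] buf) {
        for (byte b : buf) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java ZeroPageScanner <container-file> [pageSize]");
            return;
        }
        // 4096 is assumed as the default page size; pass another value if the table differs.
        int pageSize = args.length > 1 ? Integer.parseInt(args[1]) : 4096;
        for (long p : findZeroPages(args[0], pageSize)) {
            System.out.println("all-zero page: " + p);
        }
    }
}
```

Running this against the suspect container after each iteration of the reproduction could at least show which step first leaves a zeroed page on disk.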



> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>         Attachments: derby.log, reproDerby4239.zip, wombat_with_keeplog.zip

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment:     (was: goodlogsizes.txt)

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>         Attachments: derby.log, derby.log, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip

[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712265#action_12712265 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

Looking again, I see nio is available in 1.4.2.  Does the Derby store behavior change in any way using 1.4.2 vs. 1.5?



> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, identifyBadContainer.ksh, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: reproBackgroundCheckpoint.zip

I am not sure if this is another bug, the same problem, or a combination of both, but the attached reproduction reproBackgroundCheckpoint.zip reproduces corruption on Windows with both IBM 1.6 and Sun JDK 1.6.

The reproduction is the same as the original one, except that it has a thread which continually runs checkpoints while the program runs, which makes a conflict between the normal checkpoint and the one initiated by the compress more likely.
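
The actual code is in the attached zip. As a rough idea of what such a checkpoint thread can look like, here is a minimal sketch; it is not the attached code. The class and interface names are made up for illustration, and the only Derby-specific piece is the SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE system procedure. The checkpoint call is kept behind a small interface so the loop itself can be exercised without a database.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch: a thread body that requests checkpoints in a tight loop until stopped. */
class BackgroundCheckpointer implements Runnable {

    /** One checkpoint request (hypothetical helper interface, for illustration only). */
    interface CheckpointAction {
        void checkpoint() throws Exception;
    }

    private final CheckpointAction action;
    private final AtomicBoolean running = new AtomicBoolean(true);

    BackgroundCheckpointer(CheckpointAction action) {
        this.action = action;
    }

    /** Builds an action that asks Derby for a checkpoint through its system procedure. */
    static CheckpointAction derbyCheckpoint(Connection conn) {
        return () -> {
            try (CallableStatement cs =
                     conn.prepareCall("CALL SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE()")) {
                cs.execute();
            }
        };
    }

    @Override
    public void run() {
        // Keep requesting checkpoints back to back so they overlap other work.
        while (running.get()) {
            try {
                action.checkpoint();
            } catch (Exception e) {
                e.printStackTrace();
                return;
            }
        }
    }

    /** Asks the loop to finish after the current checkpoint request. */
    void stop() {
        running.set(false);
    }
}

// Possible usage (the database name "wombat" is an assumption):
//   Connection conn = java.sql.DriverManager.getConnection("jdbc:derby:wombat");
//   BackgroundCheckpointer cp =
//       new BackgroundCheckpointer(BackgroundCheckpointer.derbyCheckpoint(conn));
//   Thread t = new Thread(cp);
//   t.setDaemon(true);
//   t.start();
```

The checkpoint thread would normally use its own connection, separate from the one running the test's inserts and compress.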

To run, compile the Java programs and run the script reprobckchkpt.ksh.  It may take a dozen iterations or so.

With the Sun JVM, I got the same error:
Caused by: ERROR XSDBB: Unknown page format at page Page(98,Container(0, 1024)), page dump follows: Hex dump:...

The exceptions with IBM 1.6 were different, though:
============= begin nested exception, level (4) ===========
java.io.EOFException: Reached end of file while attempting to read a whole page.
	at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(Unknown Source)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(Unknown Source)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.CachedPage.readPage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(Unknown Source)
	at org.apache.derby.impl.services.cache.ConcurrentCache.find(Unknown Source)
	at org.apache.derby.impl.store.raw.data.FileContainer.getAnyPage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.BaseContainer.getAnyPage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getAnyPage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.PageBasicOperation.findpage(Unknown Source)
	at org.apache.derby.impl.store.raw.data.PageBasicOperation.needsRedo(Unknown Source)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
	at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
	at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
	at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
	at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:316)
	at java.sql.DriverManager.getConnection(DriverManager.java:273)
	at CheckTables.main(CheckTables.java:8)

and

Caused by: ERROR XSLAM: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1882)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtPosition(LogToFile.java:2985)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtBeginning(LogToFile.java:2944)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Scan.java:704)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Scan.java:206)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1176)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)
	at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
	at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)
	... 7 more
Caused by: java.io.EOFException
	at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1869)
	... 33 more

============= begin nested exception, level (1) ===========
java.sql.SQLException: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(SQLExceptionFactory40.java:95)
	at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:201)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2614)
	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)
	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
	at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
	at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
	at java.sql.DriverManager.getConnection(DriverManager.java:316)
	at java.sql.DriverManager.getConnection(DriverManager.java:273)
	at CheckTables.main(CheckTables.java:8)
Caused by: java.sql.SQLException: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
	at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(SQLExceptionFactory40.java:119)
	at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(SQLExceptionFactory40.java:70)
	... 9 more
Caused by: ERROR XSLAM: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1882)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtPosition(LogToFile.java:2985)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtBeginning(LogToFile.java:2944)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Scan.java:704)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Scan.java:206)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1176)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)
	at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
	at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)
	... 7 more
Caused by: java.io.EOFException
	at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1869)
	... 33 more
============= end nested exception, level (1) ===========

============= begin nested exception, level (2) ===========
java.sql.SQLException: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
	at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(SQLExceptionFactory40.java:119)
	at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(SQLExceptionFactory40.java:70)
	at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:201)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2614)
	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)
	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
	at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
	at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
	at java.sql.DriverManager.getConnection(DriverManager.java:316)
	at java.sql.DriverManager.getConnection(DriverManager.java:273)
	at CheckTables.main(CheckTables.java:8)
Caused by: ERROR XSLAM: Cannot verify database format at {1} due to IOException.
	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1882)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtPosition(LogToFile.java:2985)
	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtBeginning(LogToFile.java:2944)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Scan.java:704)
	at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Scan.java:206)
	at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1176)
	at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)
	at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
	at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
	at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)
	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)
	... 7 more
Caused by: java.io.EOFException
	at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)
	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1869)
	... 33 more
============= end nested exception, level (2) ===========

============= begin nested exception, level (3) ===========

ERROR XSLAM: Cannot verify database format at {1} due to IOException.

	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)

	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1882)

	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtPosition(LogToFile.java:2985)

	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtBeginning(LogToFile.java:2944)

	at org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Scan.java:704)

	at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Scan.java:206)

	at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1176)

	at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)

	at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)

	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)

	at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)

	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)

	at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)

	at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)

	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)

	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)

	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)

	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)

	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)

	at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)

	at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)

	at java.sql.DriverManager.getConnection(DriverManager.java:316)

	at java.sql.DriverManager.getConnection(DriverManager.java:273)

	at CheckTables.main(CheckTables.java:8)

Caused by: java.io.EOFException

	at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)

	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1869)

	... 33 more

============= end nested exception, level (3) ===========

============= begin nested exception, level (4) ===========

java.io.EOFException

	at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)

	at org.apache.derby.impl.store.raw.log.LogToFile.verifyLogFormat(LogToFile.java:1869)

	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtPosition(LogToFile.java:2985)

	at org.apache.derby.impl.store.raw.log.LogToFile.getLogFileAtBeginning(LogToFile.java:2944)

	at org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Scan.java:704)

	at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Scan.java:206)

	at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1176)

	at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)

	at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)

	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)

	at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)

	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)

	at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)

	at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)

	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)

	at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)

	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)

	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)

	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)

	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)

	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)

	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)

	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)

	at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)

	at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)

	at java.sql.DriverManager.getConnection(DriverManager.java:316)

	at java.sql.DriverManager.getConnection(DriverManager.java:273)

	at CheckTables.main(CheckTables.java:8)

============= end nested exception, level (4) ===========

2009-05-25 02:28:24.156 GMT Thread[main,5,main] Less severe exception raised during cleanup (ignored) An attempt was made to close a transaction that was still active. The transaction has been aborted.

ERROR 40XT4: An attempt was made to close a transaction that was still active. The transaction has been aborted.

	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:276)

	at org.apache.derby.impl.store.raw.xact.Xact.close(Xact.java:1136)

	at org.apache.derby.impl.store.raw.xact.XactContext.cleanupOnError(XactContext.java:140)

	at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextManager.java:333)

	at org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(TransactionResourceImpl.java:419)

	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:584)

	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)

	at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)

	at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)

	at java.sql.DriverManager.getConnection(DriverManager.java:316)

	at java.sql.DriverManager.getConnection(DriverManager.java:273)

	at CheckTables.main(CheckTables.java:8)
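
The innermost cause in the nested exceptions above is a java.io.EOFException thrown from RandomAccessFile.readInt while LogToFile.verifyLogFormat reads the start of a log file. The sketch below is not Derby code; it only illustrates, under the assumption that the log file was empty or shorter than four bytes, how that JDK call behaves on such a file. The class name, method name, and temp file names are hypothetical.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TruncatedLogHeaderDemo {

    /**
     * Tries to read a 4-byte int from the start of the file.
     * Returns true if the read succeeded, false if the file ended first.
     */
    static boolean canReadLeadingInt(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.readInt(); // throws EOFException when fewer than 4 bytes remain
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        File empty = File.createTempFile("empty", ".dat");
        File full = File.createTempFile("full", ".dat");
        try {
            try (RandomAccessFile raf = new RandomAccessFile(full, "rw")) {
                raf.writeInt(12345); // arbitrary placeholder, not a real Derby format id
            }
            System.out.println("empty file readable: " + canReadLeadingInt(empty));
            System.out.println("4-byte file readable: " + canReadLeadingInt(full));
        } finally {
            empty.delete();
            full.delete();
        }
    }
}
```

If the assumption holds, comparing the sizes of the files in the database's log directory between a good and a bad run would be one way to spot the file that is too short.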




> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>
> I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The failure comes in oc_rec3 when it tries to connect to the database, but the actual problem seems to have occurred in the prior test, oc_rec2.  The problem is intermittent, happening in approximately one run in four.  I extracted the case from the harness and will attach the reproduction, which is run with the script repro.ksh.  The script loops up to 50 times until it gets the failure, which looks like this:
> ERROR XSLA7: Cannot redo operation null in the log.
> 	at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
> 	at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
> 	at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
> 	at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
> 	at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> 	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
> 	at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
> 	at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:311)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:268)
> 	at CheckTables.main(CheckTables.java:8)
> Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
> 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> <snip lots of 000's>
> I ran it with 10.3 and it completed all 50 iterations, so whether it is a JVM or a Derby issue, it seems to be new since 10.3 (I haven't tried 10.4).  Oddly, I have run the tests many times before on this machine with the 10.5.1.1 release and the same JVM and have never seen this failure, so I am looking into whether something changed on the machine or in the environment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: goodlogsizes.txt
                badlogsizes.txt

Reattaching goodlogsizes.txt and badlogsizes.txt, as the original ones were unreadable (still in EBCDIC).



[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712555#action_12712555 ] 

Knut Anders Hatlen commented on DERBY-4239:
-------------------------------------------

The cache manager that uses a ConcurrentHashMap is only loaded on JVM>=1.5, and this may affect the timing of the checkpoints.

I'm wondering if this could be related to the new background cleaner which is used in a different way than the old one. Does the problem reproduce if you make ConcurrentCache.getBackgroundCleaner() always return null?


[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712102#action_12712102 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

Looking at some corrupt databases, I saw issues with the TEST1 table, TEST1_IDX_INDCOL3, TEST1_IDX_INDCOL1, and TEST1_IDX_KEYCOL, so the corruption is not confined to a single container, but it always involves the TEST1 table or one of its indexes.
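
A rough way to look for zeroed pages like the Page(16,Container(0, 1073)) dump in the issue description is sketched below. This is not the attached identifyBadContainer.ksh and not Derby code. It assumes that a container's file in seg0 is named "c" plus the container id in hex plus ".dat", that page N starts at byte offset N times the page size, and that the caller knows the page size (4096 is only a guess). The class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class ZeroPageCheck {

    /** Assumed naming scheme for a container file, e.g. 1073 -> c431.dat. */
    static String containerFileName(long containerId) {
        return "c" + Long.toHexString(containerId) + ".dat";
    }

    /**
     * Returns true if every byte of the given page is zero.
     * Throws EOFException (an IOException) if the page lies past the end of the file.
     */
    static boolean isPageAllZero(File f, long pageNumber, int pageSize) throws IOException {
        byte[] buf = new byte[pageSize];
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(pageNumber * pageSize); // assumed layout: pages stored back to back
            raf.readFully(buf);
        }
        for (byte b : buf) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: ZeroPageCheck <file> <pageNumber> <pageSize>");
            System.out.println("container 1073 would be " + containerFileName(1073));
            return;
        }
        File f = new File(args[0]);
        long page = Long.parseLong(args[1]);
        int size = Integer.parseInt(args[2]);
        System.out.println("page " + page + " all zero: " + isPageAllZero(f, page, size));
    }
}
```

Under those assumptions, the page from the description would be checked with something like: java ZeroPageCheck wombat/seg0/c431.dat 16 4096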



[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714796#action_12714796 ] 

Knut Anders Hatlen commented on DERBY-4239:
-------------------------------------------

Do you think this could be the same as the problem reported in DERBY-3393? I've run the storerecovery suite once without the patch and twice with the patch. Without the patch, DERBY-3393 was reproduced, but none of the runs with the patch reproduced it.

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: derby.log

Kathey asked that I log suggestions about what to look at in corrupt dbs, so I am
logging them here.  I don't have any conclusions yet about what is happening; I am
just posting info as I see it.

The information I am posting is from looking at the wombat_with_keeplog.zip 
posted to this issue.  This is a good one to look at, as Kathey has managed to 
reproduce the problem and had it set up so that all transaction logs are kept.  This
means that it is possible to read through these records and get an exact history of
all "writing" operations.  There are a couple of ways to do this.  With a sane server
you can set derby.storage.keepTransactionLog=true (I am not sure whether this works
with an insane server).  A supported way of almost doing this is to take an online
backup, which I believe will then stop logs from being deleted - but note that online
backup may change the behavior of some operations, as documented in the backup
docs.


In this case we are debugging a reproducible boot error, so just starting up 
reproduces the error.  First I set the following, which will only work in a SANE
server:
derby.debug.true=LogTrace
This will dump a short description of each log record that is processed.  For 
recovery this will include all the records that are read at boot time.  Note this 
will not be all the records in the log, it will just dump out the ones looked at during
normal reboot.
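
As a minimal sketch of how the two settings mentioned above could be applied from
Java before the embedded engine boots: the class and method names here are
hypothetical, and only the two property names and values come from this comment.
Derby also reads the same keys from a derby.properties file.

```java
import java.util.Properties;

class DerbyDebugProps {
    /** Builds the two debugging properties discussed in this comment. */
    static Properties debugProperties() {
        Properties p = new Properties();
        // Keep all transaction log files so the full write history can be read.
        p.setProperty("derby.storage.keepTransactionLog", "true");
        // Trace each log record as it is processed; per the comment, SANE builds only.
        p.setProperty("derby.debug.true", "LogTrace");
        return p;
    }

    public static void main(String[] args) {
        Properties p = debugProperties();
        // Must happen before the first connection boots the database.
        for (String name : p.stringPropertyNames()) {
            System.setProperty(name, p.getProperty(name));
        }
        System.out.println(p);
    }
}
```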

The first thing I look for is the first occurrence of the error and, if it is nested,
the lowest nested error.  In this case I get the following.  From this I get that
recovery reboot is trying to redo a delete on page 16 of container 1073.  The page
version is 775, so there have been 775 updates to this page before this.  The instant
is basically the log record address (6,447354) - the first part, 6, means it is
in log6.dat; the second part is the byte offset into log6.dat:
Page(16,Container(0, 1073)) pageVersion 775 :  Delete : Slot=1 recordId=7 delete=true instant =
 (6,447354) logEnd = (6,447413) logIn at 25 available 18
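
The instant notation described above can be decoded mechanically.  The following is
only an illustrative sketch: the LogInstant class is hypothetical and is not part of
Derby; it simply applies the "(file number, byte offset)" reading given in this
comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogInstant {
    private static final Pattern FORM =
        Pattern.compile("\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)");

    final long fileNumber;
    final long byteOffset;

    LogInstant(long fileNumber, long byteOffset) {
        this.fileNumber = fileNumber;
        this.byteOffset = byteOffset;
    }

    /** Parses text such as "(6,447354)" as printed in the LogTrace output. */
    static LogInstant parse(String text) {
        Matcher m = FORM.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("not a log instant: " + text);
        }
        return new LogInstant(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)));
    }

    /** The first part of the instant selects the log file, e.g. 6 -> log6.dat. */
    String fileName() {
        return "log" + fileNumber + ".dat";
    }

    public static void main(String[] args) {
        LogInstant start = parse("(6,447354)");
        LogInstant end = parse("(6,447413)");
        System.out.println(start.fileName() + " @ " + start.byteOffset);
        // Bytes between this record's instant and its logEnd.
        System.out.println(end.byteOffset - start.byteOffset);
    }
}
```

In the trace for this database each of the three Delete records shown spans the same
number of bytes between instant and logEnd.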

From the stack, the operation is reading the page in from disk and getting a page
of all zeros when it expects to get a formatted page at pageVersion 774.
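
One way to confirm the "all zeros" observation directly on the container file is
sketched below.  This is an assumption-laden illustration, not a Derby tool: it
assumes page N starts at byte offset N * pageSize in the container file, and the
caller must supply the page size actually used by that container.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class ZeroPageCheck {
    /** Returns true if every byte of the page is zero. */
    static boolean isAllZero(byte[] page) {
        for (byte b : page) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /** Reads one page, assuming pages are laid out back to back from offset 0. */
    static byte[] readPage(String containerFile, long pageNumber, int pageSize)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(containerFile, "r")) {
            byte[] page = new byte[pageSize];
            raf.seek(pageNumber * pageSize);
            raf.readFully(page);
            return page;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: ZeroPageCheck <containerFile> <pageNumber> <pageSize>");
            return;
        }
        byte[] page = readPage(args[0], Long.parseLong(args[1]), Integer.parseInt(args[2]));
        System.out.println(isAllZero(page) ? "page is all zeros" : "page has data");
    }
}
```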

snip from log:
DEBUG LogTrace OUTPUT: scanned 64300 : Page Operation: Page(17,Container(0, 1041)) pageVersion 776 :  Delete : Slot=1 recordId=7 delete=true instant =
 (6,447236) logEnd = (6,447295) logIn at 25 available 18
DEBUG LogTrace OUTPUT: scanned 64300 : Page Operation: Page(16,Container(0, 1057)) pageVersion 775 :  Delete : Slot=1 recordId=7 delete=true instant =
 (6,447295) logEnd = (6,447354) logIn at 25 available 18
DEBUG LogTrace OUTPUT: scanned 64300 : Page Operation: Page(16,Container(0, 1073)) pageVersion 775 :  Delete : Slot=1 recordId=7 delete=true instant =
 (6,447354) logEnd = (6,447413) logIn at 25 available 18

------------  BEGIN SHUTDOWN ERROR STACK -------------
ERROR XSLA7: Cannot redo operation Page Operation: Page(16,Container(0, 1073)) pageVersion 775 :  Delete : Slot=1 recordId=7 delete=true in the log.
    at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)
    at org.apache.derby.impl.store.raw.log.FileLogger.redo(FileLogger.java:1525)
    at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:924)
    at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
    at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
    at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
    at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2021)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
    at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1858)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1724)
    at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
    at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
    at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)
    at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
    at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
    at java.sql.DriverManager.getConnection(DriverManager.java:316)
    at java.sql.DriverManager.getConnection(DriverManager.java:297)
    at org.apache.derby.impl.tools.ij.util.startJBMS(util.java:462)
    at org.apache.derby.impl.tools.ij.util.startJBMS(util.java:542)
    at org.apache.derby.impl.tools.ij.ConnectionEnv.init(ConnectionEnv.java:64)
    at org.apache.derby.impl.tools.ij.utilMain.initFromEnvironment(utilMain.java:164)
    at org.apache.derby.impl.tools.ij.Main.<init>(Main.java:225)
    at org.apache.derby.impl.tools.ij.Main.getMain(Main.java:189)
    at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:174)
    at org.apache.derby.impl.tools.ij.Main.main(Main.java:73)
    at org.apache.derby.tools.ij.main(ij.java:59)
Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
... complete page dump of a page of all ZERO's


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: derby.log, derby.log, reproDerby4239.zip, wombat_with_keeplog.zip
>
>

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: badlogsizes.txt
                goodlogsizes.txt
                wombat_keeplog_notcorrupt.zip

Attached is a database from a run where we were able to reconnect and run CheckTables successfully (wombat_keeplog_notcorrupt.zip), along with two text files, badlogsizes.txt and goodlogsizes.txt, showing the log file sizes for 4 good databases (able to reconnect with CheckTables) and 4 bad (corrupt) databases.  All were created with the attached reproduction and derby.storage.keepTransactionLog=true.

I notice the good ones all have 7 log files and the bad ones all have 6, but even between good databases the log sizes vary somewhat.   Why the difference?
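
For comparing databases this way, a listing like the ones in the attached text files
could be produced with something along these lines.  This is only a sketch: the
LogSizes class is hypothetical, and it assumes the log files are named log<N>.dat
inside the database's log directory (the default path "wombat/log" below is an
assumption based on the database name used in the attachments).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class LogSizes {
    /** Maps each log*.dat file name in logDir to its size in bytes. */
    static Map<String, Long> logFileSizes(Path logDir) throws IOException {
        // TreeMap sorts names as strings, so log10.dat would sort before log2.dat.
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(logDir, "log*.dat")) {
            for (Path f : files) {
                sizes.put(f.getFileName().toString(), Files.size(f));
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "wombat/log");
        if (!Files.isDirectory(dir)) {
            System.out.println("not a directory: " + dir);
            return;
        }
        Map<String, Long> sizes = logFileSizes(dir);
        System.out.println(sizes.size() + " log files");
        sizes.forEach((name, size) -> System.out.println(name + "\t" + size));
    }
}
```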

Also, I looked at the first corrupted database that I posted and found the corruption in the index TEST1_IDX_INDCOL2.

To map container number 1073 to the index, I doctored up the database so I could connect to it and then ran:
SELECT  C.CONGLOMERATENUMBER, C.CONGLOMERATENAME  FROM SYS.SYSCONGLOMERATES C WHERE CONGLOMERATENUMBER=1073;
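
The same lookup could be done from JDBC, as in the rough sketch below.  The
ContainerLookup class is hypothetical.  The query is the one shown above with the
number passed as a parameter; the file-name helper rests on the assumption that
container files in seg0 are named "c" plus the conglomerate number in hex plus ".dat".

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class ContainerLookup {
    /** Assumed on-disk name for a container, e.g. 1073 -> c431.dat. */
    static String containerFileName(long conglomerateNumber) {
        return "c" + Long.toHexString(conglomerateNumber) + ".dat";
    }

    /** Returns the conglomerate name for the given number, or null if none. */
    static String conglomerateName(Connection conn, long conglomerateNumber)
            throws SQLException {
        String sql = "SELECT C.CONGLOMERATENAME FROM SYS.SYSCONGLOMERATES C "
                   + "WHERE C.CONGLOMERATENUMBER = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, conglomerateNumber);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    public static void main(String[] args) {
        // No database needed for this part: just the file-name mapping.
        System.out.println(containerFileName(1073));
    }
}
```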

I will check other corrupt databases and see if it is corruption of the same index and  see if the reproduction works without the index.






> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

[jira] Commented: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714868#action_12714868 ] 

Kathey Marsden commented on DERBY-4239:
---------------------------------------

I was doing some more runs on Windows with the patch because I accidentally deleted the database/log with the EOFException and I got a hang.   I don't really understand what is getting hung up though.
This is with IBM 1.6 and ReproCorruptionBackgroundCheckpoint.

2XMFULLTHDDUMP Full thread dump J9 VM (J2RE 6.0 IBM J9 2.4 Windows XP x86-32 build jvmwi3260sr5-20090516_3558820090516_035588_lHdSMr, native threads):
3XMTHREADINFO      "JIT Compilation Thread" TID:0x41CB1900, j9thread_t:0x00292630, state:CW, prio=10
3XMTHREADINFO1            (native thread ID:0x1028, native priority:0xB, native policy:UNKNOWN)
3XMTHREADINFO      "Gc Slave Thread" TID:0x4209C500, j9thread_t:0x00292FC0, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0x1638, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO      "derby.antiGC" TID:0x41ED2100, j9thread_t:0x4276259C, state:CW, prio=1
3XMTHREADINFO1            (native thread ID:0x970, native priority:0x1, native policy:UNKNOWN)
4XESTACKTRACE          at java/lang/Object.wait(Native Method)
4XESTACKTRACE          at java/lang/Object.wait(Object.java:167)
4XESTACKTRACE          at org/apache/derby/impl/services/monitor/AntiGC.run(Bytecode PC:15)
4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
3XMTHREADINFO      "Thread-2" TID:0x42934500, j9thread_t:0x42762A64, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0x13E8, native priority:0x5, native policy:UNKNOWN)
4XESTACKTRACE          at java/lang/Object.wait(Native Method)
4XESTACKTRACE          at java/lang/Object.wait(Object.java:167)
4XESTACKTRACE          at java/util/Timer$TimerImpl.run(Timer.java:221)
3XMTHREADINFO      "derby.rawStoreDaemon" TID:0x42933300, j9thread_t:0x42762F2C, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0xF30, native priority:0x5, native policy:UNKNOWN)
4XESTACKTRACE          at java/lang/Object.wait(Native Method)
4XESTACKTRACE          at java/lang/Object.wait(Object.java:196(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/services/daemon/BasicDaemon.rest(Bytecode PC:3(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/services/daemon/BasicDaemon.run(Bytecode PC:22)
4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
3XMTHREADINFO      "Thread-6" TID:0x42933900, j9thread_t:0x42763190, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0x143C, native priority:0x5, native policy:UNKNOWN)
4XESTACKTRACE          at sun/nio/ch/FileChannelImpl.force0(Native Method)
4XESTACKTRACE          at sun/nio/ch/FileChannelImpl.force(FileChannelImpl.java:364(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/io/DirRandomAccessFile4.sync(Bytecode PC:77(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/raw/log/LogToFile.syncFile(Bytecode PC:77(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/raw/log/LogToFile.writeControlFile(Bytecode PC:394(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/raw/log/LogToFile.checkpointWithTran(Bytecode PC:341(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/raw/log/LogToFile.checkpoint(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/raw/RawStore.checkpoint(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/store/access/RAMAccessManager.checkpoint(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/db/BasicDatabase.checkpoint(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at org/apache/derby/catalog/SystemProcedures.SYSCS_CHECKPOINT_DATABASE(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at org/apache/derby/exe/acd381409ax0121x944ex95fex00000008ed900.g0(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at sun/reflect/GeneratedMethodAccessor3.invoke(Bytecode PC:16(Compiled Code))
4XESTACKTRACE          at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java(Compiled Code))
4XESTACKTRACE          at java/lang/reflect/Method.invoke(Method.java:578(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/services/reflect/ReflectMethod.invoke(Bytecode PC:6(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/sql/execute/CallStatementResultSet.open(Bytecode PC:6(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/sql/GenericPreparedStatement.executeStmt(Bytecode PC:6(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/sql/GenericPreparedStatement.execute(Bytecode PC:4(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/jdbc/EmbedStatement.executeStatement(Bytecode PC:4(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/jdbc/EmbedStatement.execute(Bytecode PC:152(Compiled Code))
4XESTACKTRACE          at org/apache/derby/impl/jdbc/EmbedStatement.execute(Bytecode PC:7(Compiled Code))
4XESTACKTRACE          at ReproCorruptionBackgroundCheckpoint$1.run(ReproCorruptionBackgroundCheckpoint.java:97(Compiled Code))
3XMTHREADINFO      "DestroyJavaVM helper thread" TID:0x41CB1300, j9thread_t:0x002923CC, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0x808, native priority:0x5, native policy:UNKNOWN)
NULL           ------------------------------------------------------------------------

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Mike Matrigali (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------


It looks like the problem occurs when compress asks for a checkpoint while another checkpoint is in progress. The current checkpoint code will either wait for the current checkpoint to finish or just return if one is already in progress. Compress needs a path that will wait for the current one to finish, start another one, and wait for that one to finish.

I'll concentrate on a patch for this. I am not sure whether the last stack that Kathey posted is this same bug.
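
To make the difference between the two behaviors concrete, here is a minimal sketch of a "wait for the current checkpoint, then run a fresh one" path next to the "wait or just return" path. It is an illustration only, not Derby's checkpoint code and not the attached patch: the class CheckpointGate, its method names, and the Runnable standing in for the checkpoint work are all hypothetical.

```java
/**
 * Hypothetical illustration only. This is NOT Derby's checkpoint
 * implementation; every name here is invented for the sketch.
 */
final class CheckpointGate {
    private boolean inProgress = false;      // guarded by "this"
    private final Runnable checkpointWork;   // stand-in for the real checkpoint

    CheckpointGate(Runnable checkpointWork) {
        this.checkpointWork = checkpointWork;
    }

    /**
     * "Wait or just return" behavior: if a checkpoint is already running,
     * optionally wait for it, but never start a second one.
     * Returns true only if this call performed a checkpoint itself.
     */
    boolean checkpoint(boolean waitIfBusy) throws InterruptedException {
        synchronized (this) {
            if (inProgress) {
                while (waitIfBusy && inProgress) {
                    wait();
                }
                return false;
            }
            inProgress = true;
        }
        runAndRelease();
        return true;
    }

    /**
     * The path a caller such as compress would need: wait for any running
     * checkpoint to finish, then run a new one and return only once that
     * new checkpoint has completed.
     */
    void checkpointAfterCurrent() throws InterruptedException {
        synchronized (this) {
            while (inProgress) {
                wait();
            }
            inProgress = true;
        }
        runAndRelease();
    }

    // Runs the work outside the lock, then wakes up any waiting callers.
    private void runAndRelease() {
        try {
            checkpointWork.run();
        } finally {
            synchronized (this) {
                inProgress = false;
                notifyAll();
            }
        }
    }
}
```

With checkpoint(false) or checkpoint(true), a caller that arrives while a checkpoint is running never gets a checkpoint that started after its own request. checkpointAfterCurrent() guarantees one, at the cost of blocking for up to two checkpoint durations.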

> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, derby_dumponly.zip, goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>


[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Posted by "Kathey Marsden (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: identifyBadContainer.ksh

I used this little script to identify the bad containers.  It should only be run on a copy of the corrupted db in case anything goes wrong.

You have to get the container number out of derby.log and then run:

identifyBadContainer.ksh <database> <containernumber>

It could be smarter: it could grep or awk the container number out of the log for you, determine whether this is a table or an index, and output just the name instead of the query output. But it sufficed for my purposes.
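
As a rough sketch of the "smarter" version, the container number can be pulled out of the XSDBB message text with a regular expression and then used to build a catalog lookup. The contents of identifyBadContainer.ksh are not shown in this thread, so the query below is only an assumption about what such a lookup might look like; the class and method names are hypothetical. The catalog tables SYS.SYSCONGLOMERATES and SYS.SYSTABLES are standard Derby system tables.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper, not part of Derby or of the attached script.
 */
final class BadContainerFinder {
    // Matches text such as "Container(0, 1073)" from the XSDBB error message.
    private static final Pattern CONTAINER =
            Pattern.compile("Container\\((\\d+),\\s*(\\d+)\\)");

    /** Returns the container number of the first match in the log text, if any. */
    static OptionalLong containerNumber(String logText) {
        Matcher m = CONTAINER.matcher(logText);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(2)))
                : OptionalLong.empty();
    }

    /**
     * Builds a query mapping a container number to its table or index.
     * Assumption: the container number corresponds to CONGLOMERATENUMBER.
     */
    static String lookupQuery(long containerNumber) {
        return "SELECT t.TABLENAME, c.CONGLOMERATENAME, c.ISINDEX "
                + "FROM SYS.SYSCONGLOMERATES c "
                + "JOIN SYS.SYSTABLES t ON c.TABLEID = t.TABLEID "
                + "WHERE c.CONGLOMERATENUMBER = " + containerNumber;
    }

    public static void main(String[] args) {
        String line = "Caused by: ERROR XSDBB: Unknown page format at page "
                + "Page(16,Container(0, 1073)), page dump follows: Hex dump:";
        containerNumber(line)
                .ifPresent(n -> System.out.println(lookupQuery(n)));
    }
}
```

The generated query would still have to be run through ij or JDBC, and, as with the script, only against a copy of the corrupted database.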


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, identifyBadContainer.ksh, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>
