You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Olav Sandstaa (JIRA)" <ji...@apache.org> on 2007/01/18 22:52:30 UTC

[jira] Commented: (DERBY-2254) Assert during log file switch: log file position exceeded max log file size

    [ https://issues.apache.org/jira/browse/DERBY-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465881 ] 

Olav Sandstaa commented on DERBY-2254:
--------------------------------------

The crash happened when running simple transactions (3 updates, 1
insert, 1 select) against a database with 10 GB user data (total data
volume 17 GB). Derby was configured with a 18 GB database buffer. The
crash occured about 2 hours after re-starting an existing
database. The load consisted of 64 clients running transactions
back-to-back. Since the database was "just" started most of the
transactions had to read data from the disk, thus there were about 60
concurrent read requests for getting data into the database buffer. At
the same time the single back-ground thread was having a hard time to
keep up writing out dirty data to the disk. At this time there were
plenty of free room in the page cache so no data blocks were "evicted"
out to disk by the user threads.

Since the background thread were not able to flush data pages fast
enough it was also unable to reclaim log files. So the amount of log
kept growing. 

One thing I do not understand is that most of the log files have their
normal size while some are very close to the max log file size. Here
is a list of the files in the log directory after the crash:


-rw-r-----   1 olav       48 Jan 17 22:09 log.ctrl
-rw-r-----   1 olav       48 Jan 17 22:09 logmirror.ctrl
-rw-r-----   1 olav  1060411 Jan 17 22:09 log1181.dat
-rw-r-----   1 olav  1067038 Jan 17 22:09 log1182.dat
-rw-r-----   1 olav  1049488 Jan 17 22:09 log1183.dat
-rw-r-----   1 olav  1050176 Jan 17 22:09 log1184.dat
-rw-r-----   1 olav  1050758 Jan 17 22:10 log1185.dat
-rw-r-----   1 olav  1050583 Jan 17 22:10 log1186.dat
-rw-r-----   1 olav  1050178 Jan 17 22:10 log1187.dat
-rw-r-----   1 olav  1054465 Jan 17 22:10 log1188.dat
-rw-r-----   1 olav  1051730 Jan 17 22:11 log1189.dat
-rw-r-----   1 olav  1052601 Jan 17 22:11 log1190.dat
-rw-r-----   1 olav  268435436 Jan 17 22:55 log1191.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1192.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1193.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1194.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1195.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1196.dat
-rw-r-----   1 olav  1048576 Jan 17 22:55 log1197.dat
-rw-r-----   1 olav  268435434 Jan 17 23:23 log1198.dat
-rw-r-----   1 olav  1048576 Jan 17 23:23 log1199.dat
-rw-r-----   1 olav  1048576 Jan 17 23:23 log1200.dat
-rw-r-----   1 olav  1048576 Jan 17 23:23 log1201.dat
-rw-r-----   1 olav  1048576 Jan 17 23:23 log1202.dat
-rw-r-----   1 olav  268435429 Jan 17 23:46 log1203.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1204.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1205.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1206.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1207.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1208.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1209.dat
-rw-r-----   1 olav  1048576 Jan 17 23:46 log1210.dat
-rw-r-----   1 olav  268435456 Jan 18 00:09 log1211.dat
-rw-r-----   1 olav        0 Jan 18 00:09 log1212.dat


In org.apache.derby.impl.store.raw.log.LogCounter the following constant is defined

	// reserve top 4 bits in log file size for future use
	public static final long MAX_LOGFILE_SIZE	    = (long)0x0FFFFFFFL; // 268435455

and the assert is defined as:

        SanityManager.ASSERT(filepos < MAX_LOGFILE_SIZE,
                             "log file position exceeded max log file size");

Comparing the constant MAX_LOGFILE_SIZE (268435455) with the length of
the second last file above (268435456) it actually seems like this log
file has become a byte too long.


> Assert during log file switch: log file position exceeded max log file size
> ---------------------------------------------------------------------------
>
>                 Key: DERBY-2254
>                 URL: https://issues.apache.org/jira/browse/DERBY-2254
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.3.0.0
>         Environment: Solaris 10, Java SE 6 build 104 
>            Reporter: Olav Sandstaa
>
> When running simple tpc-b like transactions against a embedded Derby based on a SANE build of trunk the following assertion occurs for the background thread and all user threads:
>    org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position exceeded max log file size
> This seems to occur during a switch to a new log file.
> derby.log contains the following call stack for the background thread:
> Exception trace: 
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(LogToFile.java:1540)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(LogToFile.java:1357)
> 	at org.apache.derby.impl.store.raw.RawStore.checkpoint(RawStore.java:439)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.performWork(LogToFile.java:3416)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
> 	at java.lang.Thread.run(Thread.java:619)
> 2007-01-17 23:09:48.638 GMT Thread[derby.rawStoreDaemon,5,derby.daemons] Cleanup action starting
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(LogToFile.java:1540)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(LogToFile.java:1357)
> 	at org.apache.derby.impl.store.raw.RawStore.checkpoint(RawStore.java:439)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.performWork(LogToFile.java:3416)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
> 	at java.lang.Thread.run(Thread.java:619)
> Cleanup action completed
> For my user threads the call stack is similar:
> Database Class Loader started - derby.database.classpath=''
> 2007-01-17 23:09:36.401 GMT Thread[Thread-51,5,main] (XID = 12632406), (SESSIONID = 51), (DATABASE = /export/home/tmp/derby-db), (DRDAID = null), Cleanup action starting
> 2007-01-17 23:09:36.401 GMT Thread[Thread-51,5,main] (XID = 12632406), (SESSIONID = 51), (DATABASE = /export/home/tmp/derby-db), (DRDAID = null), Failed Statement is: UPDATE accounts SET abal = abal + ? WHERE aid = ? AND bid = ?
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.data.LoggableActions.doAction(LoggableActions.java:221)
> 	at org.apache.derby.impl.store.raw.data.LoggableActions.actionUpdate(LoggableActions.java:85)
> 	at org.apache.derby.impl.store.raw.data.StoredPage.doUpdateAtSlot(StoredPage.java:8521)
> 	at org.apache.derby.impl.store.raw.data.BasePage.updateAtSlot(BasePage.java:1108)
> 	at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:479)
> 	at org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:523)
> 	at org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:566)
> 	at org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:260)
> 	at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:358)
> 	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1182)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1652)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:299)
> 	at com.sun.derby.perf.clients.tpcb.DBConnection.performTransaction(DBConnection.java:595)
> 	at com.sun.derby.perf.clients.tpcb.Client.run(Client.java:218)
> After this it seems like no user threads are able to connect to the database (not surpricing).
> This happend using a SANE build and I am unsure by just looking at the assertion in the code whether this situation would have been fatal also if I had been using an INSANE build.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira