Posted to derby-dev@db.apache.org by "Øystein Grøvlen (JIRA)" <de...@db.apache.org> on 2005/05/24 15:57:10 UTC

[jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.

     [ http://issues.apache.org/jira/browse/DERBY-298?page=comments#action_66168 ]
     
Øystein Grøvlen commented on DERBY-298:
---------------------------------------

Looking at the code, I became a bit confused about the definition of an empty log file.   Scan.getNextRecordForward contains debug output when it detects an empty log file.  It will then return without setting knownGoodLogEnd.  Hence, new log records will be written to the end of the previous file.  As Suresh says this is probably to be able to handle crashes during log switch.

However, this is not what happens when I run the recovery part of the example in this report.  Since currentLogFileLength is a large number, it detects "zapped log end on log file", goes on to the next file, which does not exist, and returns.  (Who sets the length of a log file?  Is this the maximum size until a log switch is performed?)  The effect is the same, but this cannot be used to detect an empty log file and apply the solution proposed by Suresh.  Instead, one would have to do some hairy file handling at a later stage.

An alternative way to fix this would be to just create a dummy log record in the new log file as part of the backup command.  This would make the redo scan end in the new log file.  However, this will not work for those who do backup with OS-commands (i.e., copy the files directly).

I would also think it should be possible to do the log switch in such a way that it is possible to detect during recovery whether the log switch had completed or not.  If this was the case, one could just set knownGoodLogEnd of the redo scan to the start of the empty file if the log switch was completed.  Does anyone know if this is possible?




> rollforward will not work correctly  if the system happens to crash immediately after rollforward backup.
> ---------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-298
>          URL: http://issues.apache.org/jira/browse/DERBY-298
>      Project: Derby
>         Type: Bug
>   Components: Store
>     Versions: 10.0.2.1
>     Reporter: Suresh Thalamati
>     Assignee: Øystein Grøvlen
>     Priority: Minor

>
> If the system crashes after a rollforward backup, the last log file 
> is empty (say log2.dat). On the next crash-recovery, the system ignores the empty log 
> file and starts writing to the previous log (say log1.dat), 
> even though there was a successful log file switch before the crash. 
> The reason I believe it is done this way is to avoid special 
> handling of crashes during the log switch process. 
> The problem is that on rollforward restore from a backup, log1.dat will get overwritten 
> by the copy in the backup, so any transactions that got added to log1.dat
> after the backup was taken will be lost. 
>  
> One possible solution that comes to my mind to solve this problem is: 
>  1) Check if an empty log file exists after redo crash-recovery, if 
>      the log archive mode is enabled.
>  2) If it exists, delete it and do the log file switch again. 
>  
> Repro:
> connect 'jdbc:derby:wombat;create=true';
> create table t1(a int ) ;
> insert into t1 values(1) ;
> insert into t1 values(2) ;
> call SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(
>     'extinout/mybackup', 0);
> --crash (NO LOG RECORDS WENT IN AFTER THE BACKUP).
> connect 'jdbc:derby:wombat';
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> insert into t1 select a*2 from t1 ;
> select count(*) from t1 ;
> --exit from jvm and restore from backup
> connect
> 'jdbc:derby:wombat;rollForwardRecoveryFrom=extinout/mybackup/wombat';
> select count(*) from t1 ;  -- THIS WILL GIVE INCORRECT VALUES

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.

Posted by Suresh Thalamati <su...@gmail.com>.
Øystein Grøvlen wrote:

>>>>> "ST" == Suresh Thalamati <su...@gmail.com> writes:
>
>    ST> derby has two types  of log files , one that works  in RWS mode with a
>    ST> preallocated log file ,
>
>What is RWS mode?
>  
>
RWS mode is writing the transaction log using the write-sync mechanism 
supported by java.io.RandomAccessFile from jdk1.4.2 onwards. When a 
file is opened with "rws", all writes to the file are synced to the 
disk on the write call itself.
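
For readers unfamiliar with the mode string, here is a minimal sketch of
"rws" behavior. The file name and the value written are illustrative,
not anything from Derby's log code:

```java
import java.io.File;
import java.io.RandomAccessFile;

public class RwsDemo {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("logdemo", ".dat");
        f.deleteOnExit();

        // "rws" syncs both file content and metadata to the device on
        // every write, so no separate FileDescriptor.sync() call is
        // needed per log record.
        RandomAccessFile log = new RandomAccessFile(f, "rws");
        log.writeInt(42);   // durable by the time this call returns
        log.close();

        RandomAccessFile check = new RandomAccessFile(f, "r");
        System.out.println(check.readInt());   // 42
        check.close();
    }
}
```

("rwd" is a related mode that syncs only file content, not metadata,
which can be cheaper when the file length does not change.)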

>    ST> and  one which  uses  file sync  with  out preallocation.  In case  of
>    ST> preallocation  , zeros  are  written to  the  log file  to the  length
>    ST> specified  by the logSwitchInterval  (Default is  1 MB)  , it  is also
>    ST> configurable by the user.
>
>What determines which mode is used? 
>  
>
Write sync is the default mode when Derby is run on a JVM at 
jdk1.4.2 or later. Users can also tell the engine to use 
file sync mode, for example to avoid the DERBY-1 problem on Mac OS. 


>    ST> This  problem  can  also  be  fixed  by  writing  dummy  log  record.  
>    ST> Filelogger.redo() code  has to be fixed to  understand this, currently
>    ST> it looks at the logEnd only when a good log records is read.
>
>I was thinking that the "dummy" log record should be a "good" log
>record so that the current implementation of redo() need not change.
>    ST> Another possible  solution I  was thinking to  identify whether  a log
>    ST> switch is good  or not is by writing  a INT (4 bytes of  zeros ) after
>    ST> log file  initialization. As 512 bytes  writes suppose to  be atomic. 
>    ST> if  (log file  length >  = LOG_FILE_HEADER_SIZE(24)  +4) then  the log
>    ST> switch before the crash can be fixed as good one and fix the scan code
>    ST> to use the empty log file instead of writing to the previous log file.
>
>I am not sure I understood this.  Do you suggest writing the INT just
>after the header? 
>
Yes.  If the header is written to disk completely, then it is 
safe to say the log switch completed successfully.

Basically:
 1) Write the header and do a file sync, as is done currently.
 2) Write an INT (4 bytes of zeros) and do a file sync.

On scan, if the log file size is >= LOG_FILE_HEADER_SIZE(24) + 4, 
then the log file switch was successful before the crash, 
and the file can be used as the next log file.
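
The two steps could be sketched roughly as below. This is a toy
illustration, not Derby's actual log code: the header content is
placeholder bytes, and LOG_FILE_HEADER_SIZE is simply the value (24)
quoted in this thread:

```java
import java.io.File;
import java.io.RandomAccessFile;

public class LogSwitchDemo {
    static final int LOG_FILE_HEADER_SIZE = 24;  // value quoted in the thread

    // Steps 1 and 2 of the proposal: write the header, sync, then
    // write a 4-byte zero marker and sync again.
    static void initLogFile(File f) throws Exception {
        RandomAccessFile log = new RandomAccessFile(f, "rw");
        log.write(new byte[LOG_FILE_HEADER_SIZE]); // placeholder header bytes
        log.getFD().sync();
        log.writeInt(0);                           // 4 bytes of zeros
        log.getFD().sync();
        log.close();
    }

    // Recovery-time check: if the file holds the full header plus the
    // marker, the log switch completed before the crash and the file
    // can be used as the next log file.
    static boolean switchCompleted(File f) {
        return f.length() >= LOG_FILE_HEADER_SIZE + 4;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("logswitch", ".dat");
        f.deleteOnExit();
        initLogFile(f);
        System.out.println(switchCompleted(f));   // true
    }
}
```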


> Would this work for preallocated log files?  
>
I think so. Preallocation is done only for performance reasons. If for some reason the preallocation did not complete, recovery should still work fine; it is just that log writes will be slower until the next switch. 


>Would
>not the length always be 1 MB (default) in that case?
>
>  
>
Not sure I understand this question completely. The length will not be 1 MB 
(default) if the crash occurs after log file initialization but before 
preallocation; it will be just LOG_FILE_HEADER_SIZE(24) + 4.


Thanks
-suresh




Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
>>>>> "ST" == Suresh Thalamati <su...@gmail.com> writes:

    ST> derby has two types  of log files , one that works  in RWS mode with a
    ST> preallocated log file ,

What is RWS mode?

    ST> and  one which  uses  file sync  with  out preallocation.  In case  of
    ST> preallocation  , zeros  are  written to  the  log file  to the  length
    ST> specified  by the logSwitchInterval  (Default is  1 MB)  , it  is also
    ST> configurable by the user.

What determines which mode is used? 

    ST> This  problem  can  also  be  fixed  by  writing  dummy  log  record.  
    ST> Filelogger.redo() code  has to be fixed to  understand this, currently
    ST> it looks at the logEnd only when a good log records is read.

I was thinking that the "dummy" log record should be a "good" log
record so that the current implementation of redo() need not change.

    ST> Another possible  solution I  was thinking to  identify whether  a log
    ST> switch is good  or not is by writing  a INT (4 bytes of  zeros ) after
    ST> log file  initialization. As 512 bytes  writes suppose to  be atomic. 
    ST> if  (log file  length >  = LOG_FILE_HEADER_SIZE(24)  +4) then  the log
    ST> switch before the crash can be fixed as good one and fix the scan code
    ST> to use the empty log file instead of writing to the previous log file.

I am not sure I understood this.  Do you suggest writing the INT just
after the header?  Would this work for preallocated log files?  Would
not the length always be 1 MB (default) in that case?

-- 
Øystein


Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.

Posted by Suresh Thalamati <su...@gmail.com>.
Øystein Grøvlen (JIRA) wrote:

>     [ http://issues.apache.org/jira/browse/DERBY-298?page=comments#action_66168 ]
>     
>Øystein Grøvlen commented on DERBY-298:
>---------------------------------------
>
>Looking at the code, I became a bit confused about the definition of an empty log file.   Scan.getNextRecordForward contains debug output when it detects an empty log file.  It will then return without setting knownGoodLogEnd.  Hence, new log records will be written to the end of the previous file.  As Suresh says this is probably to be able to handle crashes during log switch.
>
>However, this is not what happens when I run the recovery part of the example in this report.  Since, currentLogFileLength is a large number, it detects "zapped log end on log file", goes on to the next file, which does not exist, and returns.  (Who sets the length of a log file?  Is this maximum size until a log switch is performed?)  The effect is the same, but this can not be used to detect an empty log file and apply the solution proposed by Suresh.  Instead, one would have to do some hairy file handling at a later stage.
>
>  
>
Derby has two types of log files: one that works in RWS mode with a 
preallocated log file, and one which uses file sync without 
preallocation. In the case of preallocation, zeros are written to the 
log file up to the length specified by logSwitchInterval (the default 
is 1 MB), which is also configurable by the user.

An empty log file cannot be identified based on the length alone. The only 
way to declare a log file empty/fresh is when no log records are found 
in the file during the recovery scan. As you noticed, because it is a 
preallocated file, when the scan finds zeros it declares that there are 
no more log records. Any fix for this problem has to handle both the 
preallocated and non-preallocated cases.
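
To illustrate why length alone cannot distinguish an empty preallocated
log file, here is a toy sketch. It assumes a hypothetical record layout
where a 4-byte record-length word follows the header and a value of zero
means no records; Derby's real log record format is more involved:

```java
import java.io.File;
import java.io.RandomAccessFile;

public class EmptyLogDemo {
    static final int HEADER = 24;               // header size quoted in the thread
    static final int SWITCH_INTERVAL = 1 << 20; // 1 MB default, per the thread

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("prealloc", ".dat");
        f.deleteOnExit();

        // Preallocate: write zeros up to logSwitchInterval, as described above.
        RandomAccessFile log = new RandomAccessFile(f, "rw");
        log.write(new byte[SWITCH_INTERVAL]);
        log.close();

        // Length alone cannot identify an empty log file: it is already 1 MB.
        System.out.println(f.length() == SWITCH_INTERVAL);  // true

        // What works: scan past the header and read the first
        // record-length word; zero means no log records were written.
        RandomAccessFile scan = new RandomAccessFile(f, "r");
        scan.seek(HEADER);
        boolean empty = scan.readInt() == 0;
        scan.close();
        System.out.println(empty);  // true
    }
}
```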

Actually, I don't like the idea of creating a new log file on boot, even 
for special conditions, because spending more time at boot than required 
does not make our users happy! If you can find a fix that does not 
require creating a new log file, that would be great.
 

>An alternative way to fix this would be to just create a dummy log record in the new log file as part of the backup command.  This would make the redo scan end in the new log file.  However, this will not work for those who do backup with OS-commands (i.e., copy the files directly).
>  
>
Backup with OS commands should not be a problem, because there is no 
support for performing rollforward recovery with that type of backup. If 
it is just a plain backup, it does not matter how logs are written 
after the backup, because they are never used to do a restore.

This problem can also be fixed by writing a dummy log record. The 
FileLogger.redo() code has to be fixed to understand this; currently it 
looks at the logEnd only when a good log record is read.

>I would also think it should be possible to do the log switch in such a way that it is possible to detect during recovery whether the log switch had completed or not.  If this was the case, one could just set knownGoodLogEnd of the redo scan to the start of the empty file if the log switch was completed.  Does anyone know if this is possible?
>
>  
>
Yes, it should be, with some changes to the redo/scan code.

Another possible solution I was thinking of, to identify whether a log 
switch is good or not, is to write an INT (4 bytes of zeros) after 
log file initialization, since 512-byte writes are supposed to be atomic. 
If (log file length >= LOG_FILE_HEADER_SIZE(24) + 4), then the log 
switch before the crash can be considered a good one, and the scan code 
can be fixed to use the empty log file instead of writing to the 
previous log file.


Thanks
-suresht