Posted to issues@activemq.apache.org by "Alan Protasio (JIRA)" <ji...@apache.org> on 2018/10/20 04:42:00 UTC
[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

    [ https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657510#comment-16657510 ] 

Alan Protasio edited comment on AMQ-7080 at 10/20/18 4:41 AM:
--------------------------------------------------------------

[~gtully] [~jgenender]  [~cshannon]

Guys, I figured out a way to do it without any performance hit.

So I'm doing it the other way around: I'm writing the nextTxid and the sequenceSet hash at the end of the db.free file.

The nextTxid shows whether db.data and db.free are in sync, and the hash shows whether db.free was fully written (a check for partial writes).

This change is also backward compatible: if this metadata is not present at the end of the db.free file, I just ignore it (setting hashcheckpoint and nextTxid to -1).
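
To make the trailer idea concrete, here is a minimal sketch of writing and checking such metadata. It is not the code from the patch: the class and method names (FreeListTrailerSketch, write, read, canTrust) are made up for illustration, the free list is modeled as a plain long[] rather than a SequenceSet, and the byte layout is an assumption, not the real db.free format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical sketch, not the KahaDB on-disk format: the free list is a long[].
final class FreeListTrailerSketch {

    /** Parsed db.free content; nextTxId and hash stay -1 when the trailer is absent. */
    static final class Loaded {
        final long[] freePages;
        final long nextTxId;
        final int hash;

        Loaded(long[] freePages, long nextTxId, int hash) {
            this.freePages = freePages;
            this.nextTxId = nextTxId;
            this.hash = hash;
        }
    }

    /** Serializes the free list, then appends nextTxId and a hash of the list. */
    static byte[] write(long[] freePages, long nextTxId) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(freePages.length);
        for (long page : freePages) {
            out.writeLong(page);
        }
        out.writeLong(nextTxId);
        out.writeInt(Arrays.hashCode(freePages));
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads the free list; a file without the full trailer yields -1 for both fields. */
    static Loaded read(byte[] fileBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileBytes));
        long[] freePages = new long[in.readInt()];
        for (int i = 0; i < freePages.length; i++) {
            freePages[i] = in.readLong(); // EOFException here means a truncated list
        }
        long nextTxId = -1;
        int hash = -1;
        if (in.available() >= Long.BYTES + Integer.BYTES) {
            nextTxId = in.readLong();
            hash = in.readInt();
        }
        return new Loaded(freePages, nextTxId, hash);
    }

    /** db.free is trusted only if it matches db.data's txid and its own hash. */
    static boolean canTrust(Loaded loaded, long dataNextTxId) {
        // Assumption: real transaction ids are never negative, so -1 means "no trailer".
        return loaded.nextTxId >= 0
                && loaded.nextTxId == dataNextTxId
                && loaded.hash == Arrays.hashCode(loaded.freePages);
    }
}
```

In this sketch an old-format file (no trailer) or a file cut off inside the trailer simply fails canTrust, so the caller would fall back to scanning the index as before.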

So at the checkpoint I only serialize the freeList (with the metadata) into a ByteArrayOutputStream and do the actual write asynchronously (see storeFreeListAsync). storeFreeListAsync makes sure that the bytes represent the sequence set as of checkpoint time, but does the actual write async (not blocking the checkpoint). In the recovery path, we have ways of knowing whether db.free is up to date, was fully written, and is in sync with db.data.
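
The "snapshot at checkpoint, write later" pattern could look roughly like the sketch below. This is an illustration under assumptions, not the actual storeFreeListAsync: the class AsyncFreeListWriterSketch, its storeAsync method, and the use of a single-threaded executor are all hypothetical choices.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the checkpoint thread hands over bytes, a worker writes them.
final class AsyncFreeListWriterSketch implements AutoCloseable {
    // One worker thread, so snapshots reach the file in checkpoint order.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final Path freeFile;

    AsyncFreeListWriterSketch(Path freeFile) {
        this.freeFile = freeFile;
    }

    /**
     * Called on the checkpoint thread with the already serialized free list.
     * The bytes are copied first, so later changes by the caller cannot
     * alter what ends up on disk.
     */
    Future<?> storeAsync(byte[] serializedAtCheckpoint) {
        byte[] snapshot = serializedAtCheckpoint.clone();
        return writer.submit(() -> {
            Files.write(freeFile, snapshot); // the slow part, off the checkpoint thread
            return null;
        });
    }

    @Override
    public void close() {
        writer.shutdown(); // already queued writes still run
    }
}
```

The checkpoint only pays for the in-memory copy. If the process dies before the queued write finishes, the file is stale or partial, which is exactly what the nextTxid and hash metadata are meant to detect on recovery.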

One thing that I found strange, though: when a full recovery is performed, the number of free pages can differ from the original one. After a clean shutdown (current implementation), the free pages will not be the same as when we scan the whole index after an unclean shutdown.

For instance, in the test "testFreePageRecoveryUncleanShutdown", if we compare pf2.getFreePageCount() and pf.getFreePageCount(), the numbers will not be the same.

So this change has the same behaviour as a clean shutdown.


was (Author: alanprot):
[~gtully] [~jgenender]  [~cshannon]

Guys, I figured out a way to do it without any performance hit.

So I'm doing it the other way around: I'm writing the nextTxid and the sequenceSet hash at the end of the db.free file.

The nextTxid shows whether db.data and db.free are in sync, and the hash shows whether there was a partial write (I created a test for both cases).

This change is also backward compatible: if this metadata is not present at the end of the db.free file, I just ignore it (setting the hash and nextTxid to -1).

So now, in the checkpoint, I only serialize the freeList (with the metadata) into a ByteArrayOutputStream and write the file async (see storeFreeListAsync).

storeFreeListAsync makes sure that the bytes represent the sequence set as of checkpoint time, but the write can be done later. In the recovery path, we have ways of knowing whether db.free is up to date and was fully written last time.

 

One thing that I found strange, though: when a full recovery is performed, the number of free pages can differ from the original one.

For instance, in the test "testFreePageRecoveryUncleanShutdown", if we compare pf2.getFreePageCount() and pf.getFreePageCount(), the numbers will not be the same.

 

> Keep track of free pages - Update db.free file during checkpoints
> -----------------------------------------------------------------
>
>                 Key: AMQ-7080
>                 URL: https://issues.apache.org/jira/browse/AMQ-7080
>             Project: ActiveMQ
>          Issue Type: Improvement
>          Components: KahaDB
>    Affects Versions: 5.15.6
>            Reporter: Alan Protasio
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>             Fix For: 5.16.0, 5.15.7
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the free pages in the index. In order to recover this information, ActiveMQ reads the whole index during shutdown, searching for free pages, and then saves the db.free file. This operation can take a long time, making the failover slower (during the shutdown, ActiveMQ will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide high availability such that if a broker is killed, another broker can take over immediately."
> {quote}
> It is important to note that if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The broker will stay in this state unless the index is deleted (this state means that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you increase this time to 5 minutes, your failover can take more than 5 minutes).
>  
> In order to prevent ActiveMQ from reading the whole index file to search for free pages, we can keep track of those on every checkpoint. In order to do that, we need to be sure that db.data and db.free are in sync. To achieve that, we can have an attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the db.free. If this is the case, we can safely use the free page information contained in the db.free.
> Now, the only case where we have to read the whole index file again is if the crash happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free during the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in db.free, as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after the Pagefile#load, P1 will be free, and then the replay will mark P1 as occupied or will occupy another page (now that the recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and show the reality at T1 (checkpoint). If they are in sync, we can trust the db.free.
> This is a really quick draft of what I'm suggesting. If you guys agree, I can create the proper patch afterwards:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5] 
> This is related to https://issues.apache.org/jira/browse/AMQ-6590



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)