Posted to issues@trafodion.apache.org by "Oliver Bucaojit (JIRA)" <ji...@apache.org> on 2015/09/14 23:20:45 UTC

[jira] [Updated] (TRAFODION-34) Support region re-balancing with transactions active

     [ https://issues.apache.org/jira/browse/TRAFODION-34?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oliver Bucaojit updated TRAFODION-34:
-------------------------------------
    Component/s: dtm

Updated the design with more implementation details.

I have code changes that implement both the balance and split work.  I have completed the following testing:
Balance --
Loading a table through sqlci while disabling and then re-enabling the table
- The client retries until the region is re-enabled (configurable wait limit, currently 15 sec)
Having 2 RegionServers on a workstation and running the balancer while the table is being loaded
- Verified that the region was moved to a different RegionServer during the load
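The client-side retry described in the first balance test can be sketched roughly as follows. This is only a minimal illustration, not the actual Trafodion client code: RegionUnavailableException, RetryUntilOnline and the pause interval are hypothetical names and parameters, and only the 15 second wait limit comes from the test description.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical stand-in for "region is disabled or moving"; not an HBase or Trafodion class. */
class RegionUnavailableException extends RuntimeException {
    RegionUnavailableException(String message) {
        super(message);
    }
}

/** Hypothetical helper: retries an operation until it succeeds or a wait limit expires. */
final class RetryUntilOnline {
    static <T> T call(Supplier<T> op, long waitLimitMillis, long pauseMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitLimitMillis);
        while (true) {
            try {
                return op.get();
            } catch (RegionUnavailableException e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // wait limit exceeded: surface the last failure to the caller
                }
            }
            try {
                Thread.sleep(pauseMillis); // back off before the next attempt
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                throw new IllegalStateException("interrupted while waiting for region", ie);
            }
        }
    }
}
```

A caller would wrap each transactional operation, for example RetryUntilOnline.call(() -> doPut(row), 15_000L, 100L), where doPut is a placeholder for the real operation and the 100 ms pause is an arbitrary choice.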
               
Split --
For the split test, I created a table and set MAX_FILESIZE and MEMSTORE_FLUSHSIZE to a small value, 4342177, so that it splits under a small load
a)	Load the table until it splits into 2 regions, then check that the number of rows in the table matches the number of rows inserted
b)	Load the table from 3 separate sessions until it splits into 2 regions
c)	Repeat the loading and monitor the region count; 11 region splits were tested successfully

I plan to review the code with the team and work on merging it with the current branch.  I will also continue functional testing and regression testing.


> Support region re-balancing with transactions active
> ----------------------------------------------------
>
>                 Key: TRAFODION-34
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-34
>             Project: Apache Trafodion
>          Issue Type: New Feature
>          Components: dtm
>            Reporter: Oliver Bucaojit
>            Assignee: Oliver Bucaojit
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Transactional state is not persisted over a region rebalance or split.  This JIRA will cover the rebalance aspect separately from the split handling.  The implementation of the two features would be similar, with split handling being more complex.
> The transactional state on the server-side is held in-memory for a region in an endpoint coprocessor.  When a region is rebalanced, this state is lost when the region comes back online.  Some of the information needed to continue the transaction is the list of transaction states by ID, committed transactions by ID, pending transactions, etc.  
> One idea that I have been testing out is persisting this transactional information by serializing it as a Google protobuf and then writing this out to disk on region preClose().  On postOpen() of the region, this transactional state will be read and the information replayed to rebuild the lists and states.
> Another detail that I would like to add is to have the operation delay while transactions are pending or while transactional scanners are in use.  This would allow pending transactions to complete before the region is taken down.  For the scanner, I think this would be less complicated than having to rebuild the scanner state or track the last row that the user received from the scanner.
> In an earlier version, I tested a simple delay loop that waited for a period in which all active transactions were complete before continuing.  This caused problems for long-running tests, as there would be no quiet periods, so the region would not continue with the operation and would eventually run out of memory.  Having it delay only on pending transactions, rather than on all active transactions, will allow the delay loop to find a time to continue the split/balance operation.
> More design details are in the included blueprint link.  
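
As a rough illustration of the persistence idea in the description above (write the state out on preClose(), read it back on postOpen()), the sketch below round-trips two ID lists through a byte array. It is an assumption-laden stand-in: it uses plain java.io data streams instead of the protobuf serialization the description mentions, RegionTxnSnapshot is a hypothetical class, and it tracks only committed and pending transaction IDs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical snapshot of per-region transactional state; not the Trafodion format. */
final class RegionTxnSnapshot {
    final Set<Long> committed = new TreeSet<>();
    final Set<Long> pending = new TreeSet<>();

    /** Serializes the state; this is what a preClose() hook would write to disk. */
    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeIds(out, committed);
            writeIds(out, pending);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Rebuilds the state; this is what a postOpen() hook would replay. */
    static RegionTxnSnapshot fromBytes(byte[] bytes) {
        RegionTxnSnapshot snapshot = new RegionTxnSnapshot();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            readIds(in, snapshot.committed);
            readIds(in, snapshot.pending);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return snapshot;
    }

    // Each list is stored as a count followed by that many 8-byte IDs.
    private static void writeIds(DataOutputStream out, Set<Long> ids) throws IOException {
        out.writeInt(ids.size());
        for (long id : ids) {
            out.writeLong(id);
        }
    }

    private static void readIds(DataInputStream in, Set<Long> ids) throws IOException {
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            ids.add(in.readLong());
        }
    }
}
```

In a real coprocessor the byte array would be written to and read from a file associated with the region; where that file lives and how it is cleaned up is left open here.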

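The delay on pending transactions (and open transactional scanners) from the description could look something like the sketch below. PendingTxnGate and its parameters are hypothetical, and the upper bound on the wait is an assumption added here so that the loop cannot block a close or split indefinitely; the description itself does not specify one.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

/** Hypothetical gate that holds a region operation until pending work has drained. */
final class PendingTxnGate {
    /**
     * Polls pendingCount (pending transactions plus open transactional scanners)
     * until it reports zero. Returns true if it drained, false if maxWaitMillis
     * elapsed first.
     */
    static boolean awaitNoPending(IntSupplier pendingCount, long maxWaitMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (pendingCount.getAsInt() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // assumed bound reached; the caller decides what to do next
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

Because the gate counts only pending work rather than every active transaction, a steady stream of new transactions does not keep it closed, which is the behaviour the description is after.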


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)