
Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/09/26 17:54:20 UTC

[jira] [Issue Comment Deleted] (PHOENIX-3326) Restoring SYSTEM.CATALOG from snapshot causes clients to run into UpgradeInProgressException

     [ https://issues.apache.org/jira/browse/PHOENIX-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-3326:
----------------------------------
    Comment: was deleted

(was: [~samarthjain] - for 4.9.0, can we create a couple of new MetaDataEndPoint methods like obtainMutex() and releaseMutex() instead of creating a new HBase table? That way we have a bit more freedom on how we implement this.

Couple of ideas for implementation:

#. Take a lock on the SYSTEM.CATALOG table. The lock would be released when the upgrade is complete. It looks like these locks automatically expire (by default in 10 mins) and use ZooKeeper underneath. We could also have our own timer task that releases the lock if need be. Other clients that attempt to get the lock would fail. Need to confirm that the snapshot can be taken with the lock in place.

#. Use a RowLock with a timer task that expires it. This won't survive a region close and open, but maybe that's a corner case we're ok with? The obtainMutex() would just get a row lock:
{code}
RowLock rowLock = region.getRowLock(key, true);
{code}
And we can have a timer task release the lock (in case the client dies), and releaseMutex() would also release the row lock. 


 )
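A minimal sketch of the second idea (a mutex that a timer task releases if the client dies), assuming nothing about the real endpoint code. ExpiringMutex, clientId and ttlMillis are hypothetical names used only for illustration; the method names obtainMutex() and releaseMutex() come from the comment above. An in-memory owner field stands in for the region RowLock, so no HBase API is used here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory stand-in for the proposed endpoint methods.
// A real implementation would hold a region RowLock where this class
// keeps the 'owner' field.
class ExpiringMutex implements AutoCloseable {
    // Daemon timer thread so a forgotten mutex never keeps the JVM alive.
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "mutex-expiry");
            t.setDaemon(true);
            return t;
        });
    private final long ttlMillis;
    private String owner;     // null while the mutex is free
    private long generation;  // incremented on every successful acquire

    ExpiringMutex(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns true if the caller now holds the mutex, false if it is taken.
    synchronized boolean obtainMutex(String clientId) {
        if (owner != null) {
            return false;
        }
        owner = clientId;
        final long gen = ++generation;
        // Timer task that frees the mutex in case the client dies.
        timer.schedule(() -> expire(gen), ttlMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    // Only the current owner may release; returns false otherwise.
    synchronized boolean releaseMutex(String clientId) {
        if (!clientId.equals(owner)) {
            return false;
        }
        owner = null;
        return true;
    }

    // Frees the mutex only if it is still held by the acquisition that
    // scheduled this task; timers left over from earlier holders are no-ops.
    private synchronized void expire(long gen) {
        if (owner != null && generation == gen) {
            owner = null;
        }
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

Like the RowLock variant, this state lives in a single process and would not survive a region close and open.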

> Restoring SYSTEM.CATALOG from snapshot causes clients to run into UpgradeInProgressException
> --------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3326
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3326
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.1
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>             Fix For: 4.9.0, 4.8.2
>
>         Attachments: PHOENIX-3326_4.8-HBase-0.98.patch, PHOENIX-3326_4.8-HBase-0.98_v2.patch, PHOENIX-3326_wip.patch
>
>
> We create a snapshot of the SYSTEM.CATALOG table only after the client is able to successfully acquire a distributed mutex of sorts. This means the snapshot also ends up containing the row that serves as the mutex. Now when restoring the table from snapshot, this row is still present, which causes clients to throw UpgradeInProgressException.
> I can think of a couple of ways to fix this:
> 1) Do the checkAndPut for the UPGRADE_MUTEX after creating the snapshot. I am not too sure, though, how HBase handles concurrent snapshot requests. Do clients get an exception? Also, we could potentially end up creating more snapshots than we really need to.
> 2) Do the checkAndPut for the UPGRADE_MUTEX in a different table (possibly SYSTEM.SEQUENCE). This way the restored snapshot won't have the row. We would need to delete the row from SYSTEM.SEQUENCE after the upgrade is done (successfully or unsuccessfully).
> [~jamestaylor] - WDYT? 
> FYI, [~lhofhansl] - this is probably a blocker for 4.8.1
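
To illustrate the problem and fix 2) from the description, here is a rough in-memory sketch. Plain maps stand in for HBase tables, putIfAbsent stands in for the checkAndPut, and a map copy stands in for a snapshot; SnapshotMutexDemo and its methods are hypothetical names, and none of this is Phoenix or HBase code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical simulation: each Map plays the role of one HBase table.
class SnapshotMutexDemo {
    static final String UPGRADE_MUTEX = "UPGRADE_MUTEX";

    // Stand-in for a checkAndPut that succeeds only if the row is absent.
    static boolean checkAndPut(Map<String, String> table, String row, String value) {
        return table.putIfAbsent(row, value) == null;
    }

    // Stand-in for a table snapshot: a point-in-time copy of all rows.
    static Map<String, String> snapshot(Map<String, String> table) {
        return new ConcurrentHashMap<>(table);
    }

    // Reported behavior: the mutex row lives in the table being snapshotted,
    // so the snapshot (and any restore from it) still contains the row.
    static boolean snapshotHasMutexWhenInCatalog() {
        Map<String, String> catalog = new ConcurrentHashMap<>();
        catalog.put("SOME_TABLE", "metadata");
        checkAndPut(catalog, UPGRADE_MUTEX, "client-1");
        Map<String, String> snap = snapshot(catalog);
        return snap.containsKey(UPGRADE_MUTEX);
    }

    // Fix 2): the mutex row lives in a different table, so the snapshot of
    // the catalog never sees it. The row is deleted whether or not the
    // upgrade succeeds.
    static boolean snapshotHasMutexWhenInSeparateTable() {
        Map<String, String> catalog = new ConcurrentHashMap<>();
        Map<String, String> mutexTable = new ConcurrentHashMap<>();
        catalog.put("SOME_TABLE", "metadata");
        if (!checkAndPut(mutexTable, UPGRADE_MUTEX, "client-1")) {
            throw new IllegalStateException("upgrade already in progress");
        }
        try {
            Map<String, String> snap = snapshot(catalog);
            // ... the upgrade itself would run here ...
            return snap.containsKey(UPGRADE_MUTEX);
        } finally {
            mutexTable.remove(UPGRADE_MUTEX);
        }
    }
}
```

In this model the first method returns true and the second returns false, which matches the reasoning in 2).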


