You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hawq.apache.org by liming01 <gi...@git.apache.org> on 2015/12/15 12:10:45 UTC

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

GitHub user liming01 opened a pull request:

    https://github.com/apache/incubator-hawq/pull/191

    HAWQ-255: change CHECKPOINT_START_LOCK to a fine grained scope so that checkpoint not blocked.

    In some scenarios the request order of this lock and CheckpointStartLock may cause deadlock, MirroredLock and LockRelationForResynchronize()  are useless in hawq now, and it may cause deadlock in AbortTransaction()/CommitTransaction()/smgrDoDeleteActions(), so we remove them.
    
    MirroredLock was removed at previous checkin. LockRelationForResynchronize() need to be removed here.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liming01/incubator-hawq mli/not_block_checkpoint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/191.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #191
    
----
commit 8c19074e3f917ce6c9cbe5451d7196f449ec752c
Author: Ming LI <ml...@pivotal.io>
Date:   2015-12-15T11:04:10Z

    HAWQ-255: change CHECKPOINT_START_LOCK to a fine grained scope so that checkpoint not blocked.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by linwen <gi...@git.apache.org>.
Github user linwen commented on the pull request:

    https://github.com/apache/incubator-hawq/pull/191#issuecomment-166184594
  
    looks good


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by wangzw <gi...@git.apache.org>.
Github user wangzw commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47878881
  
    --- Diff: src/backend/storage/smgr/smgr.c ---
    @@ -1327,28 +1327,6 @@ smgrDoDeleteActions(
     						palloc0((*listCount) * sizeof(PersistentFileSysObjStateChangeResult));
     
     	/*
    -	 * There are two situations where we get here. CommitTransaction()/AbortTransaction() or via
    -	 * AbortSubTransaction(). In the first case, we have already obtained the MirroredLock and
    -	 * CheckPointStartLock. In the second case, we have not obtained the locks, so we attempt
    -	 * to get them to make sure proper lock order is maintained.
    -	 *
    -	 * Normally, if a relation lock is needed, it is obtained before the MirroredLock and CheckPointStartLock,
    -	 * but we have not yet obtained an EXCLUSIVE LockRelationForResynchronize. This lock will be obtained in
    -	 * PersistentFileSysObj_EndXactDrop(). This is an exception to the normal lock ordering, which is done
    -	 * to reduce the time that the lock is held, thus allowing a larger window of time for filerep
    -	 * resynchronization to obtain the lock.
    -	 */
    -
    -	/*
    -	 * The logic will eventually obtain a CheckpointStartLock in PersistentRelation_Dropped(),
    -	 * but functions called from this function my obtain Exclusive locks before the
    -	 * CheckpointStartLock is obtained. This could cause a potential deadlock in the future.
    -	 * We need to take a CheckpointStartLock here to maintain proper lock ordering
    -	 * (i.e. MirrorLock -> CheckpointStartLock ).
    -	 */
    -	CHECKPOINT_START_LOCK;
    -
    --- End diff --
    
    Nice work #178 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by liming01 <gi...@git.apache.org>.
Github user liming01 closed the pull request at:

    https://github.com/apache/incubator-hawq/pull/191


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by liming01 <gi...@git.apache.org>.
Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47879838
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    +	/* In previous version, we ensured the recording of the [distributed-]commit record and the
    +	 * persistent post-commit work will be done either before or after a checkpoint.
    +	 *
    +	 * However the persistent table status will be synchronized with AOSeg_XXXX
    +	 * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +	 * We don't need to worry about inconsistent states between them. So no
    +	 * CHECKPOINT_START_LOCK any more.
    +	 */
    --- End diff --
    
    Now hawq doesn't report error when (4) occurs: I set break in gdb at xact.c:2427, and run below statements, when hang at the "COMMIT", then I run "hadoop dfsadmin -safemode enter" to set hdfs to save mode, then continue in gdb, the commit successfully finished with only warning. Similar problem for ABORT of transaction which includes 'CREATE TABLE'. 
    
    postgres=# begin transaction ISOLATION LEVEL SERIALIZABLE;BEGIN
    postgres=# drop table tableinfs2;                                                                                                                             DROP TABLE
    postgres=# commit;
    WARNING:  could not remove relation directory 24974/16387/24975: Input/output error
    CONTEXT:  Dropping file-system object -- Relation Directory: '24974/16387/24975'
    COMMIT
    postgres=# select * from tableinfs2;
    ERROR:  relation "tableinfs2" does not exist
    
    As for your question above, the recovery process will redo (4) again and similarly it report warning info if we failed to drop file and or fail to modify persistent table. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by liming01 <gi...@git.apache.org>.
Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47878574
  
    --- Diff: src/backend/storage/smgr/smgr.c ---
    @@ -1327,28 +1327,6 @@ smgrDoDeleteActions(
     						palloc0((*listCount) * sizeof(PersistentFileSysObjStateChangeResult));
     
     	/*
    -	 * There are two situations where we get here. CommitTransaction()/AbortTransaction() or via
    -	 * AbortSubTransaction(). In the first case, we have already obtained the MirroredLock and
    -	 * CheckPointStartLock. In the second case, we have not obtained the locks, so we attempt
    -	 * to get them to make sure proper lock order is maintained.
    -	 *
    -	 * Normally, if a relation lock is needed, it is obtained before the MirroredLock and CheckPointStartLock,
    -	 * but we have not yet obtained an EXCLUSIVE LockRelationForResynchronize. This lock will be obtained in
    -	 * PersistentFileSysObj_EndXactDrop(). This is an exception to the normal lock ordering, which is done
    -	 * to reduce the time that the lock is held, thus allowing a larger window of time for filerep
    -	 * resynchronization to obtain the lock.
    -	 */
    -
    -	/*
    -	 * The logic will eventually obtain a CheckpointStartLock in PersistentRelation_Dropped(),
    -	 * but functions called from this function my obtain Exclusive locks before the
    -	 * CheckpointStartLock is obtained. This could cause a potential deadlock in the future.
    -	 * We need to take a CheckpointStartLock here to maintain proper lock ordering
    -	 * (i.e. MirrorLock -> CheckpointStartLock ).
    -	 */
    -	CHECKPOINT_START_LOCK;
    -
    --- End diff --
    
    I have already removed MirrorLock which can interfere with CHECKPOINT_START_LOCK. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by liming01 <gi...@git.apache.org>.
Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47879953
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    --- End diff --
    
    We will add a test with FaultInjection here to cover this scenario, I don't think it would be a problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by wangzw <gi...@git.apache.org>.
Github user wangzw commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47877656
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    +	/* In previous version, we ensured the recording of the [distributed-]commit record and the
    +	 * persistent post-commit work will be done either before or after a checkpoint.
    +	 *
    +	 * However the persistent table status will be synchronized with AOSeg_XXXX
    +	 * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +	 * We don't need to worry about inconsistent states between them. So no
    +	 * CHECKPOINT_START_LOCK any more.
    +	 */
    --- End diff --
    
    Not true. Consider this case.
    
    1) flush commit record.
    2) start a checkpoint.
    3) checkpoint complete successfully.
    4) failed to drop file and or fail to modify persistent table.
    
    Since checkpoint was started (2) and finished (3) successfully after flush commit record (1), the commit record will be truncated during the checkpoint but failure (4) actually happened. I do not think recovery process can handle this case well.
     


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by wangzw <gi...@git.apache.org>.
Github user wangzw commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47880114
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    +	/* In previous version, we ensured the recording of the [distributed-]commit record and the
    +	 * persistent post-commit work will be done either before or after a checkpoint.
    +	 *
    +	 * However the persistent table status will be synchronized with AOSeg_XXXX
    +	 * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +	 * We don't need to worry about inconsistent states between them. So no
    +	 * CHECKPOINT_START_LOCK any more.
    +	 */
    --- End diff --
    
    If (4) failed, HAWQ should definitely NOT report error since the transaction has already committed (1).
    
    No redo will happen during recovery process since commit log has been truncated after checkpoint.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by liming01 <gi...@git.apache.org>.
Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47990910
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    +	/* In previous version, we ensured the recording of the [distributed-]commit record and the
    +	 * persistent post-commit work will be done either before or after a checkpoint.
    +	 *
    +	 * However the persistent table status will be synchronized with AOSeg_XXXX
    +	 * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +	 * We don't need to worry about inconsistent states between them. So no
    +	 * CHECKPOINT_START_LOCK any more.
    +	 */
    --- End diff --
    
    The status of current transaction will not be truncated at (4), the truncate related code is:
    CreateCheckPoint() --> CheckPointGuts() --> CheckPointMultiXact() -> TruncateMultiXact() -> SimpleLruTruncate()
    In this function, cutoffPage is based on the min value of OldestMemberMXactId and OldestVisibleMXactId for all backends' proc. 
    
    And OldestMemberMXactId and OldestVisibleMXactId for current backend's proc are reset in
    AtEOXact_MultiXact(). In AbortTransaction() and CommitTransaction(), this function is called after AtEOXact_smgr(). 
    
    So in recovery process, we can know that the current transaction has already committed, and all persistent table and hdfs files should be redo. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by wangzw <gi...@git.apache.org>.
Github user wangzw commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47877784
  
    --- Diff: src/backend/storage/smgr/smgr.c ---
    @@ -1327,28 +1327,6 @@ smgrDoDeleteActions(
     						palloc0((*listCount) * sizeof(PersistentFileSysObjStateChangeResult));
     
     	/*
    -	 * There are two situations where we get here. CommitTransaction()/AbortTransaction() or via
    -	 * AbortSubTransaction(). In the first case, we have already obtained the MirroredLock and
    -	 * CheckPointStartLock. In the second case, we have not obtained the locks, so we attempt
    -	 * to get them to make sure proper lock order is maintained.
    -	 *
    -	 * Normally, if a relation lock is needed, it is obtained before the MirroredLock and CheckPointStartLock,
    -	 * but we have not yet obtained an EXCLUSIVE LockRelationForResynchronize. This lock will be obtained in
    -	 * PersistentFileSysObj_EndXactDrop(). This is an exception to the normal lock ordering, which is done
    -	 * to reduce the time that the lock is held, thus allowing a larger window of time for filerep
    -	 * resynchronization to obtain the lock.
    -	 */
    -
    -	/*
    -	 * The logic will eventually obtain a CheckpointStartLock in PersistentRelation_Dropped(),
    -	 * but functions called from this function my obtain Exclusive locks before the
    -	 * CheckpointStartLock is obtained. This could cause a potential deadlock in the future.
    -	 * We need to take a CheckpointStartLock here to maintain proper lock ordering
    -	 * (i.e. MirrorLock -> CheckpointStartLock ).
    -	 */
    -	CHECKPOINT_START_LOCK;
    -
    --- End diff --
    
    Although we do not have a mirror any more, but we still have MirrorLock (should be removed). If we remove this lock, can we still keep the lock ordering and prevent from dead locking?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-255: change CHECKPOINT_START_LOC...

Posted by wangzw <gi...@git.apache.org>.
Github user wangzw commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47877332
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
     	willHaveObjectsFromSmgr =
     			PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -	if (willHaveObjectsFromSmgr)
    -	{
    -		/*
    -		 * We need to ensure the recording of the [distributed-]commit record and the
    -		 * persistent post-commit work will be done either before or after a checkpoint.
    -		 */
    -		CHECKPOINT_START_LOCK;
    -	}
    --- End diff --
    
    I have some concern about removing this lock. As the comment indicated, what will happen in such case?
    
    1) flush commit record.
    2) start a checkpoint.
    3) persistent post-commit work (change persistent table)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---