You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/06/05 11:09:53 UTC

[GitHub] [hadoop-ozone] runzhiwang opened a new pull request #1028: HDDS-3735. Improve the performance of SCM with 0.78% by decrease the times of lock and unlock

runzhiwang opened a new pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028


   
   ## What changes were proposed in this pull request?
   
   I start a ozone cluster with 1000 datanodes, and run two weeks with heavy workload, and perf scm.
   lock and unlock in NodeStateMap.getNodeInfo cost 0.78% cpu because of so many datanodes.
   
   we can getNodeInfo by batch to decrease the times of lock and unlock 
   
   ![](https://issues.apache.org/jira/secure/attachment/13004914/13004914_screenshot-1.png)
   
   
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3735
   
   ## How was this patch tested?
   
   Existed UT and IT


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] elek commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-651659061


   It's not just about the `updateState` method, but (for example) the `removeContainer` method.
   
   Without write lock the container might be removed from `ownerMap` and not from the `typeMap`, which can cause inconsistency.
   
   (I think, but fix me if I am wrong)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang edited a comment on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-647830265


   @elek Thanks for review. You are right, if the write code like following, maybe after  `currentInfo.setState(newState)` and before `currentInfo.setOwner(owner)`, we read the currentInfo, so we read a half write ContainerInfo.  Actually, current code did not update two filed of ContainerInfo at the same time, so I did not consider this case.
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   currentInfo.setState(newState);
   currentInfo.setOwner(owner);
   lock.writeLock().unlock();
   ```
   
   I think we can avoid this by the following code. Though write become more complicated because it needs to new an object, but read is pretty more than write, so it's worth. What do you think ?
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   ContainerInfo newInfo = new ContainerInfo(currentInfo);
   newInfo.setState(newState);
   newInfo.setOwner(owner);
   containerMap.put(containerID, newInfo);
   lock.writeLock().unlock();
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-651703240


   > Without write lock the container might be removed from ownerMap and not from the typeMap, which can cause inconsistency.
   
   I think this case can not be avoided by only add read lock. Unless we read containerMap, ownerMap, typeMap in the single method with a read lock, Otherwise for example 1. we read  containerMap, 2. we remove container, 3. we read ownerMap. It will also cause inconsistency, because container exists in step1 but not exists in step3.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-647830265


   @elek Thanks for review. You are right, if the write code like following, maybe after  `currentInfo.setState(newState)` and before `currentInfo.setOwner(owner)`, we read the currentInfo, so we read a half write ContainerInfo. 
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   currentInfo.setState(newState);
   currentInfo.setOwner(owner);
   lock.writeLock().unlock();
   ```
   
   I think we can avoid this by the following code. Though write become more complicated because it needs to new an object, but read is pretty more than write, so it's worth. What do you think ?
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   ContainerInfo newInfo = new ContainerInfo(currentInfo);
   newInfo.setState(newState);
   newInfo.setOwner(owner);
   containerMap.put(containerID, tmpInfo);
   lock.writeLock().unlock();
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] elek commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-659297555


   > @elek I think this case can not be avoided by only add read lock
   
   Agree. It seems to be necessary. But it means that we can't remove the lock in the ContainerStateMap in this patch.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang closed pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang edited a comment on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-651703240


   > Without write lock the container might be removed from ownerMap and not from the typeMap, which can cause inconsistency.
   
   @elek I think this case can not be avoided by only add read lock. Unless we read containerMap, ownerMap, typeMap in the single method with a read lock, Otherwise for example 1. we read  containerMap, 2. we remove container, 3. we read ownerMap. It will also cause inconsistency, because container exists in step1 but not exists in step3.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-641678745


   @elek Could you help review this patch ? Thank you very much.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang edited a comment on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-647830265


   @elek Thanks for review. You are right, if the write code like following, maybe after  `currentInfo.setState(newState)` and before `currentInfo.setOwner(owner)`, we read the currentInfo, so we read a half write ContainerInfo. 
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   currentInfo.setState(newState);
   currentInfo.setOwner(owner);
   lock.writeLock().unlock();
   ```
   
   I think we can avoid this by the following code. Though write become more complicated because it needs to new an object, but read is pretty more than write, so it's worth. What do you think ?
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   ContainerInfo newInfo = new ContainerInfo(currentInfo);
   newInfo.setState(newState);
   newInfo.setOwner(owner);
   containerMap.put(containerID, newInfo);
   lock.writeLock().unlock();
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] elek commented on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-647466786


   Actually I am not sure about the other part.
   
   Using read / write locks in the `ContainerStateMap` provides a consistent view. Read calls can see any containers only if all steps from the write method are done (all the indexes are updated).
   
   The patch seems to introduce a consistency problem. If you don't have a read lock the write may not be finished when you read. The  `containerMap` itself is ok, but if you do any operation with any other indexes from `ContainerStateMap` you might see a different view.
   
   Did I miss something? @runzhiwang 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] runzhiwang edited a comment on pull request #1028: HDDS-3735. Improve SCM performance with 3.7% by remove unnecessary lock and unlock

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #1028:
URL: https://github.com/apache/hadoop-ozone/pull/1028#issuecomment-647830265


   @elek Thanks for review. You are right, if the write code like following, maybe after  `currentInfo.setState(newState)` and before `currentInfo.setOwner(owner)`, we read the currentInfo, so we read a half write ContainerInfo.  Actually, current code did not update two filed of ContainerInfo at the same time, so I did not consider this case.
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   currentInfo.setState(newState);
   currentInfo.setOwner(owner);
   lock.writeLock().unlock();
   ```
   
   I think we can avoid this by the following code. Though write become more complicated because it needs to new an object, but read is pretty more than write, so it's worth. What do you think ?
   ```
   lock.writeLock().lock();
   final ContainerInfo currentInfo = containerMap.get(containerID);
   ContainerInfo newInfo = new ContainerInfo(currentInfo);
   newInfo.setState(newState);
   newInfo.setOwner(owner);
   containerMap.put(containerID, newInfo);
   lock.writeLock().unlock();
   ```
   
   
   And one more discussion, `getContainerInfo`  return the object itself, so we can change the field of ContainerInfo out of  ContainerStateMap, without the protection of write lock, it's also unsafe.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org