You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/06/04 16:40:12 UTC

[GitHub] [hadoop-ozone] esahekmat opened a new pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

esahekmat opened a new pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023


   ## What changes were proposed in this pull request?
   OmKeyLocationInfoGroup changed to store a Map between versions and list of OmKeyLocationInfos
   instead of storing all locationInfos in a list, by this change, the performance of getting the list of the latest version improved.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-3718
   
   ## How was this patch tested?
   ozone common module tests


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat commented on a change in pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat commented on a change in pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#discussion_r449776213



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyLocationInfoGroup.java
##########
@@ -30,12 +30,23 @@
  */
 public class OmKeyLocationInfoGroup {
   private final long version;
-  private final List<OmKeyLocationInfo> locationList;
+  private final Map<Long, List<OmKeyLocationInfo>> locationVersionList;

Review comment:
       Sure, I will change it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat edited a comment on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat edited a comment on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-643073937


   I agree the change can be postponed until the version support implementation, however storing different version in a heterogeneous List, and filter it for specific version led to poor design, for example, searching for specific version in the list, some use-cases of it had been done in OmKeyLocationInfoGroup itself and some of them had been done in other class (see getBlocksLatestVersionOnly in OmKeyLocationInfoGroup and addPreallocateBlocks in BlockOutputStreamEntryPool)
   This structure also affects some other efforts to improve the performance (see https://github.com/apache/hadoop-ozone/pull/1034)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-639125413


   I have tried to preserve the getProtobuf and getFromProtobuf methods and I have kept the KeyLocationList structure intact, does it enough to prevent incompatibility?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] arp7 commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
arp7 commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-639114681


   Thanks for the contribution @esahekmat. Do you know if this structure is passed on the wire or persisted to disk i.e. will the change be incompatible?
   
   cc @bharatviswa504 @melek


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-643004950


   I agree the structure pass over the wire does not change with this PR. However, given the version support is not fully added, we should have only limited(one) version here. Change the list to map will not bring in much performance improvement. Should we postpone  the change when adding version support? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao merged pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
xiaoyuyao merged pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat commented on a change in pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat commented on a change in pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#discussion_r449776133



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyLocationInfoGroup.java
##########
@@ -30,12 +30,23 @@
  */
 public class OmKeyLocationInfoGroup {
   private final long version;
-  private final List<OmKeyLocationInfo> locationList;
+  private final Map<Long, List<OmKeyLocationInfo>> locationVersionList;
 
   public OmKeyLocationInfoGroup(long version,
                                 List<OmKeyLocationInfo> locations) {
     this.version = version;
-    this.locationList = locations;
+    this.locationVersionList = locations.stream()
+        .collect(Collectors.groupingBy(OmKeyLocationInfo::getCreateVersion));
+    //prevent NPE
+    this.locationVersionList.putIfAbsent(version, new ArrayList<>());
+  }
+
+  public OmKeyLocationInfoGroup(long version,
+                                Map<Long, List<OmKeyLocationInfo>> locations) {
+    this.version = version;
+    this.locationVersionList = locations;

Review comment:
       It will cause a redundant instantiation of the map and provides no benefit because the map reassigned by the constructor parameter.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-656571927


   @xiaoyuyao Thanks for reviewing it again, I have answered your concerns, would you please take another look to the it


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-662523336


   +1, Thanks @esahekmat for the contribution. I will merge the PR shortly. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#discussion_r448011359



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyLocationInfoGroup.java
##########
@@ -30,12 +30,23 @@
  */
 public class OmKeyLocationInfoGroup {
   private final long version;
-  private final List<OmKeyLocationInfo> locationList;
+  private final Map<Long, List<OmKeyLocationInfo>> locationVersionList;

Review comment:
       NIT: should we rename to locationVersionMap to avoid confusion?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#discussion_r448010316



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyLocationInfoGroup.java
##########
@@ -30,12 +30,23 @@
  */
 public class OmKeyLocationInfoGroup {
   private final long version;
-  private final List<OmKeyLocationInfo> locationList;
+  private final Map<Long, List<OmKeyLocationInfo>> locationVersionList;
 
   public OmKeyLocationInfoGroup(long version,
                                 List<OmKeyLocationInfo> locations) {
     this.version = version;
-    this.locationList = locations;
+    this.locationVersionList = locations.stream()
+        .collect(Collectors.groupingBy(OmKeyLocationInfo::getCreateVersion));
+    //prevent NPE
+    this.locationVersionList.putIfAbsent(version, new ArrayList<>());
+  }
+
+  public OmKeyLocationInfoGroup(long version,
+                                Map<Long, List<OmKeyLocationInfo>> locations) {
+    this.version = version;
+    this.locationVersionList = locations;

Review comment:
       should we change the order of 47/49 so that there is always an ArrayList for the version and the locations is added to that arraylist.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] esahekmat commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
esahekmat commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-643073937


   I agree the change can be postponed until the version support implementation, however storing different version in a heterogeneous List, and filter it for specific version led to poor design, for example, searching for specific version in the list, some use-cases of it had been done in OmKeyLocationInfoGroup itself and some of them had been done in other class (see getBlocksLatestVersionOnly in OmKeyLocationInfoGroup and addPreallocateBlocks in BlockOutputStreamEntryPool)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1023: HDDS-3718: Improve OmKeyLocationInfoGroup internal data structure

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1023:
URL: https://github.com/apache/hadoop-ozone/pull/1023#issuecomment-643114995


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=h1) Report
   > Merging [#1023](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hadoop-ozone/commit/cee76a891b21ea50884ad8ba14461b31f60494ea&el=desc) will **decrease** coverage by `0.01%`.
   > The diff coverage is `65.15%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/graphs/tree.svg?width=650&height=150&src=pr&token=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1023      +/-   ##
   ============================================
   - Coverage     69.46%   69.44%   -0.02%     
   + Complexity     9121     9110      -11     
   ============================================
     Files           961      961              
     Lines         48151    48132      -19     
     Branches       4679     4675       -4     
   ============================================
   - Hits          33448    33426      -22     
   + Misses        12489    12487       -2     
   - Partials       2214     2219       +5     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...main/java/org/apache/hadoop/ozone/OzoneConsts.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvT3pvbmVDb25zdHMuamF2YQ==) | `84.21% <ø> (ø)` | `1.00 <0.00> (ø)` | |
   | [...apache/hadoop/hdds/utils/db/RocksDBCheckpoint.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvZnJhbWV3b3JrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy91dGlscy9kYi9Sb2Nrc0RCQ2hlY2twb2ludC5qYXZh) | `90.90% <ø> (+0.90%)` | `5.00 <0.00> (-3.00)` | :arrow_up: |
   | [.../java/org/apache/hadoop/ozone/om/OMConfigKeys.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL2NvbW1vbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL29tL09NQ29uZmlnS2V5cy5qYXZh) | `100.00% <ø> (ø)` | `1.00 <0.00> (ø)` | |
   | [.../apache/hadoop/ozone/om/OMDBCheckpointServlet.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9PTURCQ2hlY2twb2ludFNlcnZsZXQuamF2YQ==) | `66.26% <ø> (-4.27%)` | `8.00 <0.00> (-2.00)` | |
   | [...a/org/apache/hadoop/ozone/om/ha/OMNodeDetails.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9oYS9PTU5vZGVEZXRhaWxzLmphdmE=) | `86.66% <ø> (ø)` | `12.00 <0.00> (ø)` | |
   | [...p/ozone/om/ratis/utils/OzoneManagerRatisUtils.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9yYXRpcy91dGlscy9Pem9uZU1hbmFnZXJSYXRpc1V0aWxzLmphdmE=) | `67.44% <0.00%> (-19.13%)` | `39.00 <0.00> (ø)` | |
   | [.../java/org/apache/hadoop/ozone/om/OzoneManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9Pem9uZU1hbmFnZXIuamF2YQ==) | `64.22% <12.50%> (-0.37%)` | `185.00 <1.00> (-1.00)` | |
   | [...adoop/ozone/om/helpers/OmKeyLocationInfoGroup.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL2NvbW1vbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL29tL2hlbHBlcnMvT21LZXlMb2NhdGlvbkluZm9Hcm91cC5qYXZh) | `84.90% <88.23%> (+9.29%)` | `14.00 <10.00> (+2.00)` | |
   | [...adoop/ozone/om/ratis/OzoneManagerStateMachine.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9yYXRpcy9Pem9uZU1hbmFnZXJTdGF0ZU1hY2hpbmUuamF2YQ==) | `58.24% <90.47%> (+2.50%)` | `27.00 <4.00> (+1.00)` | |
   | [...op/ozone/client/io/BlockOutputStreamEntryPool.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL2NsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL2NsaWVudC9pby9CbG9ja091dHB1dFN0cmVhbUVudHJ5UG9vbC5qYXZh) | `71.61% <100.00%> (-0.19%)` | `33.00 <2.00> (-1.00)` | |
   | ... and [36 more](https://codecov.io/gh/apache/hadoop-ozone/pull/1023/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=footer). Last update [78255dd...686dcfb](https://codecov.io/gh/apache/hadoop-ozone/pull/1023?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org