You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/05/18 15:24:57 UTC

[GitHub] [ozone] symious opened a new pull request #2261: HDDS-5243. Return latest key location for clients

symious opened a new pull request #2261:
URL: https://github.com/apache/ozone/pull/2261


   ## What changes were proposed in this pull request?
   
   Currently, OM queries all history versions of locations for a key and replied to clients, which may exceed the IPC response size and cost OM many resources to process history key locations.
   This patch added a config that can be specified to let OM return only the latest key location to clients.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5243
   
   ## How was this patch tested?
   
   Unit test
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 edited a comment on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-844065641


   > Some stats about this improvement, consider a key is recommitted `n` times with the same content.
   > 
   > versions	size & process times before improvement	size & process times after improvement
   > 1	1	1
   > 5	15	5
   > 10	55	10
   > 100	5050	100
   > 1000	500500	1000
   > The content of the keyLocationList should be as follows:
   > list[0]: version0
   > list[1]: version0, version1
   > list[2]: version0, version1, version2
   > list[3]: version0, version1, version2, version3
   > ...
   > 
   > For each version, all history versions are now stored in an OmKeyLocationInfoGroup.
   
   I think we need to revisit this it was on my bucket list never got to it. Why each version maintains cumulative of all.
   And also the 2nd column mentioned size & process time but only I see one value, what is the processing time. Just curious how much performance improvement we have, certainly size transmitted through network is reduced.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-850352929


   > +1 I agree. It's an incompatible change on wire level but should be fine as (AFAIK) it's an unusued feature.
   > 
   > We can mention this change in the release notes.
   
   Good idea, I will add this to Release Note Section in the Jira.
   
   I shall wait till Monday eod, if no more comments with this change from others will proceed with merge.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r639915497



##########
File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java
##########
@@ -449,6 +449,11 @@
   public static final long OZONE_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT =
       TimeUnit.DAYS.toMillis(10); // 10 days
 
+  public static final String OZONE_CLIENT_KEY_FULL_LOCATION_VERSION =
+      "ozone.client.key.full.location.version";
+  public static final boolean OZONE_CLIENT_KEY_FULL_LOCATION_VERSION_DEFAULT =

Review comment:
       As it is confusing when false, return full location, when true slim location.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-844237231


   @bharatviswa504 Added the server-side check for `listStatus`, `lookupKey`, `getFileStatus`, and `lookupFile`. For ListKeys, the config is set to false by default. Please help to review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-843694444


   @elek @bshashikant Could you help to review this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-856642933


   @bharatviswa504 @xiaoyuyao Thanks for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 merged pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 merged pull request #2261:
URL: https://github.com/apache/ozone/pull/2261


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r639908056



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2370,4 +2387,22 @@ void sortDatanodes(String clientMachine, OmKeyInfo... keyInfos) {
     }
     return nodeSet;
   }
+
+  private void slimLocationVersion(OmKeyInfo... keyInfos) {
+    if (keyInfos != null) {
+      for (OmKeyInfo keyInfo : keyInfos) {
+        OmKeyLocationInfoGroup key = keyInfo.getLatestVersionLocations();
+        if (key == null) {
+          LOG.warn("No location version for key {}", keyInfo);
+          continue;
+        }
+        int keyLocationVersionLength = keyInfo.getKeyLocationVersions().size();
+        if (keyLocationVersionLength <= 1) {
+          continue;
+        }
+        keyInfo.setKeyLocationVersions(keyInfo.getKeyLocationVersions()

Review comment:
       >I think we can raise another ticket to check the design?
   
   Makes sense to me. Let us raise a new Jira and discuss this.
   
   >For subList, since setKeyLocationVersions accepts a List, seems we can't return the last value directly.
   Got it.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635205133



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       Yes, but we need similar check-in lookup file, getFileStatus, listStatus etc., to use the flag and return based on it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849381535


   > @bharatviswa504 I was thinking we should let ozone use the latest version by default. Since full version seems not good in all aspects: CPU, GC, network. I think when version feature is released, we can decide to set this config to true?
   
   It makes sense to me. But the behavior we are changing in client, as any way in RpcClient we use the latest version always we should be good.
   
   cc @bshashikant @arp7 @hanishakoneru for their comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xiaoyuyao commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r643371210



##########
File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
##########
@@ -2855,4 +2854,11 @@
     </description>
   </property>
 
+  <property>
+    <name>ozone.client.key.latest.location.version</name>

Review comment:
       NIT: should we rename it to ozone.client.key.latest.version.location?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849374817


   @symious We don't need to change the config key.
   If we would have changed the default to true, we would be good. So that we don't change the default behavior of clients after the upgrade.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-855820296


   @symious I am +1, waiting for @xiaoyuyao comment before I proceed, as he has some comments on config naming.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 edited a comment on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-855820296


   @symious I am +1, waiting for @xiaoyuyao comments/approval before I proceed with merge, as he has some comments on config naming.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-843308499


   Some stats about this improvement, consider a key is recommitted `n` times with the same content.
   versions|size & process times before improvement|size & process times after improvement
   --- | --- | ---
   1|1|1
   5|15|5
   10|55|10
   100|5050|100
   1000|500500|1000
   
   The content of the keyLocationList should be as follows:
   list[0]: version0
   list[1]: version0, version1
   list[2]: version0, version1, version2
   list[3]: version0, version1, version2, version3
   ...
   
   For each version, all history versions are now stored in an OmKeyLocationInfoGroup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635205133



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       Yes, but we need similar check-in lookup file, getFileStatus, listStatus etc.,




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-855752471


   @bharatviswa504 @xiaoyuyao Any update on this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635196241



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
##########
@@ -1148,6 +1155,7 @@ public OzoneInputStream readFile(String volumeName, String bucketName,
         .setBucketName(bucketName)
         .setKeyName(keyName)
         .setSortDatanodesInPipeline(topologyAwareReadEnabled)
+        .setFullLocationVersion(getFullLocationVersion)

Review comment:
       Do you think we need to use this in listStatus also?

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       This flag is set in lookupFile, getFileStatus, readKey/readFile in RpcClient but on Server, I don't see any use of that.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635206876



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       Got it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-844079077


   @bshashikant @bharatviswa504 Thanks for the review, for processing improvement, I think it's `KeyManagerImpl$refresh(OmKeyInfo key)` and `KeyManagerImpl$addBlockToken4Read(OmKeyInfo value)`, since these two functions will process all locationVersions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635200745



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       Since OM will always get the full location from rocksdb, the flag here is for OM to decide whether to send the full version back to client.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-856640341


   Thank You @symious for the contribution and @xiaoyuyao for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849379162


   @bharatviswa504 I was thinking we should let ozone use the latest version by default. Since full version seems not good in all aspects: CPU, GC, network. I think when version feature is released, we can decide to set this config to true?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-847776458


   Can you rebase, I have few minor comments/questions


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xiaoyuyao commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-856245375


   LGTM, +1. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] elek commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849463091


   +1 I agree. It's an incompatible change on wire level but should be fine as (AFAIK) it's an unusued feature.
   
   We can mention this change in the release notes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r643607230



##########
File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
##########
@@ -2855,4 +2854,11 @@
     </description>
   </property>
 
+  <property>
+    <name>ozone.client.key.latest.location.version</name>

Review comment:
       @xiaoyuyao Thanks for the review. Updated the config key.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 edited a comment on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849381535


   > @bharatviswa504 I was thinking we should let ozone use the latest version by default. Since full version seems not good in all aspects: CPU, GC, network. I think when version feature is released, we can decide to set this config to true?
   
   It makes sense to me. But the behavior we are changing in client, as any way in RpcClient we use the latest version always we should be good. So technically we are not changing the behavior, as anyway we always use only the latest version even though the server has passed all versions.
   
   cc @bshashikant @arp7 @hanishakoneru for their comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xiaoyuyao commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r643371370



##########
File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
##########
@@ -2855,4 +2854,11 @@
     </description>
   </property>
 
+  <property>
+    <name>ozone.client.key.latest.location.version</name>
+    <tag>OZONE, CLIENT</tag>
+    <value>true</value>
+    <description>Ozone client get the latest location version.

Review comment:
       NIT: get->gets




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r635205133



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -679,6 +679,10 @@ public OmKeyInfo lookupKey(OmKeyArgs args, String clientAddress)
       throw new OMException("Key not found", KEY_NOT_FOUND);
     }
 
+    if (!args.getFullLocationVersion()) {

Review comment:
       Yes, but we need similar check-in lookup file, getFileStatus etc.,




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r638699948



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2370,4 +2387,22 @@ void sortDatanodes(String clientMachine, OmKeyInfo... keyInfos) {
     }
     return nodeSet;
   }
+
+  private void slimLocationVersion(OmKeyInfo... keyInfos) {
+    if (keyInfos != null) {
+      for (OmKeyInfo keyInfo : keyInfos) {
+        OmKeyLocationInfoGroup key = keyInfo.getLatestVersionLocations();
+        if (key == null) {
+          LOG.warn("No location version for key {}", keyInfo);
+          continue;
+        }
+        int keyLocationVersionLength = keyInfo.getKeyLocationVersions().size();
+        if (keyLocationVersionLength <= 1) {
+          continue;
+        }
+        keyInfo.setKeyLocationVersions(keyInfo.getKeyLocationVersions()

Review comment:
       Thanks for the review. 
   I think it's caused by the definition of `keyLocationVersions: List<OmKeyLocationInfoGroup>`, the `slimLocationVersion` here is just trying to get the last value of `keyLocationVersions`, which is an `OmKeyLocationInfoGroup`. As for the problem that all versions are stored in `OmKeyLocationInfoGroup.locationVersionMap`, I think we can raise another ticket to check the design?
   
   For subList, since `setKeyLocationVersions` accepts a List, seems we can't return the last value directly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r638680061



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2370,4 +2387,22 @@ void sortDatanodes(String clientMachine, OmKeyInfo... keyInfos) {
     }
     return nodeSet;
   }
+
+  private void slimLocationVersion(OmKeyInfo... keyInfos) {
+    if (keyInfos != null) {
+      for (OmKeyInfo keyInfo : keyInfos) {
+        OmKeyLocationInfoGroup key = keyInfo.getLatestVersionLocations();
+        if (key == null) {
+          LOG.warn("No location version for key {}", keyInfo);
+          continue;
+        }
+        int keyLocationVersionLength = keyInfo.getKeyLocationVersions().size();
+        if (keyLocationVersionLength <= 1) {
+          continue;
+        }
+        keyInfo.setKeyLocationVersions(keyInfo.getKeyLocationVersions()

Review comment:
       Not understood this code.
   
   Here We should return only latest blocks for the last version.
   
   Here, we are returning latestVersion which has blocks of version 0....n. Here should we return only the latest version blocks?
   

##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java
##########
@@ -83,8 +82,6 @@
     // complete and prove correctly functioning
     long currentVersion = -1;
     for (OmKeyLocationInfoGroup version : versions) {

Review comment:
       After removing this, I don't think we need code lines from L83-L86

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2370,4 +2387,22 @@ void sortDatanodes(String clientMachine, OmKeyInfo... keyInfos) {
     }
     return nodeSet;
   }
+
+  private void slimLocationVersion(OmKeyInfo... keyInfos) {
+    if (keyInfos != null) {
+      for (OmKeyInfo keyInfo : keyInfos) {
+        OmKeyLocationInfoGroup key = keyInfo.getLatestVersionLocations();
+        if (key == null) {
+          LOG.warn("No location version for key {}", keyInfo);
+          continue;
+        }
+        int keyLocationVersionLength = keyInfo.getKeyLocationVersions().size();
+        if (keyLocationVersionLength <= 1) {
+          continue;
+        }
+        keyInfo.setKeyLocationVersions(keyInfo.getKeyLocationVersions()

Review comment:
       And also if we want to return last Version of OmKeyLocationinfoGroup, can we return as below? instead of subList
   
   `keyInfo.setKeyLocationVersions(keyInfo.getKeyLocationVersions().get(keyLocationVersionLength-1)`
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 edited a comment on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-849374817


   @symious We don't need to change the config key.
   If we would have changed the default to true, we would be good and in code if true return full location, else latest location. So that we don't change the default behavior of clients after the upgrade.
   
   But as anyway you have changed I think we should default to false here, to keep the old behavior.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-844065641


   > Some stats about this improvement, consider a key is recommitted `n` times with the same content.
   > 
   > versions	size & process times before improvement	size & process times after improvement
   > 1	1	1
   > 5	15	5
   > 10	55	10
   > 100	5050	100
   > 1000	500500	1000
   > The content of the keyLocationList should be as follows:
   > list[0]: version0
   > list[1]: version0, version1
   > list[2]: version0, version1, version2
   > list[3]: version0, version1, version2, version3
   > ...
   > 
   > For each version, all history versions are now stored in an OmKeyLocationInfoGroup.
   
   I think we need to revisit this it was on my bucket list never got to it. Why each version maintains cumulative of all.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-843766387


   @bharatviswa504 , can you please look into this one? I am ok with the change of adding a config to enable/disable fulll versions keyInfo reporting from OM.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-855834395


   @bharatviswa504 Thanks for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 edited a comment on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-855820296


   @symious I am +1, waiting for @xiaoyuyao comment before I proceed with merge, as he has some comments on config naming.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#issuecomment-844082264


   @bharatviswa504 For the cultimative version, I think it's the design in `OmKeyLocationInfoGroup`, since the new value is built on the last value in the map. 
   For me, I think it's not good, I didn't include the change in this patch because I think I may need more time to think about the original idea of the design of `OmKeyLocationInfoGroup`, in case I missed something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r639914650



##########
File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java
##########
@@ -449,6 +449,11 @@
   public static final long OZONE_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT =
       TimeUnit.DAYS.toMillis(10); // 10 days
 
+  public static final String OZONE_CLIENT_KEY_FULL_LOCATION_VERSION =
+      "ozone.client.key.full.location.version";
+  public static final boolean OZONE_CLIENT_KEY_FULL_LOCATION_VERSION_DEFAULT =

Review comment:
       Do we need to make this default true, as by default we return the full location list(Even before this fix, as the key name says full location version) and false if we return only the latest version?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] symious commented on a change in pull request #2261: HDDS-5243. Return latest key location for clients

Posted by GitBox <gi...@apache.org>.
symious commented on a change in pull request #2261:
URL: https://github.com/apache/ozone/pull/2261#discussion_r643607119



##########
File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
##########
@@ -2855,4 +2854,11 @@
     </description>
   </property>
 
+  <property>
+    <name>ozone.client.key.latest.location.version</name>
+    <tag>OZONE, CLIENT</tag>
+    <value>true</value>
+    <description>Ozone client get the latest location version.

Review comment:
       Updated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org