Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/06/23 05:06:09 UTC

[GitHub] [ozone] bharatviswa504 edited a comment on pull request #2338: HDDS-5341. Container report processing is single threaded

bharatviswa504 edited a comment on pull request #2338:
URL: https://github.com/apache/ozone/pull/2338#issuecomment-866524893


   > I've got a couple of questions on this topic:
   > 
   > 1. Does it make sense to thread pool the Incremental Reports too? I know they are much smaller, but they are also much more frequent. I have not seen any evidence of whether they are backing up, but it's worth considering / checking if we can.
   
   Previously, we used to send a full container report with ICRs; that is fixed by HDDS-5111. We will be testing this PR together with HDDS-5111, using huge container reports from each DN. If we observe an issue with ICRs, we can add a thread pool for ICRs as well.
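
   To make the thread pool idea concrete, here is a minimal sketch of one way reports could be spread across workers. It is not the code from this PR: the class `ReportDispatcher` and its methods are hypothetical names, and routing by a hash of the datanode ID is an assumption, chosen so that reports from the same DN are still processed in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the implementation in this PR.
class ReportDispatcher {
  // One single-threaded executor per worker keeps per-datanode ordering.
  private final List<ExecutorService> workers = new ArrayList<>();
  private final AtomicInteger processed = new AtomicInteger();

  ReportDispatcher(int poolSize) {
    for (int i = 0; i < poolSize; i++) {
      workers.add(Executors.newSingleThreadExecutor());
    }
  }

  // Assumption: reports are routed by a hash of the datanode ID.
  int workerFor(String datanodeId) {
    return Math.floorMod(datanodeId.hashCode(), workers.size());
  }

  // Queue a report handler on the worker that owns this datanode.
  void submit(String datanodeId, Runnable handler) {
    workers.get(workerFor(datanodeId)).execute(() -> {
      handler.run();
      processed.incrementAndGet();
    });
  }

  // Stop accepting work, wait for queued reports, return how many completed.
  int shutdownAndCount() {
    for (ExecutorService worker : workers) {
      worker.shutdown();
    }
    try {
      for (ExecutorService worker : workers) {
        worker.awaitTermination(30, TimeUnit.SECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return processed.get();
  }
}
```

   With this shape, the same dispatcher could be reused for ICRs if they turn out to back up; only the handler passed to `submit` would differ.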
   
   > 2. I believe the DNs send an FCR every 60 - 90 seconds. Is that frequency really needed? Unknown bugs aside, is there any reason to send an FCR after startup and first registration? On HDFS, datanodes only send full block reports every 6 hours by default. If ICRs carry all the required information for SCM, perhaps we should increase the FCR interval to an hour or more?
   
   During startup/registration we need to send a full container report, because the ContainerSafeMode rule depends on it for validation. We also fire a container report event, in which we process the container reports and build the container replica set.
   
   But I completely agree with you that we can change the full container report interval to a larger value. I don't think we need a value as large as the HDFS one, though, because our container reports should be much smaller than HDFS block reports.
   
   From our scale testing:
   With 9 DNs, each filled with 500 PB of data, we have seen around 350K containers in the cluster. So a total of about 1 million replicas will be reported from all DNs. (Compared with HDFS, our container reports are far smaller.)
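
   As a rough check of the numbers above, the replica total can be reproduced with simple arithmetic. The replication factor of 3 is an assumption (it is not stated above), and `ReplicaEstimate` is a hypothetical helper used only for illustration.

```java
// Back-of-the-envelope check of the replica count quoted above.
// Assumption: every container has 3 replicas.
class ReplicaEstimate {
  // Total replicas reported across the cluster.
  static long totalReplicas(long containers, int replicationFactor) {
    return containers * replicationFactor;
  }

  // Average replicas per datanode, assuming an even spread (integer division).
  static long replicasPerDatanode(long containers, int replicationFactor, int datanodes) {
    return totalReplicas(containers, replicationFactor) / datanodes;
  }
}
```

   Under that assumption, `ReplicaEstimate.totalReplicas(350_000L, 3)` gives 1,050,000, which matches the "about 1 million" figure, and `ReplicaEstimate.replicasPerDatanode(350_000L, 3, 9)` gives 116,666 replicas per full container report if they are spread evenly.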
   
   > 3. Have we been able to capture any profiles (eg flame charts) of processing a large FCR to see if there is anything to be optimized in that flow?
   
   We have not debugged at this level yet; we shall look into this in future testing.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


