Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/12/08 07:46:03 UTC

[GitHub] [hbase] ramkrish86 commented on pull request #2664: HBASE-24850 - CellComparator perf improvement

ramkrish86 commented on pull request #2664:
URL: https://github.com/apache/hbase/pull/2664#issuecomment-740445961


   In the latest commit, apart from adding the ContiguousCellComparator, we also found that bulk load performance was slower, in spite of the comparator itself being more than 15% faster overall.
   The reason is that PutSortReducer receives a given row together with all the cells for that row, and those cells are then written to the HFile. So effectively only the cells of a single row are added to the map at a time. Even in cases where a row has 300 cells, the optimization we expect from the ContiguousCellComparator changes does not kick in. That is because of the various branches we still have in the code, and because the number of cells is still too small for the optimization to take effect.
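
   As a rough illustration of that access pattern (not code from the patch), the sketch below sorts the cells of a single row in a `TreeSet`, the way a reducer handling one row at a time would. `SimpleCell`, `FLAT_COMPARATOR` and `sortRow` are hypothetical names, not HBase APIs, and the comparator is a simplified stand-in (row, then qualifier, then newest timestamp first), assuming Java 9+ for `Arrays.compareUnsigned`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

public class SingleRowSortSketch {
    // Hypothetical stand-in for a cell; NOT the HBase Cell or KeyValue type.
    static final class SimpleCell {
        final byte[] row;
        final byte[] qualifier;
        final long timestamp;

        SimpleCell(byte[] row, byte[] qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    // Simplified, low-branch comparator: row, then qualifier, then newest timestamp first.
    // This is an assumption made for illustration, not the KVComparator implementation.
    static final Comparator<SimpleCell> FLAT_COMPARATOR = (a, b) -> {
        int c = Arrays.compareUnsigned(a.row, b.row);
        if (c != 0) {
            return c;
        }
        c = Arrays.compareUnsigned(a.qualifier, b.qualifier);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Mimics a reducer call: every cell belongs to the same row, so the sorted
    // set only ever holds one row's worth of cells (for example 300).
    static TreeSet<SimpleCell> sortRow(byte[] row, int cellCount) {
        TreeSet<SimpleCell> sorted = new TreeSet<>(FLAT_COMPARATOR);
        for (int i = cellCount - 1; i >= 0; i--) {
            byte[] qualifier = String.format("q%04d", i).getBytes(StandardCharsets.UTF_8);
            sorted.add(new SimpleCell(row, qualifier, 1L));
        }
        return sorted;
    }

    public static void main(String[] args) {
        TreeSet<SimpleCell> sorted = sortRow("row-1".getBytes(StandardCharsets.UTF_8), 300);
        String first = new String(sorted.first().qualifier, StandardCharsets.UTF_8);
        System.out.println(sorted.size() + " cells, first qualifier = " + first);
    }
}
```

   With only a few hundred cells per call, each sort involves a small number of comparisons, so the fixed per-comparison cost (branching, dispatch) matters more than any gain that only shows up over many comparisons.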
   For those cases, if we bring back the KVComparator (currently deprecated; see the PutSortReducer changes in the patch) and use it specifically for these bulk-load cases, then we are 15% faster than the 1.3 branch. This is in line with what we are trying to do in https://issues.apache.org/jira/browse/HBASE-24754.
   I can open a discussion thread on dev@ with all the details so that others can chime in.
   @anoopsjohn , @saintstack - FYI.

