Posted to notifications@accumulo.apache.org by "keith-turner (via GitHub)" <gi...@apache.org> on 2023/03/05 14:03:59 UTC

[GitHub] [accumulo] keith-turner commented on a diff in pull request #3221: Do not calculate split point in Tablet.needsSplit()

keith-turner commented on code in PR #3221:
URL: https://github.com/apache/accumulo/pull/3221#discussion_r1125672532


##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -1814,7 +1824,7 @@ List<FileRef> findChopFiles(KeyExtent extent, Map<FileRef,Pair<Key,Key>> firstAn
   public synchronized boolean needsSplit() {
     if (isClosing() || isClosed())
       return false;
-    return findSplitRow(getDatafileManager().getFiles()) != null;
+    return isSplitPossible();

Review Comment:
   > Especially for bulk loads, this can cause bulk load failures that would have otherwise been successes due to timeouts. 
   
   Yikes, that is really bad. In the bulk import case, a new file is added to the tablet immediately before needsSplit is called, so the split computation is always redone. We definitely want to keep the split computation from interfering with the bulk import RPC. Pulling the computation out of any sync block helps, because the bulk import thread will no longer get stuck waiting on another thread that is doing the computation. However, we still need to do something to move the computation off the bulk import thread itself.
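
   To make the two points above concrete, here is a rough sketch of one possible shape. It is not what this PR does and does not use Accumulo's real classes: `TabletSplitSketch`, `addFile`, `splitRowFinder`, and the version counters are all hypothetical names made up for illustration. The expensive computation runs outside the lock and on a separate executor, and needsSplit only reads a cached answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical stand-in for a tablet; none of these names are Accumulo's actual API.
class TabletSplitSketch {
  // All fields below are guarded by the monitor of "this".
  private final List<String> files = new ArrayList<>();
  private long fileSetVersion = 0;       // bumped whenever the file set changes
  private long computedVersion = -1;     // file set version the cached result belongs to
  private Optional<String> cachedSplitRow = Optional.empty();
  private boolean computationQueued = false;

  private final Executor splitExecutor;  // assumed: a pool separate from the RPC threads
  private final Function<List<String>, Optional<String>> splitRowFinder; // the expensive part

  TabletSplitSketch(Executor splitExecutor,
      Function<List<String>, Optional<String>> splitRowFinder) {
    this.splitExecutor = splitExecutor;
    this.splitRowFinder = splitRowFinder;
  }

  // Called on the bulk import thread: only cheap state changes happen under the lock.
  synchronized void addFile(String file) {
    files.add(file);
    fileSetVersion++;
  }

  // Never runs the expensive computation itself; at most it queues one.
  boolean needsSplit() {
    synchronized (this) {
      if (computedVersion == fileSetVersion) {
        return cachedSplitRow.isPresent();
      }
      if (computationQueued) {
        return false; // answer not known yet, a computation is already pending
      }
      computationQueued = true;
    }
    // Hand off outside the lock so a slow executor cannot block other threads.
    splitExecutor.execute(this::recomputeSplitRow);
    return false; // answer not known yet
  }

  private void recomputeSplitRow() {
    List<String> snapshot;
    long version;
    synchronized (this) {
      snapshot = new ArrayList<>(files);
      version = fileSetVersion;
    }
    // Expensive work happens without holding the lock.
    Optional<String> result = splitRowFinder.apply(snapshot);
    synchronized (this) {
      computationQueued = false;
      if (version == fileSetVersion) { // discard the result if files changed meanwhile
        cachedSplitRow = result;
        computedVersion = version;
      }
    }
  }
}
```

   The trade-off in this sketch is that needsSplit answers false while the result is stale, so a split may be noticed one check later than it would be otherwise. Whether that delay is acceptable is a judgment call for the real code.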



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org