Posted to dev@accumulo.apache.org by Mike Miller <mm...@apache.org> on 2022/05/19 18:55:16 UTC

1.10 and Merging Minor Compactions

Hello Accumulo folks,

Sending this to the dev list to see if anyone else has any thoughts. I
created this PR as a fix for a bad situation with merging minor compactions
in version 1.10: https://github.com/apache/accumulo/pull/2708

Here is the situation in which a Tablet couldn't flush. The tserver hosting
the hot spot Tablet was hitting its max WAL limit (TABLE_MINC_LOGS_MAX), so
it was forcing a flush on the Tablet. The client (TabletServerBatchWriter)
would try to flush its data by calling applyUpdates() on the current commit
session, but the user was seeing a HoldTimeoutException on the client side.
The flush would time out on the tserver, presumably due to hitting the max
number of write threads and/or connection pools filling up. The WALs kept
growing because the Tablet was not flushing. Major compactions would
complete, but the hot spot Tablet would get stuck trying to flush.
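
To make the loop described above a bit more concrete, here is a toy model
in plain Java. It is not Accumulo code: the class, the method, and the
"one new WAL per step" behavior are all made up for illustration, and the
exact threshold comparison is an assumption. It only shows why the number
of referenced WALs stays bounded when the forced flush completes and grows
without bound when it doesn't.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only (hypothetical names); this is not Accumulo code. */
public class WalPressureModel {

    /**
     * Simulates a hot spot tablet that references one more WAL per step.
     * Returns the number of WALs the tablet references after each step.
     *
     * @param maxLogs          stand-in for the max WAL limit (assumed ">=" check)
     * @param steps            number of WAL rollovers to simulate
     * @param flushCanComplete whether a forced flush is able to finish
     */
    public static List<Integer> simulate(int maxLogs, int steps, boolean flushCanComplete) {
        List<Integer> history = new ArrayList<>();
        int referencedWals = 0;
        for (int i = 0; i < steps; i++) {
            referencedWals++; // sustained writes roll over to a new WAL
            boolean flushForced = referencedWals >= maxLogs;
            if (flushForced && flushCanComplete) {
                referencedWals = 0; // flushed data no longer needs the old WALs
            }
            history.add(referencedWals);
        }
        return history;
    }

    public static void main(String[] args) {
        System.out.println("flush completes: " + simulate(3, 6, true));
        System.out.println("flush stuck:     " + simulate(3, 6, false));
    }
}
```

With a limit of 3, the first run prints [1, 2, 0, 1, 2, 0] and the second
prints [1, 2, 3, 4, 5, 6], which is the "WALs keep growing" case.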

The quick fix is to restart the tablet server hosting the hot spot Tablet,
but this won't prevent the situation from happening again.

FYI these troublesome flushes (M files) have been removed in version 2.1.