You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/05/16 13:47:54 UTC

[GitHub] [accumulo] milleruntime opened a new pull request, #2708: Disable Merging MINC by default

milleruntime opened a new pull request, #2708:
URL: https://github.com/apache/accumulo/pull/2708

   * Merging minor compactions can lead to bad situations where a Tablet
   can't flush. They have been removed in version 2.1.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1127852508

   It seems like a work around for this scenario would be to just restart the tserver. If a tserver can recover the Tablet, it should force a flush. According to code in the [tserver](https://github.com/apache/accumulo/blob/ed97eac76da47f711f8f901810cc9d4d8fb27214/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L2429), after a Tablet is loaded and if it still has entries in memory, this means that it recovered from the WAL. Then it will be possible to bypass the mergeFile. From the `prepareForMinC` method in Tablet:
   
   https://github.com/apache/accumulo/blob/9c957ab55f28f10dde1321eb613a9ec3e2fa5322/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java#L977-L980


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on a diff in pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on code in PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#discussion_r873778785


##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -683,10 +683,10 @@ public enum Property {
       "After a tablet has been idle (no mutations) for this time period it may have its "
           + "in-memory map flushed to disk in a minor compaction. There is no guarantee an idle "
           + "tablet will be compacted."),
-  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "0",
+  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "1",
       PropertyType.MEMORY,
-      "The max file size used for a merging minor compaction. The default value"
-          + " of 0 disables a max file size."),
+      "The max file size used for a merging minor compaction. The value of 0 is no max "
+          + "file size. The default value is 1 byte, which should disable merging minor compactions."),

Review Comment:
   The only thing I could find was @keith-turner 's explanation in the release notes for 2.1:
   "Merging minor compactions could cause O(N^2) compaction work. "
   https://accumulo.apache.org/release/accumulo-2.1.0/



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] EdColeman commented on a diff in pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
EdColeman commented on code in PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#discussion_r873771307


##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -683,10 +683,10 @@ public enum Property {
       "After a tablet has been idle (no mutations) for this time period it may have its "
           + "in-memory map flushed to disk in a minor compaction. There is no guarantee an idle "
           + "tablet will be compacted."),
-  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "0",
+  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "1",
       PropertyType.MEMORY,
-      "The max file size used for a merging minor compaction. The default value"
-          + " of 0 disables a max file size."),
+      "The max file size used for a merging minor compaction. The value of 0 is no max "
+          + "file size. The default value is 1 byte, which should disable merging minor compactions."),

Review Comment:
   Could / should this also say why merging compactions are disabled? This whole thing is confusing to me - not sure if it can be addressed here, or in the documentation (and maybe it is?) - but it seems that we have gone from always wanting to perform a merging compaction to never, but not explaining why.  And when disabled, the max files setting has increased importance because that's what is going to drive compactions (at least, I think that how it expected to work)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1128107656

   I was able to setup a 1.10 snapshot cluster with this change using Uno on my dev box. I ran CI for a bit and then verified the ingest. I saw lots of MinC and MajC compactions. I didn't see any "M" files. The property was set to 1:
   <pre>
   root@uno> config -t ci -f merge
   -----------+-----------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------
   SCOPE      | NAME                                          | VALUE
   -----------+-----------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------
   default    | table.compaction.minor.merge.file.size.max .. | 1
   </pre>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1127791599

   > A value of 0 specifies no max size, and a merging compaction will always be run on a flush.
   
   I don't think this is true. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1127774822

   This is untested. I currently don't have a 1.10 cluster setup. 
   
   Another option would be to just update the javadoc saying that setting it to `1 byte` will disable the Merging MINC. Assuming it actually works...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1127758501

   Here is the situation where a Tablet couldn't flush... The tserver hosting the hot spot Tablet was hitting its max WAL limit (TABLE_MINC_LOGS_MAX) so it was forcing a flush on the Tablet.  The client `TabletServerBatchWriter` would try to flush its data by calling `applyUpdates()` to the current commit session but the user is seeing the `HoldTimeoutException` on the client side. The flush will timeout on the tserver, presumably due to hitting max number of write threads and/or connection pools filling up. The WALs keep growing due to the Tablet not flushing. Major compactions will complete but the hot spot Tablet will get stuck trying to flush.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1128087033

   > I don't know if the statement about `0` being no max size is true or not, but I'm in favor of disabling these by default, with whatever value does that.
   
   Looking at the code in `DatafileManager`, the max is Long.MAX_VALUE:
   https://github.com/apache/accumulo/blob/0a9837f3f8395d89c5cd7bab7805c4aae28919be/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/DatafileManager.java#L311-L316


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] EdColeman commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
EdColeman commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1127785223

   So:
   
   ```
   The max file size used for a merging minor compaction.  The default value is 1 byte which effectively disables merging minor compactions. A value of 0 specifies no max size, and a merging compaction will always be run on a flush.  Minor merging compactions can result in write thread contention and flush failures, instead MAX_NUM_FILES [-?] and minor compactions will control the number of files
   ```
   Not sure if the NUM_FILES helps here or not...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on a diff in pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on code in PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#discussion_r873774975


##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -683,10 +683,10 @@ public enum Property {
       "After a tablet has been idle (no mutations) for this time period it may have its "
           + "in-memory map flushed to disk in a minor compaction. There is no guarantee an idle "
           + "tablet will be compacted."),
-  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "0",
+  TABLE_MINC_MAX_MERGE_FILE_SIZE("table.compaction.minor.merge.file.size.max", "1",
       PropertyType.MEMORY,
-      "The max file size used for a merging minor compaction. The default value"
-          + " of 0 disables a max file size."),
+      "The max file size used for a merging minor compaction. The value of 0 is no max "
+          + "file size. The default value is 1 byte, which should disable merging minor compactions."),

Review Comment:
   It could, but its not a simple explanation. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime merged pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime merged PR #2708:
URL: https://github.com/apache/accumulo/pull/2708


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1136236076

   @milleruntime I think you could merge this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] milleruntime commented on pull request #2708: Disable Merging MINC by default

Posted by GitBox <gi...@apache.org>.
milleruntime commented on PR #2708:
URL: https://github.com/apache/accumulo/pull/2708#issuecomment-1136240364

   > @milleruntime I think you could merge this.
   
   OK I will put some more in the description of the commit, trying to explain the situation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org