Posted to notifications@accumulo.apache.org by "Christopher Tubbs (Jira)" <ji...@apache.org> on 2020/10/28 22:35:00 UTC
[jira] [Resolved] (ACCUMULO-3067) scan performance degrades after compaction
[ https://issues.apache.org/jira/browse/ACCUMULO-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christopher Tubbs resolved ACCUMULO-3067.
-----------------------------------------
Resolution: Abandoned
Closing this stale issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo
> scan performance degrades after compaction
> ------------------------------------------
>
> Key: ACCUMULO-3067
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3067
> Project: Accumulo
> Issue Type: Bug
> Components: tserver
> Environment: Macbook Pro 2.6 GHz Intel Core i7, 16GB RAM, SSD, OSX 10.9.4, single tablet server process, single client process
> Reporter: Adam Fuchs
> Priority: Major
> Attachments: Screen Shot 2014-08-19 at 4.19.37 PM.png, accumulo_query_perf_test.tar.gz, jit_log_during_compaction.txt
>
>
> I've been running scan performance tests on 1.6.0 and hit an interesting situation: query performance starts at a certain level and then degrades by ~15% after an event. The test roughly follows this scenario:
> # Single tabletserver instance
> # Load 100M small (~10-byte) key/values into a tablet and let it finish major compacting
> # Disable the garbage collector (this makes the time to _the event_ longer)
> # Restart the tabletserver
> # Repeatedly scan from the beginning to the end of the table in a loop
> # Something happens on the tablet server, like one of {idle compaction of metadata table, forced flush of metadata table, forced compaction of metadata table, forced flush of trace table}
> # Observe that scan rates dropped by 15-20%
> # Observe that restarting the scan does not bring performance back to the original level. Performance only recovers after restarting the tablet server.
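The measurement loop in step 5 can be sketched with a plain java.util.Iterator. This is a self-contained illustration, not the Accumulo client API; in the real test the iterator entries would come from a Scanner over the table, and the entry count here is far smaller than the 100M in the test:

```java
import java.util.Iterator;
import java.util.stream.LongStream;

public class ScanLoopSketch {
    // Drain an iterator to its end, returning the number of entries seen.
    // Stand-in for scanning the table from beginning to end (step 5).
    public static long drain(Iterator<Long> it) {
        long count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        final long entries = 1_000_000L; // illustrative size only
        for (int pass = 0; pass < 3; pass++) {
            long t0 = System.nanoTime();
            long n = drain(LongStream.range(0, entries).boxed().iterator());
            double secs = (System.nanoTime() - t0) / 1e9;
            System.out.printf("pass %d: %d entries, %.0f entries/s%n", pass, n, n / secs);
        }
    }
}
```

Comparing per-pass rates before and after the step-6 event is what surfaces the 15-20% drop described below.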
> I can prevent this by removing iterators from the iterator tree. It doesn't seem to matter which iterators, but removing a certain number of them both improves performance significantly and eliminates the degradation. The default iterator tree includes:
> * SourceSwitchingIterator
> ** VersioningIterator
> *** SynchronizedIterator
> **** VisibilityFilter
> ***** ColumnQualifierFilter
> ****** ColumnFamilySkippingIterator
> ******* DeletingIterator
> ******** StatsIterator
> ********* MultiIterator
> ********** MemoryIterator
> ********** ProblemReportingIterator
> *********** HeapIterator
> ************ RFile.LocalityGroupReader
> We can eliminate the weird condition by narrowing the set of iterators to:
> * SourceSwitchingIterator
> ** VisibilityFilter
> *** ColumnFamilySkippingIterator
> **** DeletingIterator
> ***** StatsIterator
> ****** MultiIterator
> ******* MemoryIterator
> ******* ProblemReportingIterator
> ******** HeapIterator
> ********* RFile.LocalityGroupReader
> There are other combinations that also perform much better than the default. I haven't been able to isolate this problem to a single iterator, despite removing each iterator one at a time.
> Anybody know what might be happening here? Best theory so far: after a compaction the JVM sees the iterators being used in a different way, and some JVM optimization (JIT compilation, branch prediction, or automatic inlining) stops being applied.
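The inlining angle of that theory can be illustrated with a self-contained sketch (plain Java, not Accumulo's iterator API). The class name, the depth of 12, and the entry count are illustrative; the relevant fact is that HotSpot bounds inlining depth with flags such as -XX:MaxInlineLevel, so a deep enough chain of delegating calls can leave part of the hot next()/hasNext() path uninlined:

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class DeepIteratorSketch {
    // Each wrapper adds one virtual call on the hot path, mirroring the
    // deep delegating-iterator stack in the default tree listed above.
    static class Delegating implements Iterator<Integer> {
        private final Iterator<Integer> source;
        Delegating(Iterator<Integer> source) { this.source = source; }
        public boolean hasNext() { return source.hasNext(); }
        public Integer next() { return source.next(); }
    }

    // Wrap the source `depth` layers deep and sum everything it yields.
    public static long sumThroughChain(int depth, int n) {
        Iterator<Integer> it = IntStream.range(0, n).boxed().iterator();
        for (int i = 0; i < depth; i++) {
            it = new Delegating(it);
        }
        long sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        // 12 layers mirrors the roughly 12-level default stack above.
        System.out.println(sumThroughChain(12, 1000)); // prints 499500
    }
}
```

Under this theory, removing wrappers (as in the narrowed tree above) keeps the chain within the JIT's inlining budget, which would explain why it matters how many iterators are removed rather than which ones.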
--
This message was sent by Atlassian Jira
(v8.3.4#803005)