You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/05/12 18:51:08 UTC

[GitHub] [accumulo] keith-turner opened a new pull request #2101: Avoid recreating ServiceEnvironment

keith-turner opened a new pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101


   While reviewing code in ConfigurationImpl to assess #947, I noticed that ConigurationImpl was doing some caching however the way that ConigurationImpl is made availabe for use could continually create new instances invalidating the caching.  This change makes it so that when ServiceEnvironmentImpl is exposed to plugins it does not continually create new instances, which would cause new instances of ConfigurationImpl to be creatd.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631422331



##########
File path: server/base/src/main/java/org/apache/accumulo/server/conf/TableConfiguration.java
##########
@@ -213,6 +216,11 @@ private static ScanDispatcher createScanDispatcher(AccumuloConfiguration conf,
         conf.getAllPropertiesWithPrefixStripped(Property.TABLE_SCAN_DISPATCHER_OPTS);
 
     newDispatcher.init(new ScanDispatcher.InitParameters() {
+
+      // scan dispatcher are in the critical path for scans, so only create ServiceEnv if needed.
+      private final Supplier<ServiceEnvironment> senvSupplier =
+          Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       The default scan dispatcher does not use the service env during init().  However this should only be called when config changes, so thinking that memoize is not needed here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631362740



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableUtils.java
##########
@@ -303,9 +312,11 @@ public TableId getTableId() {
 
     Selection selection = selector.select(new CompactionSelector.SelectionParameters() {
 
+      private final ServiceEnvironment senv = new ServiceEnvironmentImpl(tablet.getContext());
+

Review comment:
       This is duplicated in the `selectFiles` method for InitParamaters and for SelectionParameters. It could be created once in the selectFiles method as a final local variable.

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -178,14 +180,17 @@ private ExecutorService createPriorityExecutor(ScanExecutorConfig sec,
         Comparator<ScanInfo> comparator =
             factory.createComparator(new ScanPrioritizer.CreateParameters() {
 
+              private final Supplier<ServiceEnvironment> senvSupplier =
+                  Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       It's not obvious here why this instance needs to be lazily loaded when other instances are not.

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableUtils.java
##########
@@ -226,14 +227,17 @@ static AccumuloConfiguration createCompactionConfiguration(Tablet tablet,
         cfg.getClassName(), CompactionConfigurer.class);
 
     configurer.init(new CompactionConfigurer.InitParamaters() {
+
+      private final ServiceEnvironment senv = new ServiceEnvironmentImpl(tablet.getContext());
+

Review comment:
       This is created twice in the createCompactionConfiguration method, once for the anonymous InitParamaters (which I just noticed is misspelled; should be InitParameters) and once for the anonymous InputParameters below. This could be created once in the method as a final local variable, and referenced in each anonymous inner class instance's getEnvironment method.

##########
File path: server/base/src/main/java/org/apache/accumulo/server/conf/TableConfiguration.java
##########
@@ -213,6 +216,11 @@ private static ScanDispatcher createScanDispatcher(AccumuloConfiguration conf,
         conf.getAllPropertiesWithPrefixStripped(Property.TABLE_SCAN_DISPATCHER_OPTS);
 
     newDispatcher.init(new ScanDispatcher.InitParameters() {
+
+      // scan dispatcher are in the critical path for scans, so only create ServiceEnv if needed.
+      private final Supplier<ServiceEnvironment> senvSupplier =
+          Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       Under which circumstances would this one not be used, and the lazy loading would help?

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -783,6 +788,11 @@ public void executeReadAhead(KeyExtent tablet, ScanDispatcher dispatcher, ScanSe
       scanExecutors.get("meta").execute(task);
     } else {
       DispatchParameters params = new DispatchParamsImpl() {
+
+        // in scan critical path so only create ServiceEnv if needed
+        private final Supplier<ServiceEnvironment> senvSupplier =
+            Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       For my own understanding, under which circumstances would this not actually be used, and therefore not need to be created?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631419542



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -178,14 +180,17 @@ private ExecutorService createPriorityExecutor(ScanExecutorConfig sec,
         Comparator<ScanInfo> comparator =
             factory.createComparator(new ScanPrioritizer.CreateParameters() {
 
+              private final Supplier<ServiceEnvironment> senvSupplier =
+                  Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       I reviewed that further and found it does not need to be lazily loaded.  Its only created once when a scan thread pool is created, so there is no per scan performance concern.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631953611



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -783,6 +788,11 @@ public void executeReadAhead(KeyExtent tablet, ScanDispatcher dispatcher, ScanSe
       scanExecutors.get("meta").execute(task);
     } else {
       DispatchParameters params = new DispatchParamsImpl() {
+
+        // in scan critical path so only create ServiceEnv if needed
+        private final Supplier<ServiceEnvironment> senvSupplier =
+            Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       > Is that right?
   
   That is correct.
   
   > Just a quirk with the current design of the service environment containing snapshots and then caching them.
   
   If it does not exists, probably need better documentation of the lifecycle of an instantiated dispatcher.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631421683



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -783,6 +788,11 @@ public void executeReadAhead(KeyExtent tablet, ScanDispatcher dispatcher, ScanSe
       scanExecutors.get("meta").execute(task);
     } else {
       DispatchParameters params = new DispatchParamsImpl() {
+
+        // in scan critical path so only create ServiceEnv if needed
+        private final Supplier<ServiceEnvironment> senvSupplier =
+            Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       The `dispatch()` method on the default scan dispatcher is designed to do as little work as possible since its called for each scan batch. Its possible we could remove the memoize for init, because the code only creates new ScanDispatcher objects when table config changes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631468449



##########
File path: server/base/src/main/java/org/apache/accumulo/server/ServiceEnvironmentImpl.java
##########
@@ -32,10 +34,12 @@
 
   private final ServerContext srvCtx;
   private final Configuration conf;
+  private final Map<TableId,Configuration> tableConfigs;
 
   public ServiceEnvironmentImpl(ServerContext ctx) {
     this.srvCtx = ctx;
     this.conf = new ConfigurationImpl(srvCtx.getConfiguration());
+    this.tableConfigs = new ConcurrentHashMap<>();

Review comment:
       This change only saves one line :smiley_cat:, but why not:
   
   ```suggestion
     private final Map<TableId,Configuration> tableConfigs = new ConcurrentHashMap<>();
   
     public ServiceEnvironmentImpl(ServerContext ctx) {
       this.srvCtx = ctx;
       this.conf = new ConfigurationImpl(srvCtx.getConfiguration());
   ```

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -783,6 +788,11 @@ public void executeReadAhead(KeyExtent tablet, ScanDispatcher dispatcher, ScanSe
       scanExecutors.get("meta").execute(task);
     } else {
       DispatchParameters params = new DispatchParamsImpl() {
+
+        // in scan critical path so only create ServiceEnv if needed
+        private final Supplier<ServiceEnvironment> senvSupplier =
+            Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       Okay, so I think I understand: it depends on the dispatcher implementation, and some may not need to reference the environment in order to dispatch. So, we just avoid instantiating it unless they actually use it, just to avoid a performance hit, because the dispatch method gets called a lot.  Is that right?
   
   Aside:
   I noticed that the service environment passed in init could contain different information than the one passed in dispatch. That could be confusing for plugin implementers who might expect them to agree. Just a quirk with the current design of the service environment containing snapshots and then caching them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631424367



##########
File path: server/base/src/main/java/org/apache/accumulo/server/conf/TableConfiguration.java
##########
@@ -213,6 +216,11 @@ private static ScanDispatcher createScanDispatcher(AccumuloConfiguration conf,
         conf.getAllPropertiesWithPrefixStripped(Property.TABLE_SCAN_DISPATCHER_OPTS);
 
     newDispatcher.init(new ScanDispatcher.InitParameters() {
+
+      // scan dispatcher are in the critical path for scans, so only create ServiceEnv if needed.
+      private final Supplier<ServiceEnvironment> senvSupplier =
+          Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       I removed the call to memoize, don't think it adds any benefit here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner merged pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner merged pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631403252



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableUtils.java
##########
@@ -226,14 +227,17 @@ static AccumuloConfiguration createCompactionConfiguration(Tablet tablet,
         cfg.getClassName(), CompactionConfigurer.class);
 
     configurer.init(new CompactionConfigurer.InitParamaters() {
+
+      private final ServiceEnvironment senv = new ServiceEnvironmentImpl(tablet.getContext());
+

Review comment:
       > which I just noticed is misspelled; should be InitParameters
   
   nice catch, that is an unreleased new API

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -783,6 +788,11 @@ public void executeReadAhead(KeyExtent tablet, ScanDispatcher dispatcher, ScanSe
       scanExecutors.get("meta").execute(task);
     } else {
       DispatchParameters params = new DispatchParamsImpl() {
+
+        // in scan critical path so only create ServiceEnv if needed
+        private final Supplier<ServiceEnvironment> senvSupplier =
+            Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       The [default scan dispatcher](https://github.com/apache/accumulo/blob/5b1c0b54e33ee108a45b2126bd194472703822f2/core/src/main/java/org/apache/accumulo/core/spi/scan/SimpleScanDispatcher.java#L141) does not use the service env for `init()` or `dispatch()`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2101: Avoid recreating ServiceEnvironment

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2101:
URL: https://github.com/apache/accumulo/pull/2101#discussion_r631419669



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
##########
@@ -178,14 +180,17 @@ private ExecutorService createPriorityExecutor(ScanExecutorConfig sec,
         Comparator<ScanInfo> comparator =
             factory.createComparator(new ScanPrioritizer.CreateParameters() {
 
+              private final Supplier<ServiceEnvironment> senvSupplier =
+                  Suppliers.memoize(() -> new ServiceEnvironmentImpl(context));

Review comment:
       Oh removed the memoize() call.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org