Posted to notifications@accumulo.apache.org by "dlmarion (via GitHub)" <gi...@apache.org> on 2023/03/27 21:31:15 UTC

[GitHub] [accumulo] dlmarion opened a new pull request, #3262: Started working on multiple managers

dlmarion opened a new pull request, #3262:
URL: https://github.com/apache/accumulo/pull/3262

   I have modified the Manager code in the following ways. I wanted to get some feedback on this path before I go further, as it will start to touch other modules.
   
     1. Modified each Manager to put its address in ZooKeeper (at `ZMANAGERS/<>`), just like the other services do
     2. Modified the Manager such that the instance that grabs the lock at `ZMANAGER_LOCK` is the "primary" manager (see MultipleManagerLockIT, which passes, and the sketch below)
     3. The "primary" manager will be responsible for FATE transactions and other things that still need to be managed by a single process.
     4. Modified the TabletGroupWatcher, in the user table case, to try to distribute management of each table's tablets evenly, but at table boundaries (see TabletGroupWatcher, MultipleManagerUtil).
     
     I started working on MultipleManagerIT, but ran into issues with LiveTServerSet and the TabletServer code where it expects the Manager making the call to have the lock at `ZMANAGER_LOCK`. The next step is to start dealing with that, effectively removing that constraint. I figured this would be a good place to pause and get feedback before I spend more time going in this direction.
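
   To make item 2 concrete, here is a minimal, hypothetical sketch of the election idea: each Manager advertises itself under `ZMANAGERS` and whichever instance creates the ephemeral lock node becomes the "primary". The class, method names, and path handling are illustrative only, not the actual code in this PR.

   ```
   import java.nio.charset.StandardCharsets;

   import org.apache.zookeeper.CreateMode;
   import org.apache.zookeeper.KeeperException;
   import org.apache.zookeeper.ZooDefs;
   import org.apache.zookeeper.ZooKeeper;

   public class PrimaryManagerElection {
     private final ZooKeeper zk;
     private final String managersPath; // e.g. the ZMANAGERS path
     private final String lockPath;     // e.g. the ZMANAGER_LOCK path

     PrimaryManagerElection(ZooKeeper zk, String managersPath, String lockPath) {
       this.zk = zk;
       this.managersPath = managersPath;
       this.lockPath = lockPath;
     }

     /** Advertise this manager's address so peers and clients can discover it. */
     void advertise(String hostPort) throws Exception {
       zk.create(managersPath + "/" + hostPort, new byte[0],
           ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
     }

     /** Returns true if this process won the primary lock. */
     boolean tryBecomePrimary(String hostPort) throws Exception {
       try {
         zk.create(lockPath, hostPort.getBytes(StandardCharsets.UTF_8),
             ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
         return true; // this instance is now the primary manager
       } catch (KeeperException.NodeExistsException e) {
         return false; // another manager already holds the lock
       }
     }
   }
   ```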
     




Re: [PR] Started working on multiple managers [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on code in PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#discussion_r1412550666


##########
server/manager/src/main/java/org/apache/accumulo/manager/MultipleManagerUtil.java:
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.manager;
+
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.SortedSet;
+import java.util.TreeSet;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import org.apache.accumulo.core.data.TableId;
+import org.apache.accumulo.core.util.Pair;
+import org.apache.accumulo.server.ServerContext;
+
+public class MultipleManagerUtil {
+
+  /**
+   * Each Manager will be responsible for a range(s) of metadata tablets, but we don't want to split
+   * up a table's metadata tablets between managers as it will throw off the tablet balancing. If
+   * there are three managers, then we want to split up the metadata tablets roughly into thirds and
+   * have each manager responsible for one third, for example.
+   *
+   * @param context server context
+   * @param tables set of table ids
+   * @param numManagers number of managers
+   * @return list of num manager size, each element containing a set of tables for the manager to
+   *         manage
+   */
+  public static List<Set<TableId>> getTablesForManagers(ServerContext context, Set<TableId> tables,
+      int numManagers) {
+
+    if (numManagers == 0) {
+      throw new IllegalStateException("No managers, one or more expected");
+    }
+
+    if (numManagers == 1) {
+      return List.of(tables);
+    }
+
+    SortedSet<Pair<TableId,Long>> tableTabletCounts = new TreeSet<>(new Comparator<>() {
+      @Override
+      public int compare(Pair<TableId,Long> table1, Pair<TableId,Long> table2) {
+        // sort descending by number of tablets
+        int result = table1.getSecond().compareTo(table2.getSecond());
+        if (result == 0) {
+          return table1.getFirst().compareTo(table2.getFirst());
+        }
+        return -1 * result;
+      }
+    });
+    tables.forEach(tid -> {

Review Comment:
   Fluo partitions work among workers using an approach where [one process determines the partitions](https://github.com/apache/fluo/blob/main/modules/core/src/main/java/org/apache/fluo/core/worker/finder/hash/PartitionManager.java) and puts them in ZK.  Then all of the other workers use those partitions from ZK to know what to work on.  This approach allows all of the workers to eventually settle on the same partitions, which is what is needed here.  Posting the Fluo code to show that it's not a lot of code and encapsulates nicely.  We could have a TabletManagementPartitioner that is created and tested as a standalone task in its own PR that does this.
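
   For illustration, a minimal sketch of that pattern in ZooKeeper terms, assuming the primary publishes an already-encoded partitioning to a well-known node (the path, class name, and serialization are hypothetical placeholders, not the Fluo or Accumulo API):

   ```
   import java.nio.charset.StandardCharsets;

   import org.apache.zookeeper.CreateMode;
   import org.apache.zookeeper.ZooDefs;
   import org.apache.zookeeper.ZooKeeper;

   public class PartitionPublisher {
     static final String PARTITIONS_PATH = "/accumulo/managers/partitions";

     /** Called only by the process that holds the primary lock. */
     static void publish(ZooKeeper zk, String encodedPartitions) throws Exception {
       byte[] data = encodedPartitions.getBytes(StandardCharsets.UTF_8);
       if (zk.exists(PARTITIONS_PATH, false) == null) {
         zk.create(PARTITIONS_PATH, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
             CreateMode.PERSISTENT);
       } else {
         zk.setData(PARTITIONS_PATH, data, -1); // -1 skips the version check
       }
     }

     /** Called by every manager to learn the current partitioning. */
     static String read(ZooKeeper zk) throws Exception {
       return new String(zk.getData(PARTITIONS_PATH, false, null),
           StandardCharsets.UTF_8);
     }
   }
   ```

   Because every non-primary manager only reads, they all converge on whatever the primary last published.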





Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on code in PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#discussion_r1402635111


##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -317,6 +317,8 @@ public enum Property {
       "Properties in this category affect the behavior of the manager server.", "2.1.0"),
   MANAGER_CLIENTPORT("manager.port.client", "9999", PropertyType.PORT,
       "The port used for handling client connections on the manager.", "1.3.5"),
+  MANAGER_PORTSEARCH("manager.port.search", "false", PropertyType.BOOLEAN,
+      "If the manager.port.client is in use, search higher ports until one is available.", "3.1.0"),

Review Comment:
   Need to change the version to 4.0.0



##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -371,6 +373,19 @@ public enum Property {
       "Allow tablets for the " + MetadataTable.NAME
           + " table to be suspended via table.suspend.duration.",
       "1.8.0"),
+  MANAGER_STARTUP_MANAGER_AVAIL_MIN_COUNT("manager.startup.manager.avail.min.count", "0",
+      PropertyType.COUNT,
+      "Minimum number of managers that need to be registered before a manager will start. A value "
+          + "greater than 0 is useful when multiple managers are supposed to be running on startup. "
+          + "When set to 0 or less, no blocking occurs. Default is 0 (disabled).",
+      "3.1.0"),
+  MANAGER_STARTUP_MANAGER_AVAIL_MAX_WAIT("manager.startup.manager.avail.max.wait", "0",
+      PropertyType.TIMEDURATION,
+      "Maximum time manager will wait for manager available threshold "
+          + "to be reached before continuing. When set to 0 or less, will block "
+          + "indefinitely. Default is 0 to block indefinitely. Only valid when manager available "
+          + "threshold is set greater than 1.",
+      "3.1.0"),

Review Comment:
   Need to change the version to 4.0.0



##########
server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java:
##########
@@ -420,11 +420,11 @@ private HostAndPort startServer(AccumuloConfiguration conf, String address, TPro
 
   private HostAndPort getManagerAddress() {
     try {
-      List<String> locations = getContext().getManagerLocations();
-      if (locations.isEmpty()) {
+      String location = getContext().getPrimaryManagerLocation();

Review Comment:
   Should this be any Manager?



##########
server/base/src/main/java/org/apache/accumulo/server/manager/state/MetaDataStateStore.java:
##########
@@ -56,6 +60,12 @@ public DataLevel getLevel() {
     return level;
   }
 
+  @Override
+  @Deprecated
+  public void overrideRanges(List<Range> ranges) {
+    this.ranges = ranges;
+  }
+

Review Comment:
   This change can be removed.



##########
core/src/main/java/org/apache/accumulo/core/conf/Property.java:
##########
@@ -371,6 +373,19 @@ public enum Property {
       "Allow tablets for the " + MetadataTable.NAME
           + " table to be suspended via table.suspend.duration.",
       "1.8.0"),
+  MANAGER_STARTUP_MANAGER_AVAIL_MIN_COUNT("manager.startup.manager.avail.min.count", "0",
+      PropertyType.COUNT,
+      "Minimum number of managers that need to be registered before a manager will start. A value "
+          + "greater than 0 is useful when multiple managers are supposed to be running on startup. "
+          + "When set to 0 or less, no blocking occurs. Default is 0 (disabled).",
+      "3.1.0"),

Review Comment:
   Need to change the version to 4.0.0



##########
core/src/main/java/org/apache/accumulo/core/fate/zookeeper/ZooUtil.java:
##########
@@ -72,7 +72,11 @@ public LockID(String root, String serializedLID) {
       if (lastSlash == 0) {
         path = root;
       } else {
-        path = root + "/" + sa[0].substring(0, lastSlash);
+        path = root;
+        if (!sa[0].startsWith("/")) {
+          path += "/";
+        }

Review Comment:
   I had to add this because for some reason `sa[0]` starts with a `/` for the MANAGERS locks.



##########
server/base/src/main/java/org/apache/accumulo/server/manager/state/LoggingTabletStateStore.java:
##########
@@ -60,6 +60,12 @@ public ClosableIterator<TabletManagement> iterator(List<Range> ranges,
     return wrapped.iterator(ranges, parameters);
   }
 
+  @Override
+  @Deprecated
+  public void overrideRanges(List<Range> ranges) {
+    wrapped.overrideRanges(ranges);
+  }
+

Review Comment:
   This change can be removed.



##########
server/base/src/main/java/org/apache/accumulo/server/manager/state/TabletStateStore.java:
##########
@@ -59,7 +59,20 @@ ClosableIterator<TabletManagement> iterator(List<Range> ranges,
    * Scan the information about all tablets covered by this store..
    */
   default ClosableIterator<TabletManagement> iterator(TabletManagementParameters parameters) {
-    return iterator(List.of(MetadataSchema.TabletsSection.getRange()), parameters);
+    List<Range> ranges = List.of(MetadataSchema.TabletsSection.getRange());
+    if (parameters.getRangeOverrides() != null) {
+      ranges = parameters.getRangeOverrides();
+    }
+    return iterator(ranges, parameters);
+  }
+
+  /**
+   * Override the range of tablets that the TabletStateStore should retrieve. By default it
+   * retrieves all tablets.
+   */
+  @Deprecated
+  default void overrideRanges(List<Range> ranges) {

Review Comment:
   This method can be removed





Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1823427398

   Need to wire up the multiple managers in `accumulo-cluster`. I may have lost that in the rebase.




Re: [PR] Started working on multiple managers [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1828943330

   Was anything done re upgrade in this PR?  If not, I think we need to open a follow-on issue for that.
   
   This would be a follow-on issue: thinking we can distribute the compaction coordinator by having it hash partition queue names among manager processes.  TGW could make an RPC to add a job to a remote queue.  Compaction coordinators could hash the name to find the manager process to ask for work.  
   
   In this PR it seems like the TGW is adding compaction jobs to a local queue in the process.  What do compactors do to find jobs?
   
   We may need to make the EventCoordinator use the same partitioning as the TGW and send events to other manager processes via a new async RPC.  Need to analyze the EventCoordinator; it may make sense to pull it into the TGW conceptually.  Every manager uses its local TGW instance to signal events, and internally the TGW code knows how to route that in the cluster to other TGW instances.
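
   A hypothetical illustration of the queue-name hashing described above; it assumes all parties see the same ordered list of manager addresses (e.g. sorted ZK children), otherwise they would route to different managers:

   ```
   import java.util.List;

   public class QueueRouter {
     /** Pick the manager process responsible for a compaction queue. */
     static String managerForQueue(String queueName, List<String> sortedManagers) {
       // floorMod keeps the index non-negative for negative hash codes
       int idx = Math.floorMod(queueName.hashCode(), sortedManagers.size());
       return sortedManagers.get(idx);
     }
   }
   ```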




Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1828764079

   > Need to wire up the multiple managers in `accumulo-cluster`. I may have lost that in the rebase.
   
   Looked into this; `accumulo-cluster` already supports multiple managers via the existing active/backup pair capability.




Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1830612379

   > How will user assign tables to manager resource groups?
   
   Regarding this, the current mechanism is described in the **Resource Groups** section of https://cwiki.apache.org/confluence/display/ACCUMULO/A+More+Cost+Efficient+Accumulo. I was thinking that it would make sense to promote the `table.custom.assignment.group` property to a fixed property. Users would set this property on a table and would need to define a resource group with the same name so that the Manager could manage its tablets. Clients that connect to the Manager to perform table operations or initiate user Fate transactions would need to use the table property to determine which Manager to connect to (using the Resource Group name). The resource group name is already part of the ServiceLock object, so, for example, we would modify `ClientContext.getManagerLocations` to iterate over the Managers in ZooKeeper and return the one with the correct resource group. Another option, if resource groups are a first class citizen, is to put the resource group name in the ZK path, although that could lead to needing cleanup if a resource group was created and then removed. I haven't thought of everything, but this approach does solve some problems.
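
   A rough sketch of what that `ClientContext.getManagerLocations` change could look like, assuming the client can read a map of manager address to resource group name from the ServiceLock data in ZooKeeper (the map and method names here are illustrative, not the actual client API):

   ```
   import java.util.Map;
   import java.util.Optional;

   public class ManagerLookup {
     /** @param managers manager address -> resource group name, read from ZK */
     static Optional<String> managerForGroup(Map<String,String> managers,
         String resourceGroup) {
       return managers.entrySet().stream()
           .filter(e -> e.getValue().equals(resourceGroup))
           .map(Map.Entry::getKey)
           .findFirst();
     }
   }
   ```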
   
   > What manager would RPC operations that are not table related use?
   
   I think for instance-level operations and maybe even namespace level operations, any Manager could be used.
   
   > Thinking confluence documents would be a better place to explore this rather than here in issue.
   
   I can start a new Confluence page.




Re: [PR] Started working on multiple managers [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on code in PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#discussion_r1406979175


##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -613,8 +641,15 @@ public void run() {
       final long waitTimeBetweenScans = manager.getConfiguration()
           .getTimeInMillis(Property.MANAGER_TABLET_GROUP_WATCHER_INTERVAL);
 
+      // Override the set of table ranges that this manager will manage
+      Optional<List<Range>> ranges = Optional.empty();
+      if (store.getLevel() == Ample.DataLevel.USER) {
+        ranges = calculateRangesForMultipleManagers(

Review Comment:
   Not sure about this, was getting a bit lost.  I think this may return Optional.empty() with the expectation that the last computation will be used when empty is returned, but I am not sure that is actually being done.
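
   One illustrative pattern for the expectation described above: remember the last non-empty computation so an empty Optional reuses the previous ranges instead of silently dropping them (names here are hypothetical, not the PR's code):

   ```
   import java.util.List;
   import java.util.Optional;

   import org.apache.accumulo.core.data.Range;

   class RangeTracker {
     private volatile List<Range> lastRanges;

     List<Range> resolve(Optional<List<Range>> computed, List<Range> defaults) {
       if (computed.isPresent()) {
         lastRanges = computed.get(); // refresh on every successful computation
       } else if (lastRanges == null) {
         lastRanges = defaults; // nothing computed yet, fall back to the default
       }
       return lastRanges; // an empty Optional reuses the previous result
     }
   }
   ```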



##########
server/manager/src/main/java/org/apache/accumulo/manager/MultipleManagerUtil.java:
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.manager;
+
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.SortedSet;
+import java.util.TreeSet;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import org.apache.accumulo.core.data.TableId;
+import org.apache.accumulo.core.util.Pair;
+import org.apache.accumulo.server.ServerContext;
+
+public class MultipleManagerUtil {
+
+  /**
+   * Each Manager will be responsible for a range(s) of metadata tablets, but we don't want to split
+   * up a table's metadata tablets between managers as it will throw off the tablet balancing. If
+   * there are three managers, then we want to split up the metadata tablets roughly into thirds and
+   * have each manager responsible for one third, for example.
+   *
+   * @param context server context
+   * @param tables set of table ids
+   * @param numManagers number of managers
+   * @return list of num manager size, each element containing a set of tables for the manager to
+   *         manage
+   */
+  public static List<Set<TableId>> getTablesForManagers(ServerContext context, Set<TableId> tables,
+      int numManagers) {
+
+    if (numManagers == 0) {
+      throw new IllegalStateException("No managers, one or more expected");
+    }
+
+    if (numManagers == 1) {
+      return List.of(tables);
+    }
+
+    SortedSet<Pair<TableId,Long>> tableTabletCounts = new TreeSet<>(new Comparator<>() {
+      @Override
+      public int compare(Pair<TableId,Long> table1, Pair<TableId,Long> table2) {
+        // sort descending by number of tablets
+        int result = table1.getSecond().compareTo(table2.getSecond());
+        if (result == 0) {
+          return table1.getFirst().compareTo(table2.getFirst());
+        }
+        return -1 * result;
+      }
+    });
+    tables.forEach(tid -> {

Review Comment:
   When the set of managers and tables is steady for a bit, all manager processes need to arrive at the same decisions for partitioning tables into buckets.  With the algorithm in this method, different manager processes may see different counts for the same tables at different times and end up partitioning tables into different buckets.  This could lead to overlap in the partitions, or in the worst case a table that no manager processes.  We could start with a deterministic hash partitioning of tables and open a follow-on issue to improve.  One possible way to improve would be to have a single manager process run this algorithm and publish the partitioning information, with all other managers just using it.
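
   A minimal sketch of the deterministic hash partitioning suggested above; every manager computes identical buckets from the same inputs, so no coordination is needed while the manager count is stable (the class and method names are hypothetical):

   ```
   import java.util.ArrayList;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;

   import org.apache.accumulo.core.data.TableId;

   public class HashPartitioner {
     static List<Set<TableId>> partition(Set<TableId> tables, int numManagers) {
       List<Set<TableId>> buckets = new ArrayList<>(numManagers);
       for (int i = 0; i < numManagers; i++) {
         buckets.add(new HashSet<>());
       }
       for (TableId tid : tables) {
         // canonical() yields a stable string; floorMod avoids negative indexes
         int b = Math.floorMod(tid.canonical().hashCode(), numManagers);
         buckets.get(b).add(tid);
       }
       return buckets;
     }
   }
   ```

   The tradeoff versus the count-based algorithm in this method is that buckets may be uneven in tablet count, but the assignment is stable and identical everywhere.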





Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1832692665

   Feedback requested on https://cwiki.apache.org/confluence/display/ACCUMULO/Using+Resource+Groups+as+an+implementation+of+Multiple+Managers. Are there other options to consider, what other information should I add?




Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1828950532

   This PR isn't close to being mergeable. I wanted to get feedback on the approach before spending a larger amount of time making it ready.  W/r/t the approach, I'm talking about the locking, the concept of a primary manager, how work is distributed across the managers, how the Monitor is supposed to work, etc.




Re: [PR] Started working on multiple managers [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1829805334

   > When the set of managers and tables is steady for a bit, all manager processes need to arrive at the same decisions for partitioning tables into buckets. With the algorithm in this method, different manager processes may see different counts for the same tables at different times and end up partitioning tables into different buckets. This could lead to overlap in the partitions, or in the worst case a table that no manager processes. We could start with a deterministic hash partitioning of tables and open a follow-on issue to improve. One possible way to improve would be to have a single manager process run this algorithm and publish the partitioning information, with all other managers just using it.
   
   > This would be a follow-on issue: thinking we could distribute the compaction coordinator by having it hash partition queue names among manager processes. TGW could make an RPC to add a job to a remote queue. Compaction coordinators could hash the name to find the manager process to ask for work.
   
   > We may need to make the EventCoordinator use the same partitioning as the TGW and send events to other manager processes via a new async RPC. Need to analyze the EventCoordinator; it may make sense to pull it into the TGW conceptually. Every manager uses its local TGW instance to signal events, and internally the TGW code knows how to route that in the cluster to other TGW instances.
   
   I'm now concerned that this is going to be overly complex - lots of moving parts, with the potential for multiple managers to claim ownership of the same object, or using some external process (ZK) to coordinate which Manager is responsible for a specific object. The Multiple Manager implementation in this PR is based off [this](https://cwiki.apache.org/confluence/display/ACCUMULO/Elasticity+Design+Notes+-+March+2023) design, which has multiple managers try to manage everything. 
   
   I think there may be a simpler way, as we have already introduced a natural partitioning mechanism - resource groups. I went back and looked in the wiki and you (@keith-turner) had a very similar idea at the bottom of [this](https://cwiki.apache.org/confluence/display/ACCUMULO/Implementing+multiple+managers+via+independant+distributed+services) page. So, instead of having a single set of Managers try to manage everything, you would have a single Manager manage tablets, compactions, and Fate for all of the tables that map to a specific resource group. We could continue to have the active/backup Manager feature that we have today, but per resource group. This also solves the Monitor problem. If we look at this using the `cluster.yaml` file, it would go from what we have today:
   
   ```
   manager:
     - localhost
   
   monitor:
     - localhost
   
   gc:
     - localhost
   
   tserver:
     default:
       - localhost
     group1:
       - localhost
   
   compactor:
     accumulo_meta:
       - localhost
     user_small:
       - localhost
     user_large:
       - localhost
   
   sserver:
     default:
       - localhost
     group1:
       - localhost    
   ```
   
   to something like:
   
   ```
   default:
     manager:
       - localhost
     monitor:
       - localhost
     gc:
       - localhost
     tserver:
       - localhost
     compactor:
       accumulo_meta:
         - localhost
       user_small:
         - localhost
       user_large:
         - localhost
     sserver:
       default:
         - localhost
         
   group1:
     manager:
       - localhost
     monitor:
       - localhost
     gc:
       - localhost
     tserver:
       - localhost
     compactor:
       accumulo_meta:
         - localhost
       user_small:
         - localhost
       user_large:
         - localhost
     sserver:
       default:
         - localhost
   ```




Re: [PR] Started working on multiple managers [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1830540427

   > So, instead of having a single set of Managers try to manage everything, you would have a single Manager manage tablets, compactions, and Fate for all of the tables that map to a specific resource group.
   
   That's an interesting concept that I think would be good to explore further.  I have a lot of questions about the specifics, like: How will fate data be partitioned in storage?  How will users assign tables to manager resource groups?  What manager would RPC operations that are not table related use?  Thinking confluence documents would be a better place to explore this rather than here in the issue.  
   
   This proposed design precludes scale-out of metadata processing for a single table.  The experimentation I was doing in #3964 allows FATE to scale across many processes.  I was thinking that if a single table creates 10K fate operations all of a sudden, and FATE is running on many manager processes, they could all start working on them.  I would like to explore scaling out the manager's different functional components more; I can work on exploring that further and post what I find in confluence documents.  Would like to determine what all of the hurdles are to scaling out, and what the possible solutions are, before deciding not to pursue it.




Re: [PR] Started working on multiple managers [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1836734617

   > Feedback requested on https://cwiki.apache.org/confluence/display/ACCUMULO/Using+Resource+Groups+as+an+implementation+of+Multiple+Managers. Are there other options to consider, what other information should I add?
   
   I updated the above document with another possible solution.  Thinking that this PR and #3964 are already heading in the direction of that solution.  I still have a lot of uncertainty, and I was thinking about how to reduce that.  Thought of the following.
   
    1. We can start running scale tests now that abuse the current code.  By doing this we may learn new things that help us make more informed decisions.  I opened #4006 for this and created other items for scale testing as TODOs on the elasticity board.
    2. We can reorganize the manager code to make the functional services in the manager more explicit.  I opened #4005 for this.  I am going to take a shot at reorganizing just one thing in the manager, as described in that issue, to see what it looks like.
    3. Would be good to chat sometime, as mentioned in slack.
   
   Warning: this is not a fully formed thought.  #3964 took a bottom-up approach to scaling the manager and this PR is taking a top-down approach.  Was wondering about taking some of the things in this PR and creating something more focused on just distributing tablet management, like #3964 is just for distributing FATE.  However, tablet management is not as cleanly self-contained in the code as FATE is, so it's harder to do that.  That is one reason I opened #4005.  It would be nice to have an IT test that creates multiple tablet management objects, each with different partitions, and verifies that.  #3964 has a test like this for FATE.
   
   

