Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/10/02 13:48:51 UTC

[GitHub] [accumulo] jzgithub1 removed a comment on issue #1225: Use fewer ZooKeeper watchers

URL: https://github.com/apache/accumulo/issues/1225#issuecomment-531360046
 
 
    @ctubbsii, I investigated how I would implement: "_One possible solution is instead of having a watcher for every configuration item, we have only a single configuration version field in ZooKeeper for all configuration items, and a single watcher (per process) to track that version field and reload all configurations whenever it is changed. This would require us to ensure that we increment this field whenever a configuration is changed._"
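    
    For context, here is a minimal sketch of what that single-version-znode idea might look like against the raw ZooKeeper API. It is hypothetical, not existing Accumulo code: the znode path and the reloadAllConfiguration hook are made-up names.
    ```java
    import java.nio.charset.StandardCharsets;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;
    
    /** One watcher per process on a single version znode; reload everything when it changes. */
    public class ConfigVersionWatcher implements Watcher {
        private final ZooKeeper zk;
        private final String versionPath; // hypothetical, e.g. "/accumulo/<iid>/config-version"
    
        public ConfigVersionWatcher(ZooKeeper zk, String versionPath) {
            this.zk = zk;
            this.versionPath = versionPath;
        }
    
        /** Reads the current version and re-registers this watcher in a single getData call. */
        public void start() throws Exception {
            byte[] data = zk.getData(versionPath, this, null);
            reloadAllConfiguration(new String(data, StandardCharsets.UTF_8));
        }
    
        @Override
        public void process(WatchedEvent event) {
            // ZooKeeper watches are one-shot, so re-register on every trigger
            if (event.getType() == Event.EventType.NodeDataChanged) {
                try {
                    start(); // the version was bumped: re-watch and reload all config
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    
        private void reloadAllConfiguration(String version) {
            // hypothetical hook: invalidate and re-read every cached configuration item
            System.out.println("reloading all configuration at version " + version);
        }
    }
    ```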
   
    I saw a lot of complexity inside the ZooCache and ZooReader objects that makes that approach difficult to implement in a way we could be sure would always work as intended. I also looked at using other members of the ZooKeeper Stat object, such as version and cversion, to help us keep track of version state, but that could lead to logical errors down the road.
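    
    For reference, those per-node counters are easy to read, but they only track individual znodes, which is why leaning on them for a global configuration version felt error-prone. A minimal example of reading them:
    ```java
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;
    
    public class StatVersions {
        /** Prints the per-node version counters ZooKeeper keeps for one path. */
        static void printVersions(ZooKeeper zk, String path) throws Exception {
            Stat stat = zk.exists(path, false); // no watch registered
            if (stat != null) {
                System.out.println("version (data changes)   = " + stat.getVersion());
                System.out.println("cversion (child changes) = " + stat.getCversion());
            }
        }
    }
    ```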
   
    Ultimately, I looked at the Apache Curator project. I ran its TreeCacheExample, which runs continuously against a ZooKeeper instance. I started up an instance of Fluo Uno and then started 'cingest ingest' from Accumulo-Testing. In the TreeCacheExample, one listener function is added to the CuratorFramework client object and another is added to the TreeCache object. These listeners captured all of the actions on all ZooKeeper paths as they occurred during the ingest.
   
    The TreeCache object in Curator really seems to do what we want in terms of reducing watchers, providing reliable access to data, and not growing its maps so large that they cause problems.
   
    I would like to put a TreeCache object inside ZooCache and remove the maps and locks that may be causing issues. TreeCache internally handles retries and all of the concurrency and data storage concerns that ZooCache handles, and possibly handles them better. What do you think about this approach?
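    
    As a rough illustration of that idea (a sketch under assumptions, not a drop-in patch; the class and method names here are made up), ZooCache's read path could delegate to a TreeCache roughly like this:
    ```java
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.cache.ChildData;
    import org.apache.curator.framework.recipes.cache.TreeCache;
    
    public class TreeCacheBackedZooCache implements AutoCloseable {
        private final TreeCache cache;
    
        public TreeCacheBackedZooCache(CuratorFramework client, String root) throws Exception {
            // cache node data so reads are served from memory, as ZooCache does today;
            // TreeCache maintains its own watches and handles reconnects internally
            cache = TreeCache.newBuilder(client, root).setCacheData(true).build();
            cache.start();
        }
    
        /** Returns the cached data for a path, or null if the node does not exist. */
        public byte[] get(String zPath) {
            ChildData data = cache.getCurrentData(zPath);
            return data == null ? null : data.getData();
        }
    
        @Override
        public void close() {
            cache.close();
        }
    }
    ```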
   
   Here is the TreeCacheExample code:
   ```java
   
   /**
    * Licensed to the Apache Software Foundation (ASF) under one
    * or more contributor license agreements.  See the NOTICE file
    * distributed with this work for additional information
    * regarding copyright ownership.  The ASF licenses this file
    * to you under the Apache License, Version 2.0 (the
    * "License"); you may not use this file except in compliance
    * with the License.  You may obtain a copy of the License at
    *
    *   http://www.apache.org/licenses/LICENSE-2.0
    *
    * Unless required by applicable law or agreed to in writing,
    * software distributed under the License is distributed on an
    * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    * KIND, either express or implied.  See the License for the
    * specific language governing permissions and limitations
    * under the License.
    */
   package cache;
   
   import framework.CreateClientExamples;
   import org.apache.curator.framework.CuratorFramework;
   import org.apache.curator.framework.recipes.cache.TreeCache;
   import java.io.BufferedReader;
   import java.io.InputStreamReader;
   
   public class TreeCacheExample
   {
       public static void main(String[] args) throws Exception
       {
            // create a client using the curator-examples helper
            CuratorFramework client = CreateClientExamples.createSimple("127.0.0.1:2181");
            // log any background errors the framework reports
            client.getUnhandledErrorListenable().addListener((message, e) -> {
                System.err.println("error=" + message);
                e.printStackTrace();
            });
            // log connection state transitions (e.g. CONNECTED, SUSPENDED, LOST)
            client.getConnectionStateListenable().addListener((c, newState) -> {
                System.out.println("state=" + newState);
            });
            client.start();
   
            // watch the whole tree from the root; setCacheData(false) tracks
            // structure and events without retaining node data in memory
            TreeCache cache = TreeCache.newBuilder(client, "/").setCacheData(false).build();
            // a single listener receives an event for every change under the root
            cache.getListenable().addListener((c, event) -> {
               if ( event.getData() != null )
               {
                   System.out.println("type=" + event.getType() + " path=" + event.getData().getPath());
               }
               else
               {
                   System.out.println("type=" + event.getType());
               }
           });
           cache.start();
   
            // block until the user presses enter, printing events as they arrive
            BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
           in.readLine();
       }
   }
   ```
   
   Here is some sample output while running ingest in Uno:
   
    ```
    state=CONNECTED
   type=NODE_ADDED path=/
   type=NODE_ADDED path=/accumulo
   type=NODE_ADDED path=/tracers
   type=NODE_ADDED path=/zookeeper
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c
   type=NODE_ADDED path=/accumulo/instances
   type=NODE_ADDED path=/tracers/trace-0000000000
   type=NODE_ADDED path=/zookeeper/quota
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/bulk_failed_copyq
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/config
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/dead
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/fate
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/gc
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/hdfs_reservations
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/masters
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/monitor
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/namespaces
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/next_file
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/problems
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/recovery
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/replication
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/root_tablet
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/table_locks
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/tables
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/tservers
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/users
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/wals
   type=NODE_ADDED path=/accumulo/instances/uno
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/bulk_failed_copyq/locks
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/dead/tservers
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/gc/lock
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/masters/goal_state
    type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/masters/lock
    ```
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services