Posted to notifications@accumulo.apache.org by "keith-turner (via GitHub)" <gi...@apache.org> on 2023/03/21 17:54:40 UTC

[GitHub] [accumulo] keith-turner commented on pull request #3220: Add ondemand table state

keith-turner commented on PR #3220:
URL: https://github.com/apache/accumulo/pull/3220#issuecomment-1478347120

   I was looking at dlmarion/accumulo#37 and I like the on demand column.  Looking at it, I was thinking that for an ondemand table the tablet location cache will need to cache tablets both with and without locations.  This is so that eventual scans can work against tablets without a location, like the changes in #3143 for offline tables.
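   
   As a rough illustration of a cache that holds tablets both with and without a location, here is a minimal sketch.  Everything in it is an assumption made for illustration: the names `TabletCacheSketch` and `CachedTablet`, keying entries by end row, and using plain strings for rows and locations.  It ignores the last tablet's null end row and is not the actual cache code.
   
   ```java
   import java.util.Map;
   import java.util.Optional;
   import java.util.TreeMap;
   
   public class TabletCacheSketch {
   
     // Hypothetical cache entry.  An empty location means the tablet is unhosted.
     static final class CachedTablet {
       final String endRow;
       final Optional<String> location;
   
       CachedTablet(String endRow, Optional<String> location) {
         this.endRow = endRow;
         this.location = location;
       }
     }
   
     // Entries are keyed by tablet end row (an assumption of this sketch).
     private final TreeMap<String,CachedTablet> byEndRow = new TreeMap<>();
   
     // Caches a tablet; pass null as the location for a tablet that is not hosted.
     void put(String endRow, String locationOrNull) {
       byEndRow.put(endRow, new CachedTablet(endRow, Optional.ofNullable(locationOrNull)));
     }
   
     // Returns the cached tablet with the smallest end row >= row, if any.
     Optional<CachedTablet> lookup(String row) {
       Map.Entry<String,CachedTablet> entry = byEndRow.ceilingEntry(row);
       return entry == null ? Optional.empty() : Optional.of(entry.getValue());
     }
   }
   ```
   
   With entries like these, a lookup can tell "cached but unhosted" apart from "not cached at all", which is the distinction the scans described above would rely on.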
   
   If the tablet location cache will eventually hold tablets without a location, then we could possibly leverage that.  Below is some pseudocode I wrote to help me explore this concept and think through it at a high level.
   
   ```java
    
    interface ClientLoadRequestProcessor {
          /**
           * Makes a call to one or more managers to load tablets for an ondemand table, knows how to partition 
           * tablets to managers.  May ignore extents it was recently asked to load.
           */
          void  loadTablets(Set<KeyExtent> extents); 
    }
    
    
    interface TabletLocator {
         /**
          * Maps ranges to tablets, partitioning the tablets into hosted and unhosted sets.  Unhosted means the 
          * tablet does not have a location.
          */
         List<Range> locateTablets(List<Range> ranges, Map<...> hostedTablets, Set<KeyExtent> unhostedTablets);
         
         void invalidateExtents(Set<KeyExtent> extents);
    }
    
    
    
    /* Maybe this code would be in the tablet locator impl or batch scanner impl, not sure of the best way to
     * organize the code ATM. I am slightly leaning towards putting it in TabletLocator like you did in 37, but
     * pulling the code that makes load requests to managers out of the tablet locator.
     */
    class XYZ {
    
        // The following variables should hang off the client context, just putting them here to make things shorter
        ClientLoadRequestProcessor tabletLoader;
        TabletLocator locator;
    
        // Maps all given ranges to tablets with locations.  This would be used by batch scanners doing
        // immediate scans.
        private Map<...> lookupLocations(List<Range> ranges) {
          while(true){
            var hosted = new HashMap<...>();
            var unhosted = new HashSet<KeyExtent>();
            
            locator.locateTablets(ranges, hosted, unhosted);
        
            if(unhosted.isEmpty()) {
               // all ranges were mapped to tablets with a location
               return hosted;
            }
            
            var tableState = getTableState();
            
            if(tableState == ONDEMAND) {
               tabletLoader.loadTablets(unhosted);
            } else if(tableState != ONLINE) {
               // if table is offline or deleted then tablets will never come online.
               // TODO invalidate cache if table deleted
               throw new Exception();
            }
   
            // TODO sleep with backoff
   
            // this will force the cache to reread these from the metadata table on the next request 
            locator.invalidateExtents(unhosted); 
          }
        }
    }
   
   ```
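   
   To make the retry loop above concrete, here is a minimal runnable sketch of it with the "sleep with backoff" TODO filled in as a capped exponential backoff.  Everything here is an assumption for illustration: `String` stands in for both `Range` and `KeyExtent`, the `Locator`, `Loader`, and `TableState` types are simplified stand-ins for the interfaces sketched above, and the doubling backoff policy is just one possible choice.
   
   ```java
   import java.util.HashMap;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Map;
   import java.util.Set;
   import java.util.function.Supplier;
   
   public class OnDemandLookupSketch {
   
     // Hypothetical stand-in for the table states referenced in the pseudocode.
     enum TableState { ONLINE, ONDEMAND, OFFLINE }
   
     // Simplified stand-in for TabletLocator; a String plays the role of a range and an extent.
     interface Locator {
       void locateTablets(List<String> ranges, Map<String,String> hosted, Set<String> unhosted);
   
       void invalidateExtents(Set<String> extents);
     }
   
     // Simplified stand-in for ClientLoadRequestProcessor.
     interface Loader {
       void loadTablets(Set<String> extents);
     }
   
     // Loops until every range maps to a tablet with a location, requesting loads for ondemand tables.
     static Map<String,String> lookupLocations(List<String> ranges, Locator locator, Loader loader,
         Supplier<TableState> tableState, long initialSleepMs, long maxSleepMs) {
       long sleepMs = initialSleepMs;
       while (true) {
         Map<String,String> hosted = new HashMap<>();
         Set<String> unhosted = new HashSet<>();
         locator.locateTablets(ranges, hosted, unhosted);
   
         if (unhosted.isEmpty()) {
           // all ranges were mapped to tablets with a location
           return hosted;
         }
   
         TableState state = tableState.get();
         if (state == TableState.ONDEMAND) {
           loader.loadTablets(unhosted);
         } else if (state != TableState.ONLINE) {
           // an offline table will never get locations, so stop retrying
           throw new IllegalStateException("table state is " + state);
         }
   
         try {
           Thread.sleep(sleepMs);
         } catch (InterruptedException e) {
           Thread.currentThread().interrupt();
           throw new IllegalStateException("interrupted while waiting for tablets to load", e);
         }
         // double the wait each round, up to the cap
         sleepMs = Math.min(maxSleepMs, sleepMs * 2);
   
         // force the cache to reread these extents on the next pass
         locator.invalidateExtents(unhosted);
       }
     }
   }
   ```
   
   Passing the table state as a `Supplier` means it is re-read on every pass, so a table taken offline while the client is waiting ends the loop instead of spinning forever.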
   

