You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/06/08 22:54:18 UTC

[GitHub] [accumulo] keith-turner opened a new pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

keith-turner opened a new pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152


   fixes #1974


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648637616



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -87,14 +99,23 @@
     private TableId tableId;
     private ReadConsistency readConsistency = ReadConsistency.IMMEDIATE;
     private AccumuloClient _client;
+    private Collection<KeyExtent> extents;
 
     Builder(AccumuloClient client) {
       this._client = client;
     }
 
     @Override
     public TabletsMetadata build() {
-      Preconditions.checkState((level == null) != (table == null),
+      if (extents != null) {
+        // its expected that things are not set for scanning a single range, so check that. It also
+        // expected that checkConsistency is not set as this check can not be done for
+        // non-contiguous tablets.

Review comment:
       I didn't understand this check before, but now that you've written it out, I think a more concise way to put this might be something like:
   
   ```suggestion
           // setting multiple extents with forTablets(extents) is mutually exclusive with these single-tablet options
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner merged pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner merged pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648461139



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -144,8 +164,24 @@ private TabletsMetadata buildNonRoot(AccumuloClient client) {
       }
     }
 
+    private void configureColumns(ScannerBase scanner) {
+      for (Text fam : families) {
+        scanner.fetchColumnFamily(fam);
+      }
+
+      for (ColumnFQ col : qualifiers) {
+        col.fetch(scanner);
+      }
+
+      if (families.isEmpty() && qualifiers.isEmpty()) {
+        fetchedCols = EnumSet.allOf(ColumnType.class);
+      }
+    }
+
     @Override
     public Options checkConsistency() {
+      Preconditions.checkState(extents == null,

Review comment:
       
   f017626




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648457449



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual, LocationType lt) {
     location = new Location(val, qual, lt);
   }
 
+  static Iterable<TabletMetadata> convert(BatchScanner input, EnumSet<ColumnType> fetchedColumns,
+      boolean buildKeyValueMap) {
+
+    IteratorSetting iterSetting = new IteratorSetting(100, WholeRowIterator.class);
+    input.addScanIterator(iterSetting);

Review comment:
       > It seems like a bad idea to put these inside this method, as they mutate BatchScanner, which could be re-used by the caller. 
   
   Yeah that was awful.  I was following the pattern of another existing method, that had the same bad pattern.  I inlined both methods in 887a91b




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r647929478



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual, LocationType lt) {
     location = new Location(val, qual, lt);
   }
 
+  static Iterable<TabletMetadata> convert(BatchScanner input, EnumSet<ColumnType> fetchedColumns,
+      boolean buildKeyValueMap) {
+
+    IteratorSetting iterSetting = new IteratorSetting(100, WholeRowIterator.class);
+    input.addScanIterator(iterSetting);
+
+    Supplier<Iterator<TabletMetadata>> iterFactory = () -> {
+      return Iterators.transform(input.iterator(), entry -> {
+        try {
+          return convertRow(
+              WholeRowIterator.decodeRow(entry.getKey(), entry.getValue()).entrySet().iterator(),
+              fetchedColumns, buildKeyValueMap);
+        } catch (IOException e) {
+          throw new RuntimeException(e);
+        }
+      });
+    };
+
+    return () -> iterFactory.get();

Review comment:
       You don't need the Supplier as an intermediate. You can create an Iterable directly. You also don't need the variable, because you can just return it right away.
   
   Also, there is an UncheckedIOException to wrap IOException as a RTE.
   
   ```suggestion
       return () -> Iterators.transform(input.iterator(), entry -> {
         try {
           return convertRow(
               WholeRowIterator.decodeRow(entry.getKey(), entry.getValue()).entrySet().iterator(),
               fetchedColumns, buildKeyValueMap);
         } catch (IOException e) {
           throw new UncheckedIOException(e);
         }
       });
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -144,8 +164,24 @@ private TabletsMetadata buildNonRoot(AccumuloClient client) {
       }
     }
 
+    private void configureColumns(ScannerBase scanner) {
+      for (Text fam : families) {
+        scanner.fetchColumnFamily(fam);
+      }
+
+      for (ColumnFQ col : qualifiers) {
+        col.fetch(scanner);
+      }
+

Review comment:
       ```suggestion
         families.forEach(scanner::fetchColumnFamily);
         qualifiers.forEach(col -> col.fetch(scanner));
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -306,6 +355,13 @@ public Options readConsistency(ReadConsistency readConsistency) {
      */
     Options forTablet(KeyExtent extent);
 
+    /**
+     * Get the tablet metadata for the given extents. This will find tablets based on end row, so
+     * its possible the prev rows could differ for the tablets returned. If this matters then it

Review comment:
       "its" -> "it's" ("it is")
   
   ```suggestion
        * it's possible the prev rows could differ for the tablets returned. If this matters, then it
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -104,6 +115,25 @@ public TabletsMetadata build() {
       }
     }
 
+    private TabletsMetadata buildExtents(AccumuloClient client) {
+
+      try {
+        BatchScanner scanner = client.createBatchScanner(level.metaTable(), Authorizations.EMPTY);
+
+        List<Range> ranges =
+            extents.stream().map(e -> e.toMetaRange()).collect(Collectors.toList());

Review comment:
       3 possible changes here. The first is `KeyExtent::toMetaRange` for the map function. The other two changes are attempts to make this line shorter for readability. Could use `var` here, or static import `toList()`. I did both below, but it's up to you what you're comfortable with. I think it's worth trying to get it to not wrap, though.
   
   ```suggestion
           var ranges = extents.stream().map(KeyExtent::toMetaRange).collect(toList());
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
       return this;
     }
 
+    @Override
+    public Options forTablets(Collection<KeyExtent> extents) {
+      if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+          .allMatch(dl -> dl == DataLevel.USER)) {

Review comment:
       Instead of `! allMatch`, you can just use `anyMatch`:
   
   ```suggestion
         if (extents.stream().map(e -> DataLevel.of(e.tableId()))
             .anyMatch(dl -> dl == DataLevel.USER)) {
   ```
   
   You can also split up the two steps in the map function to use method references. Sometimes this is easier to understand:
   
   ```suggestion
         if (extents.stream().map(KeyExtent::tableId).map(DataLevel::of)
             .anyMatch(dl -> dl == DataLevel.USER)) {
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -87,13 +91,20 @@
     private TableId tableId;
     private ReadConsistency readConsistency = ReadConsistency.IMMEDIATE;
     private AccumuloClient _client;
+    private Collection<KeyExtent> extents;
 
     Builder(AccumuloClient client) {
       this._client = client;
     }
 
     @Override
     public TabletsMetadata build() {
+      if (extents != null) {
+        Preconditions.checkState(
+            range == null && table == null && level == DataLevel.USER && !checkConsistency);
+        return buildExtents(_client);
+      }

Review comment:
       This is doing a lot of state checking, but it's not clear what it's checking for. A comment could go a long way here.

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -144,8 +164,24 @@ private TabletsMetadata buildNonRoot(AccumuloClient client) {
       }
     }
 
+    private void configureColumns(ScannerBase scanner) {
+      for (Text fam : families) {
+        scanner.fetchColumnFamily(fam);
+      }
+
+      for (ColumnFQ col : qualifiers) {
+        col.fetch(scanner);
+      }
+
+      if (families.isEmpty() && qualifiers.isEmpty()) {
+        fetchedCols = EnumSet.allOf(ColumnType.class);
+      }
+    }
+
     @Override
     public Options checkConsistency() {
+      Preconditions.checkState(extents == null,

Review comment:
       I like to statically import Preconditions.checkState (or Preconditions.checkArgument). It can help with readability.

##########
File path: server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionFinalizer.java
##########
@@ -148,16 +150,22 @@ private void processPending() {
 
         List<ExternalCompactionId> statusesToDelete = new ArrayList<>();
 
+        Map<KeyExtent,TabletMetadata> tabletsMetadata = new HashMap<>();
+
+        List<KeyExtent> extents =
+            batch.stream().map(ecfs -> ecfs.getExtent()).collect(Collectors.toList());
+
+        try (TabletsMetadata tablets = context.getAmple().readTablets().forTablets(extents)
+            .fetch(ColumnType.LOCATION, ColumnType.PREV_ROW, ColumnType.ECOMP).build()) {
+          tablets.forEach(tm -> tabletsMetadata.put(tm.getExtent(), tm));
+        }

Review comment:
       Can stream directly to a map using a collector (I did a static import of `Function.identity()` and `Collectors.toMap()` to shorten the line for readability):
   
   ```suggestion
           Map<KeyExtent,TabletMetadata> tabletsMetadata;
           var extents = batch.stream().map(ExternalCompactionFinalState::getExtent).collect(toList());
           try (TabletsMetadata tablets = context.getAmple().readTablets().forTablets(extents)
               .fetch(ColumnType.LOCATION, ColumnType.PREV_ROW, ColumnType.ECOMP).build()) {
             tabletsMetadata = tablets.stream().collect(toMap(TabletMetadata::getExtent, identity()));
           }
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual, LocationType lt) {
     location = new Location(val, qual, lt);
   }
 
+  static Iterable<TabletMetadata> convert(BatchScanner input, EnumSet<ColumnType> fetchedColumns,
+      boolean buildKeyValueMap) {
+
+    IteratorSetting iterSetting = new IteratorSetting(100, WholeRowIterator.class);
+    input.addScanIterator(iterSetting);

Review comment:
       It seems like a bad idea to put these inside this method, as they mutate BatchScanner, which could be re-used by the caller. This requires a significant amount of coordination between the caller and this method's implementation, such that it probably doesn't make sense to have this method. Since there's only one caller, it could be inline'd... or at least made local and private for the one caller.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648439120



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
       return this;
     }
 
+    @Override
+    public Options forTablets(Collection<KeyExtent> extents) {
+      if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+          .allMatch(dl -> dl == DataLevel.USER)) {

Review comment:
       These are not equivalent. Wrote the little test below.
   
   ```java
   	var dlset = List.of(DataLevel.USER, DataLevel.USER, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> dl == DataLevel.USER));
   ```  
   
   prints
   
   ```
   false
   true
   ```
   
   The following pattern is equivalent and uses anyMatch.  I think it makes the code longer so not sure its worthwhile.  Also not too worried about efficiency because the expected case is all USER, so normally it will iterate through all with anyMatch or allMatch.
   
   ```java
   	EnumSet<DataLevel> unsupportedLevels = EnumSet.complementOf(EnumSet.of(DataLevel.USER));
   	
   	var dlset = List.of(DataLevel.USER, DataLevel.USER, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> unsupportedLevels.contains(dl)));
   	
   	dlset = List.of(DataLevel.USER, DataLevel.ROOT, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> unsupportedLevels.contains(dl)));
   ```
   
   This prints 
   
   ```
   false
   false
   true
   true
   ```
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648457449



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual, LocationType lt) {
     location = new Location(val, qual, lt);
   }
 
+  static Iterable<TabletMetadata> convert(BatchScanner input, EnumSet<ColumnType> fetchedColumns,
+      boolean buildKeyValueMap) {
+
+    IteratorSetting iterSetting = new IteratorSetting(100, WholeRowIterator.class);
+    input.addScanIterator(iterSetting);

Review comment:
       > It seems like a bad idea to put these inside this method, as they mutate BatchScanner, which could be re-used by the caller. 
   
   Yeah that was awful.  I was following the pattern of another existing method, that had the same bad pattern.  I inlined both methods in 887a91b.  I think it made the code much better.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648443273



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
       return this;
     }
 
+    @Override
+    public Options forTablets(Collection<KeyExtent> extents) {
+      if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+          .allMatch(dl -> dl == DataLevel.USER)) {

Review comment:
       I thought of something shorter that uses anyMatch pushing that soon.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648450096



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -87,13 +91,20 @@
     private TableId tableId;
     private ReadConsistency readConsistency = ReadConsistency.IMMEDIATE;
     private AccumuloClient _client;
+    private Collection<KeyExtent> extents;
 
     Builder(AccumuloClient client) {
       this._client = client;
     }
 
     @Override
     public TabletsMetadata build() {
+      if (extents != null) {
+        Preconditions.checkState(
+            range == null && table == null && level == DataLevel.USER && !checkConsistency);
+        return buildExtents(_client);
+      }

Review comment:
       
   adde comment in 52b75b1




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648446469



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
       return this;
     }
 
+    @Override
+    public Options forTablets(Collection<KeyExtent> extents) {
+      if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+          .allMatch(dl -> dl == DataLevel.USER)) {

Review comment:
       Changed in 858ec8f




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2152: Adds Ample support for tablets and uses this in compaction finalizer

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r648439120



##########
File path: core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
       return this;
     }
 
+    @Override
+    public Options forTablets(Collection<KeyExtent> extents) {
+      if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+          .allMatch(dl -> dl == DataLevel.USER)) {

Review comment:
       These are not equivalent. Wrote the little test below.
   
   ```java
   	var dlset = List.of(DataLevel.USER, DataLevel.USER, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> dl == DataLevel.USER));
   ```  
   
   prints
   
   ```
   false
   true
   ```
   
   The following pattern is equivalent and uses anyMatch.  I think it makes the code longer so not sure its worthwhile.  Also not too worried about efficiency because the expected case is all USER, so normally it will iterator through all with anyMatch or allMatch.
   
   ```java
   	EnumSet<DataLevel> unsupportedLevels = EnumSet.complementOf(EnumSet.of(DataLevel.USER));
   	
   	var dlset = List.of(DataLevel.USER, DataLevel.USER, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> unsupportedLevels.contains(dl)));
   	
   	dlset = List.of(DataLevel.USER, DataLevel.ROOT, DataLevel.USER);
   	System.out.println(!dlset.stream().allMatch(dl -> dl == DataLevel.USER));
   	System.out.println(dlset.stream().anyMatch(dl -> unsupportedLevels.contains(dl)));
   ```
   
   This prints 
   
   ```
   false
   false
   true
   true
   ```
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org