You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/05/19 18:55:00 UTC

[GitHub] [accumulo] milleruntime opened a new pull request #1614: Create max tablets property in new bulk import

milleruntime opened a new pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614


   * Creates the master.bulk.max.tablets property to allow restricting the
   number of tablets in a single bulk import


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428062817



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -149,6 +150,11 @@ public void load()
       if (mappings.isEmpty())
         throw new IllegalArgumentException("Attempted to import zero files from " + srcPath);
 
+      long tabletMaxSize = conf.getCount(Property.MASTER_BULK_MAX_TABLETS);
+      if (tabletMaxSize > 0 && mappings.keySet().size() > tabletMaxSize)

Review comment:
       Yes that is correct.  I am not sure which is better but based on the original description in #1559, I took it as we want to prevent files from mapping to too many tablets.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637796852


   > Ah I see. If that is the case, I should probably add the check for max tablets on the server side as well. I could see how 1 tablet could easily get expanded beyond the limit.
   
   If its possible, then I think the best place to do it on the servers side would be `PrepBulkImport.checkForMerge()` because this is before any files have been loaded to any tablets.  I am not sure if its possible though, because at that point I think the data is grouped by range like `Map<Range, List<File>>`.  To easily do the check we need `Map<File, List<Range>>`. 
   
   Another option if that does not work is when Range.TABLE is seen AND maxTablets > 0, then we actually try to resolve the range to tablets.   However this kinda defeats the purpose of Range.TABLE as it is for the case when user has the most knowledge and Accumulo can do the least work in the fastest manner.
   
   Not sure what is best, just trying to think through the options.  Need to look at the code some more too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r433463825



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
+
+      // test max tablets hit while inspecting bulk files
+      var thrown = assertThrows(RuntimeException.class, () -> testBulkFile(false, false));
+      var c = thrown.getCause();
+      assertTrue("Wrong exception: " + c, c instanceof ExecutionException);
+      assertTrue("Wrong exception: " + c.getCause(),
+          c.getCause() instanceof IllegalArgumentException);
+

Review comment:
       > but think it should also be mentioned on the loadFileTo methods, with the startRow and endRow parameters.
   
   For the loadFileTo method, its javadoc could say that how the startRow and endRow are interpreted depends on the RangeType (with a javadoc link to RangeType).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637638048


   @keith-turner I added new data for the max tablet test but now I am getting different tablet counts when using the LoadPlan.  I can't figure out why.  The changes i just pushed up in 3ff589a don't throw an IllegalArgumentException when using the LoadPlan.  Here are the counts I get:
   ```
   2020-06-02T11:35:34,990 [bulk.BulkImport] DEBUG: The file f3.rf mapped to 1 tablets.
   2020-06-02T11:35:34,991 [bulk.BulkImport] DEBUG: The file f4.rf mapped to 2 tablets.
   2020-06-02T11:35:34,991 [bulk.BulkImport] DEBUG: The file f2.rf mapped to 1 tablets.
   2020-06-02T11:35:34,991 [bulk.BulkImport] DEBUG: The file f1.rf mapped to 1 tablets.
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631509987


   > Would there be any use for this being able to be overridden on a table basis? 
   
   That is a good question.  I hadn't thought about that level of granularity but it would make sense as a next step.  And the bulk import API does operate on a per table basis so that makes sense too.  I will look into it as I am not sure how that would work with the implementation in Master.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428938795



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -186,18 +186,26 @@ public void testSingleTabletSingleFileOffline() throws Exception {
 
   @Test
   public void testMaxTablets() throws Exception {
-    String maxTablets = "0";
+    // test max tablets hit while inspecting bulk files
     try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-      maxTablets = client.instanceOperations().getSystemConfiguration()
-          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
-      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
       testBulkFile(false, false);
-      fail("Expected IllegalArgumentException for " + Property.MASTER_BULK_MAX_TABLETS);
-    } catch (IllegalArgumentException e) {} finally {
-      try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-        client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(),
-            maxTablets);
-      }
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
+    }
+    // test max tablets hit using load plan
+    try {
+      testBulkFile(false, true);
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
     }

Review comment:
       I recently learned about this new feature in Junit.  It also returns the expected exception if you want to check the message.
   
   ```suggestion
      Assert.assertThrows(IllegalArgumentException.class, () -> testBulkFile(false, true));
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428935251



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);

Review comment:
       I am not sure if the ExecutionException provides valuable information, but definitely want to wrap the IllegalArgumentException.  I would err on the side of caution with omiting the excution exception in that even if does not currently provide anything useful, it may in future versions of Java. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428850037



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -133,6 +134,14 @@ public void load()
 
     SortedMap<KeyExtent,Bulk.Files> mappings;
     TableOperationsImpl tableOps = new TableOperationsImpl(context);
+
+    int maxTablets = 0;
+    for (var prop : tableOps.getProperties(tableName)) {
+      if (prop.getKey().equals(Property.TABLE_BULK_MAX_TABLETS.getKey())) {
+        maxTablets = Integer.parseInt(prop.getValue());
+        break;
+      }
+    }

Review comment:
       I couldn't find a cleaner way to get a single table property from the client side...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r429215865



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);

Review comment:
       Good point.  Better to error on the side of too much information.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637757509


   > I am thinking it will on the server side. So the client passes that single range and then it gets expanded into one or more tablets on the servers side.
   
   Ah I see.  If that is the case, I should probably add the check for max tablets on the server side as well.  I could see how 1 tablet could easily get expanded beyond the limit.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637719740


   > @milleruntime this is the code that I think is just serving as a pass through on the client side
   > 
   > https://github.com/apache/accumulo/blob/3ff589ac91ac668b433e78fdb3f6a735d5da7091/core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java#L461
   
   OK so it looks like a file can only ever have 1 tablet when using ```RangeType.TABLE```.  Is this as intended?  Is this how Accumulo creates files when splitting?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
EdColeman commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428035541



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +184,23 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    String maxTablets = "0";
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      maxTablets = client.instanceOperations().getSystemConfiguration()
+          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
+      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");
+      testBulkFile(false, false);
+      fail("Expected IllegalArgumentException for " + Property.MASTER_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {} finally {
+      try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {

Review comment:
       Maybe just a style comment - but I don't think this needs to be in a finally block.  If exception is not thrown, it will fail - otherwise catch (and ignore the exception) and then just continue.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428065237



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +184,23 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    String maxTablets = "0";
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      maxTablets = client.instanceOperations().getSystemConfiguration()
+          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
+      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");

Review comment:
       I am not completely sure, but these two lines may not work as expected sometimes.  What I think may be able to happen is that the property is set on tserver A by the first line and then the second line reads props from tserver B.  If server B does not clear zoocache, then it could return a stale copy w/o the prop set on the first line.  I am not sure if the client side code to get props forces a prop cache clear.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428065874



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +184,23 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    String maxTablets = "0";
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      maxTablets = client.instanceOperations().getSystemConfiguration()
+          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
+      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");

Review comment:
       If it were a table prop, then table could be created with the prop and would not need to clear in finally.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631512130


   I like the suggestion for a table prop, that aligns more closely with the user experience rather than how its implemented.
   
   The check on the client side is nice for fail fast, but we may want a check on the server side to ensure correctness.  However the server side check is really hard to test.
   
   Are there any plans to make the old bulk import code respect this property?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r434774157



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
+
+      // test max tablets hit while inspecting bulk files
+      var thrown = assertThrows(RuntimeException.class, () -> testBulkFile(false, false));
+      var c = thrown.getCause();
+      assertTrue("Wrong exception: " + c, c instanceof ExecutionException);
+      assertTrue("Wrong exception: " + c.getCause(),
+          c.getCause() instanceof IllegalArgumentException);
+

Review comment:
       Opened up #1619 so we don't forget




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r429215962



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -186,18 +186,26 @@ public void testSingleTabletSingleFileOffline() throws Exception {
 
   @Test
   public void testMaxTablets() throws Exception {
-    String maxTablets = "0";
+    // test max tablets hit while inspecting bulk files
     try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-      maxTablets = client.instanceOperations().getSystemConfiguration()
-          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
-      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
       testBulkFile(false, false);
-      fail("Expected IllegalArgumentException for " + Property.MASTER_BULK_MAX_TABLETS);
-    } catch (IllegalArgumentException e) {} finally {
-      try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-        client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(),
-            maxTablets);
-      }
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
+    }
+    // test max tablets hit using load plan
+    try {
+      testBulkFile(false, true);
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
     }

Review comment:
       Cool thanks.  I am not sure what happened but Github isn't letting me apply your suggestion.  I will just change it manually.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r433364473



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
+
+      // test max tablets hit while inspecting bulk files
+      var thrown = assertThrows(RuntimeException.class, () -> testBulkFile(false, false));
+      var c = thrown.getCause();
+      assertTrue("Wrong exception: " + c, c instanceof ExecutionException);
+      assertTrue("Wrong exception: " + c.getCause(),
+          c.getCause() instanceof IllegalArgumentException);
+

Review comment:
       I have been looking at BulkNewIT to see how we could test this scenario and I think there is an issue with the test data.  This line doesn't make sense to me and seems wrong: https://github.com/milleruntime/accumulo/blob/f71a2d1ba175dbb74a8f9e26151b778c270eaa2c/test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java#L315
   
   This also made me wonder about exclusivity when creating the load plan.  We don't mention whether the start and end rows are inclusive/exclusive on the load API [here](https://github.com/apache/accumulo/blob/3fd5cad92f9b63ac19e4466f3f2d5237b905262c/core/src/main/java/org/apache/accumulo/core/data/LoadPlan.java#L187).  I was wondering this myself and eventually found it mentioned in the ```RangeType``` enum but think it should also be mentioned on the ```loadFileTo``` methods, with the ```startRow``` and ```endRow``` parameters.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-639469949


   @keith-turner want to take a last look at this?  I think it is ready to merge.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637704826


   > @keith-turner I added new data for the max tablet test but now I am getting different tablet counts when using the LoadPlan. I can't figure out why. The changes i just pushed up in 3ff589a don't throw an IllegalArgumentException when using the LoadPlan
   
   I took a look at the code quickly. I think what is happening is that for the RangeType.TABLE it does nothing for this on the client side and maps the range to tablets on the server side.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631536014


   > Are there any plans to make the old bulk import code respect this property?
   
   I made some comments on #1560 about this as I think it would be worth it but should be discussed since it would incur adding a new feature to a deprecated API.  The implementation in 2.1 was a priority first though.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner edited a comment on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner edited a comment on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631512130


   I like the suggestion for a table prop, that aligns more closely with the user experience rather than how its implemented.
   
   The check on the client side is nice for fail fast, but we may want a check on the server side to ensure correctness.  However the server side check is really hard to test, if there is a client side check.
   
   Are there any plans to make the old bulk import code respect this property?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-639066819


   > @milleruntime the following branch has a change I was experimenting with for adding a server side check... its the top commit
   > 
   > https://github.com/keith-turner/accumulo/tree/accumulo-1614
   
   Thanks.  It took a bit more work than I expected but I think I got the server side (master FATE) check working.   I also added a note to the property description. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428933667



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);

Review comment:
       This could lead to a loss of information for someone debugging.  Like they see a stack trace from a background thread in Accumulo, but they don't know anything about the calling code in the foreground thread.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r431955808



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);
       }

Review comment:
       Sorry for the delay @milleruntime.  This fell off my radar.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
EdColeman commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631495941


   Would there be any use for this being able to be overridden on a table basis?  My thought would be that "normally" there would be a system limit, but because of some external factor there was a need to allow this to be on a case by case basis.  It could go either way - say, I know this is not optimal, but for this table I want to allow more files, or the other way would be that the system allows X, but for this table I want it to be less because at X bad things happen and I'd rather fail.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-632282696


   @EdColeman @keith-turner I made some improvements based on your feedback.  I had to move the check inside the computeMapping methods so it would fail fast on a single file.  This also lead me to clean up the exception as it comes out of a future so the user will just get a single IllegalArgumentException.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428066247



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -149,6 +150,11 @@ public void load()
       if (mappings.isEmpty())
         throw new IllegalArgumentException("Attempted to import zero files from " + srcPath);
 
+      long tabletMaxSize = conf.getCount(Property.MASTER_BULK_MAX_TABLETS);
+      if (tabletMaxSize > 0 && mappings.keySet().size() > tabletMaxSize)

Review comment:
       I guess the danger with this check is stopping ingest due to false positives?  Like if I have 100 tablets and 10 files going to all tablets could be normal. I would rather stop 1 file going to all 100 then 10 going to all 100. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner merged pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner merged pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637745303


   > I thought this would map to 3 tablets 0666-0334, 0999-0667, 1333-1000...
   
   I am thinking it will on the server side.  So the client passes that single range and then it gets expanded into one or more tablets on the servers side.  
   
   https://github.com/apache/accumulo/blob/3ff589ac91ac668b433e78fdb3f6a735d5da7091/server/master/src/main/java/org/apache/accumulo/master/tableOps/bulkVer2/LoadFiles.java#L339


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r431159101



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);
       }

Review comment:
       See my other comment.  You OK with me keeping the exception handling as-is?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-631482122


   This PR addresses #1559 for 2.1.  Related to #1560 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637732071


   > It can correspond to one or more tablets, but must map to existing table split points.
   
   Than how come this line in my last commit only maps to 1 tablet?
   ```     
   .loadFileTo("f2.rf", RangeType.TABLE, row(333), row(1333))
   ```
   Since I used the same splits as the other tests, ```"0333 0666 0999 1333 1666 null"``` I thought this would map to 3 tablets 0666-0334, 0999-0667, 1333-1000...
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428932975



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);
       }

Review comment:
       I would wrap `e` so that stack traces are not lost and anyone getting the exceptions can trace the full code path from their code to the background thread.
   
   ```suggestion
           Throwable t = e.getCause();
           if (t instanceof IllegalArgumentException) {
             throw new IllegalArgumentException(e);
           } else
             throw new RuntimeException(e);
         }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r429355678



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -186,18 +186,26 @@ public void testSingleTabletSingleFileOffline() throws Exception {
 
   @Test
   public void testMaxTablets() throws Exception {
-    String maxTablets = "0";
+    // test max tablets hit while inspecting bulk files
     try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-      maxTablets = client.instanceOperations().getSystemConfiguration()
-          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
-      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
       testBulkFile(false, false);
-      fail("Expected IllegalArgumentException for " + Property.MASTER_BULK_MAX_TABLETS);
-    } catch (IllegalArgumentException e) {} finally {
-      try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
-        client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(),
-            maxTablets);
-      }
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
+    }
+    // test max tablets hit using load plan
+    try {
+      testBulkFile(false, true);
+      fail("Expected IllegalArgumentException for " + Property.TABLE_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {
+      // expected
     }

Review comment:
       I settled with keeping the exception handling the way it is now.  I was able to clean up the IT using assertThrows and just checked the causes manually.  I think this is as clean as it is going to get until we move to Junit5.  I did find this library which I thought was interesting: https://assertj.github.io/doc/




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428104619



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -149,6 +150,11 @@ public void load()
       if (mappings.isEmpty())
         throw new IllegalArgumentException("Attempted to import zero files from " + srcPath);
 
+      long tabletMaxSize = conf.getCount(Property.MASTER_BULK_MAX_TABLETS);
+      if (tabletMaxSize > 0 && mappings.keySet().size() > tabletMaxSize)

Review comment:
       I just looked at #1559 and saw the following sentence.  I would be in favor of preventing a single file from going to too many tablets.  If someone has a large number of tablets and each file is going to a few tablets, they are probably ok with that like you said.
   
   > It would be nice if we could establish a threshold in the bulk import process to abort when encountering a rfile that maps to more than a specified number of extents.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637722433


   > OK so it looks like a file can only ever have 1 tablet when using RangeType.TABLE. Is this as intended? Is this how Accumulo creates files when splitting?
   
   It can correspond to one or more tablets, but must map to existing table split points.
   
   https://github.com/apache/accumulo/blob/3fd5cad92f9b63ac19e4466f3f2d5237b905262c/core/src/main/java/org/apache/accumulo/core/data/LoadPlan.java#L71


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428058802



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -149,6 +150,11 @@ public void load()
       if (mappings.isEmpty())
         throw new IllegalArgumentException("Attempted to import zero files from " + srcPath);
 
+      long tabletMaxSize = conf.getCount(Property.MASTER_BULK_MAX_TABLETS);
+      if (tabletMaxSize > 0 && mappings.keySet().size() > tabletMaxSize)

Review comment:
       This check is different than the one in #1560, which ensured a single file did not go to too many tablets.  This is checking that all files do not go to too many tablets.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637707808


   @milleruntime this is the code that I think is just serving as a pass through on the client side
   
   https://github.com/apache/accumulo/blob/3ff589ac91ac668b433e78fdb3f6a735d5da7091/core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java#L461


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r429217233



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);
       }

Review comment:
       I am not sure this is better than just the Runtime.  I think this will wrap an IllegalArgument around the whole stack, which already has an IllegalArgument at the bottom.  I will play around.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428851706



##########
File path: core/src/main/java/org/apache/accumulo/core/clientImpl/bulk/BulkImport.java
##########
@@ -576,7 +587,12 @@ private Text toText(byte[] row) {
         Thread.currentThread().interrupt();
         throw new RuntimeException(e);
       } catch (ExecutionException e) {
-        throw new RuntimeException(e);
+        // clean up exception for user
+        Throwable t = e.getCause();
+        if (t instanceof IllegalArgumentException) {
+          throw (IllegalArgumentException) t;
+        } else
+          throw new RuntimeException(t);

Review comment:
       I thought it was cleaner for the user to just drop the ExecutionException coming out of the future since that is just implementation specific.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r433461885



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
+
+      // test max tablets hit while inspecting bulk files
+      var thrown = assertThrows(RuntimeException.class, () -> testBulkFile(false, false));
+      var c = thrown.getCause();
+      assertTrue("Wrong exception: " + c, c instanceof ExecutionException);
+      assertTrue("Wrong exception: " + c.getCause(),
+          c.getCause() instanceof IllegalArgumentException);
+

Review comment:
       I looked at BulkNewIT line 315,  I think what is going is that the file with a hash of h3 overlaps two tablets.  It overlaps a tablet with an end row of 1333 and another tablet with an end row of 1666.  So that file is added as expected for those two tablets.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-639933614


   @milleruntime I went ahead and merged this because you said you thought it was ready to merge and I worked up a follow on test in #1623.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#issuecomment-637899981


   @milleruntime the following branch has a change I was experimenting with for adding a server side check... its the top commit
   
   https://github.com/keith-turner/accumulo/tree/accumulo-1614


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r428054872



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +184,23 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    String maxTablets = "0";
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      maxTablets = client.instanceOperations().getSystemConfiguration()
+          .get(Property.MASTER_BULK_MAX_TABLETS.getKey());
+      client.instanceOperations().setProperty(Property.MASTER_BULK_MAX_TABLETS.getKey(), "1");
+      testBulkFile(false, false);
+      fail("Expected IllegalArgumentException for " + Property.MASTER_BULK_MAX_TABLETS);
+    } catch (IllegalArgumentException e) {} finally {
+      try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {

Review comment:
       I was thinking of the situation where an exception other than IllegalArgumentException is thrown, the property won't be changed back.  Typically this doesn't matter, but I was under the impression we are still striving towards getting the ITs to work as a standalone cluster (even if its still currently fails).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #1614: Create max tablets property in new bulk import

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #1614:
URL: https://github.com/apache/accumulo/pull/1614#discussion_r431953512



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
+      newTableConf.setProperties(props);
+      client.tableOperations().create(tableName, newTableConf);
+
+      // test max tablets hit while inspecting bulk files
+      var thrown = assertThrows(RuntimeException.class, () -> testBulkFile(false, false));
+      var c = thrown.getCause();
+      assertTrue("Wrong exception: " + c, c instanceof ExecutionException);
+      assertTrue("Wrong exception: " + c.getCause(),
+          c.getCause() instanceof IllegalArgumentException);
+

Review comment:
       Not sure if this is possible, but it would be really nice to confirm that exception message contains the offending file name.  Whenever someone runs into this error message, knowing which file caused the problem will be extremely helpful to them.
   
   If the test does not do this, I would also recommend creating multiple files.  One that exceeds the limit and few that do not.  Want to ensure in this case the troublesome file is listed in the message.

##########
File path: test/src/main/java/org/apache/accumulo/test/functional/BulkNewIT.java
##########
@@ -183,6 +186,29 @@ public void testSingleTabletSingleFileOffline() throws Exception {
     }
   }
 
+  @Test
+  public void testMaxTablets() throws Exception {
+    try (AccumuloClient client = Accumulo.newClient().from(getClientProps()).build()) {
+      tableName = "testMaxTablets_table1";
+      NewTableConfiguration newTableConf = new NewTableConfiguration();
+      // set logical time type so we can set time on bulk import
+      Map<String,String> props = new HashMap<>();
+      props.put(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");

Review comment:
       ```suggestion
        var props = Map.of(Property.TABLE_BULK_MAX_TABLETS.getKey(), "1");
   ```

##########
File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
##########
@@ -715,6 +715,8 @@
           + " perform specialized parsing of the key. "),
   TABLE_BLOOM_HASHTYPE("table.bloom.hash.type", "murmur", PropertyType.STRING,
       "The bloom filter hash type"),
+  TABLE_BULK_MAX_TABLETS("table.bulk.max.tablets", "0", PropertyType.COUNT,
+      "The maximum number of tablets allowed for one bulk import file. Value of 0 is Unlimited"),

Review comment:
       Could mention that this property is only enforced when using the new bulk import API.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org