Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/20 12:48:40 UTC

[GitHub] [iceberg] kbendick opened a new pull request #3938: Core: Disallow Namespace with null level or level with NULL BYTE or o…

kbendick opened a new pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938


   …ther empty characters
   
   In the REST catalog, we have decided to use a NULL_BYTE to delimit certain portions of a `Namespace`.
   
   To prepare for that, we should disallow any level that contains a NULL_BYTE character (technically either the deprecated `\0` or the Unicode character `\u0000`). It also doesn't make sense for a level to be null, so I've added a check for that as well.
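
A minimal sketch of the kind of check described above, in plain Java. The class name `NamespaceLevelCheck`, the method `validate`, the error messages, and the choice of exception types are all hypothetical and are not taken from the actual `Namespace` change. The sketch only illustrates rejecting a null level or a level containing U+0000.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; this is not Iceberg's Namespace class.
class NamespaceLevelCheck {
  // Finds the null character U+0000 anywhere in a string.
  private static final Predicate<String> CONTAINS_NULL_CHARACTER =
      Pattern.compile("\u0000").asPredicate();

  // Returns the levels unchanged when every level is non-null and free of U+0000.
  static String[] validate(String... levels) {
    for (String level : levels) {
      if (level == null) {
        // Assumed exception type and message.
        throw new NullPointerException("Cannot create a namespace with a null level");
      }
      if (CONTAINS_NULL_CHARACTER.test(level)) {
        // Assumed exception type and message.
        throw new IllegalArgumentException(
            "Cannot create a namespace with the null-byte character");
      }
    }
    return levels;
  }

  public static void main(String[] args) {
    System.out.println(validate("db", "schema").length);
    try {
      validate("bad\u0000level");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Running the check on every level at construction time means a delimiter character can never end up inside a level, whichever catalog later serializes the namespace.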
   
   I added tests and also tested the regular expression against a large number of patterns in a Scala REPL. I can make the regular expression simpler (just matching on `\0` or `\u0000`), but I don't think it makes sense for any namespace to have whitespace in it at all.
   
   Are there any systems where users could have a catalog table such as `hive_catalog.\\`wow who named me\\`.tbl`, where backticks are used to escape the spaces? Even if so, it seems... not that advisable.
   
   I'm happy to remove the stricter whitespace check in favor of just checking for null-byte (so that future work won't accidentally allow the null-byte character to pass into the namespace silently).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1032949032


   cc @rdblue if you could possibly merge this now that the tests are passing.




[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806185400



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       One is Unicode and one isn't. I've noticed that some of our systems complain when `\0` is used instead of the full Unicode `\u0000`, which is the preferred one.
   
   To be safe, I just included both. Let me test and remove the ASCII one if it's not needed.






[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r801010664



##########
File path: api/src/test/java/org/apache/iceberg/catalog/TestNamespace.java
##########
@@ -44,4 +45,31 @@ public void testNamespace() {
       Assertions.assertThat(namespace.level(i)).isEqualTo(levels[i]);
     }
   }
+
+  @Test
+  public void testWithNullInLevel() {
+    AssertHelpers.assertThrows(
+        "An individual level of a namespace cannot be null",
+        IllegalArgumentException.class,

Review comment:
       Test is fixed 👍 






[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r800986771



##########
File path: api/src/test/java/org/apache/iceberg/catalog/TestNamespace.java
##########
@@ -44,4 +45,31 @@ public void testNamespace() {
       Assertions.assertThat(namespace.level(i)).isEqualTo(levels[i]);
     }
   }
+
+  @Test
+  public void testWithNullInLevel() {
+    AssertHelpers.assertThrows(
+        "An individual level of a namespace cannot be null",
+        IllegalArgumentException.class,

Review comment:
       Ahh good catch. Thank you! It was previously a different precondition check.






[GitHub] [iceberg] rdblue commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r805434503



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       What's the difference between `\0` and `\u0000`?






[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806188812



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       Using just `\u0000` is sufficient, so I removed the ASCII one.
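
As a quick check of why a single alternative is enough: in Java source, the octal escape `\0` and the Unicode escape `\u0000` both denote the character U+0000, so the two regex alternatives are identical. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the PR.
class NullEscapeDemo {
  // Compares the one-character strings produced by the octal and Unicode escapes.
  static boolean sameString() {
    return "\0".equals("\u0000");
  }

  // Checks whether a pattern built only from the Unicode escape finds
  // a null character that was written with the octal escape.
  static boolean unicodePatternFindsOctal() {
    return Pattern.compile("\u0000").matcher("a\0b").find();
  }

  public static void main(String[] args) {
    System.out.println(sameString());
    System.out.println(unicodePatternFindsOctal());
  }
}
```

Both calls return `true`, which is consistent with dropping the `\0` alternative from the pattern.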






[GitHub] [iceberg] rdblue merged pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938


   




[GitHub] [iceberg] rdblue commented on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1039665773


   Thanks, @kbendick!




[GitHub] [iceberg] kbendick commented on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1017968614


   Looks like my emptiness check on namespace fields broke some of the existing tests. I'll take a look at them and then either remove that check (all that's really needed is the null-byte check) or update the tests, in case they expect the failure to happen later in the process.
   
   ```
   org.apache.iceberg.aws.glue.TestIcebergToGlueConverter > testToDatabaseNameFailure FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.aws.glue.TestIcebergToGlueConverter.testToDatabaseNameFailure(TestIcebergToGlueConverter.java:54)
   
   65 tests completed, 1 failed
   
   > Task :iceberg-aws:test FAILED
   > Task :iceberg-core:test
   
   org.apache.iceberg.jdbc.TestJdbcCatalog > testListNamespace FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.jdbc.JdbcUtil.stringToNamespace(JdbcUtil.java:80)
           at org.apache.iceberg.jdbc.JdbcCatalog.lambda$listNamespaces$4(JdbcCatalog.java:270)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
           at org.apache.iceberg.jdbc.JdbcCatalog.listNamespaces(JdbcCatalog.java:262)
           at org.apache.iceberg.catalog.SupportsNamespaces.listNamespaces(SupportsNamespaces.java:75)
           at org.apache.iceberg.jdbc.TestJdbcCatalog.testListNamespace(TestJdbcCatalog.java:478)
   ```




[GitHub] [iceberg] nastra commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
nastra commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r797517309



##########
File path: api/src/test/java/org/apache/iceberg/catalog/TestNamespace.java
##########
@@ -44,4 +45,31 @@ public void testNamespace() {
       Assertions.assertThat(namespace.level(i)).isEqualTo(levels[i]);
     }
   }
+
+  @Test
+  public void testWithNullInLevel() {
+    AssertHelpers.assertThrows(
+        "An individual level of a namespace cannot be null",
+        IllegalArgumentException.class,

Review comment:
       should be NPE
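
For context on the two exception types: a null check in the style of Guava's `Preconditions.checkNotNull` throws `NullPointerException`, while `Preconditions.checkArgument` throws `IllegalArgumentException`. The sketch below assumes the null-level check is of the first kind and uses only the JDK, with `Objects.requireNonNull` standing in for the relocated Guava call. The class name, method names, messages, and the empty-level check are hypothetical and serve only to show both exception types side by side.

```java
import java.util.Objects;

// Hypothetical demo class; not part of the PR.
class NullLevelExceptionDemo {
  static String checkLevel(String level) {
    // Null check: throws NullPointerException when level is null.
    Objects.requireNonNull(level, "level must not be null");
    // Argument check (illustrative only): throws IllegalArgumentException.
    if (level.isEmpty()) {
      throw new IllegalArgumentException("level must not be empty");
    }
    return level;
  }

  // Returns the simple name of the exception thrown for a level, or "none".
  static String exceptionFor(String level) {
    try {
      checkLevel(level);
      return "none";
    } catch (RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(exceptionFor(null));
    System.out.println(exceptionFor(""));
    System.out.println(exceptionFor("db"));
  }
}
```

A test that asserts on the exception class therefore has to match the kind of precondition actually used in the code under test.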






[GitHub] [iceberg] rdblue commented on a change in pull request #3938: [WIP] API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r790317530



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_WHITESPACE_OR_NULL_BYTE =
+      Pattern.compile("\\p{Zs}|\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       Why disallow whitespace?








[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806185400



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       One is Unicode and one isn't. I've noticed that some of our systems complain when `\0` is used instead of the full Unicode `\u0000`, which is the preferred one.
   
   To be safe, I just included both. Let me test if the unicode one (`\u0000`) is sufficient to catch the other one.

##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       Using just `\u0000` is sufficient, so I removed the ASCII one.

##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       One is Unicode and one isn't. I've noticed that some of our systems complain when `\0` is used instead of the full Unicode `\u0000`, which is the preferred one.
   
   To be safe, I just included both. Let me test and remove the ASCII one if it's not needed.

##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =

Review comment:
       That's fair. Maybe just `CONTAINS_NULL_CHARACTER`? `CONTAINS_NULL` makes it sound like we're checking for `null` itself, which this regex does not do.






[GitHub] [iceberg] kbendick edited a comment on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1017482383


   cc @rdblue re checking for null bytes when instantiating `Namespace`s.




[GitHub] [iceberg] kbendick commented on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1017482383


   cc @rdblue re checking for null bytes when instantiating `Namespaces`.




[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806233415



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =

Review comment:
       That's fair. Maybe just `CONTAINS_NULL_CHARACTER`? `CONTAINS_NULL` makes it sound like we're checking for `null` itself, which this regex does not do.






[GitHub] [iceberg] rdblue commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806203865



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =

Review comment:
       `BYTE` probably isn't correct since this is checking for the null unicode codepoint. Maybe just `CONTAINS_NULL`?








[GitHub] [iceberg] kbendick edited a comment on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1017968614


   Looks like my emptiness / null check on namespace levels broke some of the existing tests. I'll take a look at them and then either remove that check (all that's really needed is the null-byte check) or update the tests, in case they expect the failure to happen later in the process.
   
   ```
   org.apache.iceberg.aws.glue.TestIcebergToGlueConverter > testToDatabaseNameFailure FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.aws.glue.TestIcebergToGlueConverter.testToDatabaseNameFailure(TestIcebergToGlueConverter.java:54)
   
   65 tests completed, 1 failed
   
   > Task :iceberg-aws:test FAILED
   > Task :iceberg-core:test
   
   org.apache.iceberg.jdbc.TestJdbcCatalog > testListNamespace FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.jdbc.JdbcUtil.stringToNamespace(JdbcUtil.java:80)
           at org.apache.iceberg.jdbc.JdbcCatalog.lambda$listNamespaces$4(JdbcCatalog.java:270)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
           at org.apache.iceberg.jdbc.JdbcCatalog.listNamespaces(JdbcCatalog.java:262)
           at org.apache.iceberg.catalog.SupportsNamespaces.listNamespaces(SupportsNamespaces.java:75)
           at org.apache.iceberg.jdbc.TestJdbcCatalog.testListNamespace(TestJdbcCatalog.java:478)
   ```




[GitHub] [iceberg] kbendick commented on a change in pull request #3938: [WIP] API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r794144935



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_WHITESPACE_OR_NULL_BYTE =
+      Pattern.compile("\\p{Zs}|\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       I guess whitespace wouldn't make a difference if it was put in backticks? I can remove it and then just add the `\0` and `\u0000`.
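
To make the effect of the stricter pattern concrete, here is a small sketch that reuses the regular expression from the diff above. The wrapper class `WhitespaceLevelCheck` and the method `isRejected` are hypothetical. It is assumed here that backtick quoting happens at the SQL layer, so the level string itself would still contain the spaces.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical demo class wrapping the pattern shown in the diff above.
class WhitespaceLevelCheck {
  // Unicode space separators (general category Zs) or the null character U+0000.
  private static final Predicate<String> CONTAINS_WHITESPACE_OR_NULL_BYTE =
      Pattern.compile("\\p{Zs}|\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

  // True when the stricter check would refuse this level.
  static boolean isRejected(String level) {
    return CONTAINS_WHITESPACE_OR_NULL_BYTE.test(level);
  }

  public static void main(String[] args) {
    System.out.println(isRejected("wow who named me"));
    System.out.println(isRejected("tbl"));
  }
}
```

Note that `\p{Zs}` covers space separators such as the ordinary space and the no-break space, but not control characters such as tab, so it is narrower than "all whitespace".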






[GitHub] [iceberg] kbendick removed a comment on pull request #3938: [WIP] API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick removed a comment on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1017968614


   Looks like my emptiness / null check on namespace levels broke some of the existing tests. I'll take a look at them and then either remove that check (all that's really needed is the null-byte check) or update the tests, in case they expect the failure to happen later in the process.
   
   ```
   org.apache.iceberg.aws.glue.TestIcebergToGlueConverter > testToDatabaseNameFailure FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.aws.glue.TestIcebergToGlueConverter.testToDatabaseNameFailure(TestIcebergToGlueConverter.java:54)
   
   65 tests completed, 1 failed
   
   > Task :iceberg-aws:test FAILED
   > Task :iceberg-core:test
   
   org.apache.iceberg.jdbc.TestJdbcCatalog > testListNamespace FAILED
       java.lang.IllegalArgumentException: Cannot create a namespace with a null or empty level
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:142)
           at org.apache.iceberg.catalog.Namespace.of(Namespace.java:48)
           at org.apache.iceberg.jdbc.JdbcUtil.stringToNamespace(JdbcUtil.java:80)
           at org.apache.iceberg.jdbc.JdbcCatalog.lambda$listNamespaces$4(JdbcCatalog.java:270)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
           at org.apache.iceberg.jdbc.JdbcCatalog.listNamespaces(JdbcCatalog.java:262)
           at org.apache.iceberg.catalog.SupportsNamespaces.listNamespaces(SupportsNamespaces.java:75)
           at org.apache.iceberg.jdbc.TestJdbcCatalog.testListNamespace(TestJdbcCatalog.java:478)
   ```
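The trade-off described in this comment can be sketched as follows. This is a simplified, hypothetical stand-in for the validation in `Namespace.of`, not the code from the PR: `LevelCheckSketch`, `strictCheck`, and `lenientCheck` are names invented for illustration. The strict variant also rejects empty levels, which is the kind of check that can trip existing callers; the lenient variant only rejects null levels and levels containing the null character.

```java
import java.util.Arrays;

class LevelCheckSketch {
  // Strict variant (assumption): a level must be non-null, non-empty,
  // and must not contain the null character U+0000.
  static boolean strictCheck(String... levels) {
    return Arrays.stream(levels)
        .allMatch(level -> level != null && !level.isEmpty() && level.indexOf('\u0000') < 0);
  }

  // Lenient variant (assumption): empty levels are tolerated; only null
  // levels and levels containing U+0000 are rejected.
  static boolean lenientCheck(String... levels) {
    return Arrays.stream(levels)
        .allMatch(level -> level != null && level.indexOf('\u0000') < 0);
  }

  public static void main(String[] args) {
    System.out.println(strictCheck("db", ""));         // false: empty level rejected
    System.out.println(lenientCheck("db", ""));        // true: empty level tolerated
    System.out.println(lenientCheck("db", "a\0b"));    // false: null character rejected
  }
}
```

Under these assumptions, any caller that builds a namespace with an empty level passes the lenient check but fails the strict one, which matches the choice discussed above between dropping the emptiness check and updating the tests.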




[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r794925418



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_WHITESPACE_OR_NULL_BYTE =
+      Pattern.compile("\\p{Zs}|\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       Removed the whitespace check






[GitHub] [iceberg] kbendick edited a comment on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1032949032


   cc @rdblue, could you merge this now that the tests are passing? (I just rebased onto the latest master, but they should still pass.)




[GitHub] [iceberg] kbendick commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806185400



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =
+      Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();

Review comment:
       One is a Unicode escape and one isn't. I've noticed that some of our systems complain when `\0` is used instead of the full Unicode escape `\u0000`, which is the preferred form.
   
   To be safe, I included both. Let me test whether the Unicode one (`\u0000`) is sufficient to catch the other one.
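The question raised above can be checked directly in plain Java. In Java source, `"\0"` is an octal escape and `"\u0000"` is a Unicode escape, and both produce the same one-character string, so a pattern with a single alternative should match either spelling. The class name `NullCharSketch` and the constant name below are assumptions made for this sketch, not names taken from the PR.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class NullCharSketch {
  // Hypothetical constant: one alternative only, matching the character U+0000.
  static final Predicate<String> CONTAINS_NULL_CHAR =
      Pattern.compile("\u0000").asPredicate();

  // Both escapes denote the same single char, so the two literals are equal.
  static boolean escapesAreEqual() {
    return "\0".equals("\u0000");
  }

  public static void main(String[] args) {
    System.out.println(escapesAreEqual());                   // true
    System.out.println(CONTAINS_NULL_CHAR.test("ab\0cd"));   // true
    System.out.println(CONTAINS_NULL_CHAR.test("abcd"));     // false
  }
}
```

`asPredicate()` uses find semantics, so the predicate is true when the character occurs anywhere in the level, not only when the whole level equals it.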






[GitHub] [iceberg] rdblue commented on a change in pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#discussion_r806203865



##########
File path: api/src/main/java/org/apache/iceberg/catalog/Namespace.java
##########
@@ -29,6 +31,8 @@
 public class Namespace {
   private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
   private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_BYTE =

Review comment:
       `BYTE` probably isn't correct since this is checking for the null unicode codepoint. Maybe just `CONTAINS_NULL`?
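One way to see why `BYTE` is imprecise: the check operates on a Java `String`, where U+0000 is a character (one UTF-16 code unit), and how many bytes it occupies depends on the encoding chosen later. The following is a small illustration only; `NullCodePointSketch` and `encodedLength` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class NullCodePointSketch {
  // Number of bytes the given string occupies in the given charset.
  static int encodedLength(String s, Charset charset) {
    return s.getBytes(charset).length;
  }

  public static void main(String[] args) {
    String nul = "\u0000";
    System.out.println(nul.codePointAt(0));                                // 0
    System.out.println(encodedLength(nul, StandardCharsets.UTF_8));        // 1
    System.out.println(encodedLength(nul, StandardCharsets.UTF_16BE));     // 2
  }
}
```

The same character is one byte in UTF-8 and two bytes in UTF-16BE, so a name that refers to the null character or code point describes the check more accurately than one that refers to a byte.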






[GitHub] [iceberg] rdblue commented on pull request #3938: API: Disallow Namespace with null byte character or null level in it

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3938:
URL: https://github.com/apache/iceberg/pull/3938#issuecomment-1039665773


   Thanks, @kbendick!

