You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/01 15:39:37 UTC

[GitHub] [iceberg] nastra opened a new pull request, #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

nastra opened a new pull request, #6338:
URL: https://github.com/apache/iceberg/pull/6338

   Users are running into issues when hooking up the `JdbcCatalog` with `MySql` or other Databases, which actually impose lower limits than [sqlite](https://www.sqlite.org/limits.html) (which we use for testing the `JdbcCatalog`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] danielcweeks merged pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
danielcweeks merged PR #6338:
URL: https://github.com/apache/iceberg/pull/6338


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6338:
URL: https://github.com/apache/iceberg/pull/6338#discussion_r1037276526


##########
core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java:
##########
@@ -72,9 +72,9 @@ final class JdbcUtil {
           + TABLE_NAME
           + " VARCHAR(255) NOT NULL,"
           + METADATA_LOCATION
-          + " VARCHAR(5500),"
+          + " VARCHAR(1000),"

Review Comment:
   arbitrarily chosen to be lower than 3072 bytes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6338:
URL: https://github.com/apache/iceberg/pull/6338#discussion_r1037273853


##########
core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java:
##########
@@ -185,9 +185,9 @@ final class JdbcUtil {
           + NAMESPACE_NAME
           + " VARCHAR(255) NOT NULL,"
           + NAMESPACE_PROPERTY_KEY
-          + " VARCHAR(5500),"
+          + " VARCHAR(255),"

Review Comment:
   CATALOG_NAME + NAMESPACE_NAME + NAMESPACE_PROPERTY_KEY make up the primary key, and the primary key size is limited by 767 ( 255 * 3 = 765 < 767)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
nastra commented on PR #6338:
URL: https://github.com/apache/iceberg/pull/6338#issuecomment-1333965778

   @openinx given that you reviewed https://github.com/apache/iceberg/pull/2778 back then, could you review this one as well please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6338:
URL: https://github.com/apache/iceberg/pull/6338#discussion_r1037388592


##########
core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java:
##########
@@ -72,9 +72,9 @@ final class JdbcUtil {
           + TABLE_NAME
           + " VARCHAR(255) NOT NULL,"
           + METADATA_LOCATION
-          + " VARCHAR(5500),"
+          + " VARCHAR(1000),"

Review Comment:
   Why not make this as long as we can to avoid truncation? What happens if a value is truncated?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #6338: Core: Use lower lengths for iceberg_namespace_properties / iceberg_tables in JdbcCatalog

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #6338:
URL: https://github.com/apache/iceberg/pull/6338#discussion_r1037875766


##########
core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java:
##########
@@ -72,9 +72,9 @@ final class JdbcUtil {
           + TABLE_NAME
           + " VARCHAR(255) NOT NULL,"
           + METADATA_LOCATION
-          + " VARCHAR(5500),"
+          + " VARCHAR(1000),"

Review Comment:
   I've rerun some testing today to verify my findings from yesterday when making this PR. It turns out adjusting `NAMESPACE_PROPERTY_KEY` from 5500 to 255 was the only thing that was necessary. I was mislead by the stack trace
   ```
   at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1903)
   at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1242)
   at org.apache.iceberg.jdbc.JdbcCatalog.lambda$initializeCatalogTables$1(JdbcCatalog.java:152)
   at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
   at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
   at org.apache.iceberg.jdbc.JdbcCatalog.initializeCatalogTables(JdbcCatalog.java:135)
   at org.apache.iceberg.jdbc.JdbcCatalog.initialize(JdbcCatalog.java:103)
   ... 87 more
   org.apache.iceberg.jdbc.UncheckedSQLException: Cannot initialize JDBC catalog
   at org.apache.iceberg.jdbc.JdbcCatalog.initialize(JdbcCatalog.java:109)
   at org.apache.iceberg.CatalogUtil.loadCatalog(CatalogUtil.java:212)
   ```
   thinking that `JdbcCatalog.initializeCatalogTables(JdbcCatalog.java:135)`  is affecting the [CREATE_TABLE_STATEMENT](https://github.com/apache/iceberg/blob/383d9caec804293e27b23b50c2cd89168346a848/core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java#L135) while the issue was in all 3 issues the `CREATE_NAMESPACE_PROPERTIES_TABLE` stmt. 
   
   So should we restore those varchars from 1000 back to 5500? 
   
   `Although InnoDB supports row sizes larger than 65,535 bytes internally, MySQL itself imposes a row-size limit of 65,535 for the combined size of all columns` <-- we could also try and set them to a higher value by calculating (65535 - (255 * 3)) / 2 (metadata_location + previous_metadata_location). But if we ever add another column, we'd have to re-calculate that.
   
   To your second question around truncation. If a value is truncated, then there will be an exception: `Data truncation: Data too long for column 'metadata_location' at row 1`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org