Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/04 06:40:00 UTC

[GitHub] [spark] huaxingao commented on pull request #30154: [SPARK-32405][SQL] Apply table options while creating tables in JDBC Table Catalog

huaxingao commented on pull request #30154:
URL: https://github.com/apache/spark/pull/30154#issuecomment-721542287


   > when users specify some table properties, JDBC V2 should fail if the underlying database can't support the properties.
   
   I agree that JDBC V2 should fail if the underlying database can't support the properties. However, I think it is hard to compile a complete list of the supported properties for every database. For MySQL this is easy, because its CREATE TABLE syntax explicitly enumerates the table_options:
   ```
   table_option: {
       AUTO_INCREMENT [=] value
     | AVG_ROW_LENGTH [=] value
     | [DEFAULT] CHARACTER SET [=] charset_name
     | CHECKSUM [=] {0 | 1}
     | [DEFAULT] COLLATE [=] collation_name
     | COMMENT [=] 'string'
     | COMPRESSION [=] {'ZLIB' | 'LZ4' | 'NONE'}
     | CONNECTION [=] 'connect_string'
     | {DATA | INDEX} DIRECTORY [=] 'absolute path to directory'
     | DELAY_KEY_WRITE [=] {0 | 1}
     | ENCRYPTION [=] {'Y' | 'N'}
     | ENGINE [=] engine_name
     | ENGINE_ATTRIBUTE [=] 'string'
     | INSERT_METHOD [=] { NO | FIRST | LAST }
     | KEY_BLOCK_SIZE [=] value
     | MAX_ROWS [=] value
     | MIN_ROWS [=] value
     | PACK_KEYS [=] {0 | 1 | DEFAULT}
     | PASSWORD [=] 'string'
     | ROW_FORMAT [=] {DEFAULT | DYNAMIC | FIXED | COMPRESSED | REDUNDANT | COMPACT}
     | SECONDARY_ENGINE_ATTRIBUTE [=] 'string'
     | STATS_AUTO_RECALC [=] {DEFAULT | 0 | 1}
     | STATS_PERSISTENT [=] {DEFAULT | 0 | 1}
     | STATS_SAMPLE_PAGES [=] value
     | TABLESPACE tablespace_name [STORAGE {DISK | MEMORY}]
     | UNION [=] (tbl_name[,tbl_name]...)
   }
   ```
   but other databases don't define table_options explicitly, so I am not sure which properties should count as valid table properties. For example, for PostgreSQL (https://www.postgresql.org/docs/9.1/sql-createtable.html), should we treat `TABLESPACE tablespace` or `CONSTRAINT constraint_name` as valid table properties? I think it would be better to send the properties to the underlying database and let it decide whether to fail the CREATE TABLE, rather than trying to maintain a complete list of the supported properties on the Spark side.
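
To illustrate the pass-through idea, here is a minimal sketch in Python. It is not the Spark implementation: `build_create_table` and `try_create` are hypothetical helpers, and an in-memory SQLite database stands in for the JDBC target. The option string is appended verbatim with no allow-list, and the database itself accepts or rejects the statement.

```python
import sqlite3


def build_create_table(table, columns, table_options=""):
    """Build a CREATE TABLE statement, appending table_options verbatim.

    Hypothetical helper: no allow-list of supported options is consulted.
    """
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    statement = f"CREATE TABLE {table} ({column_sql})"
    if table_options:
        statement += f" {table_options}"
    return statement


def try_create(conn, table, columns, table_options=""):
    """Run the statement and report whether the database accepted it."""
    statement = build_create_table(table, columns, table_options)
    try:
        conn.execute(statement)
        return True, None
    except sqlite3.Error as exc:
        # The database, not the caller, decided the option was invalid.
        return False, str(exc)


# SQLite is only a stand-in here for "the underlying database".
conn = sqlite3.connect(":memory:")
columns = [("id", "INTEGER PRIMARY KEY"), ("name", "TEXT")]

# WITHOUT ROWID is a table option that SQLite understands.
ok_supported, _ = try_create(conn, "t1", columns, "WITHOUT ROWID")

# ENGINE=InnoDB is a MySQL table option; SQLite rejects it.
ok_unsupported, error = try_create(conn, "t2", columns, "ENGINE=InnoDB")

print(ok_supported, ok_unsupported, error)
```

With this shape, the caller needs no per-database list of properties; an unsupported option surfaces as the database's own error, which can then be reported back to the user.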
   




