Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/09 21:32:23 UTC

[GitHub] [spark] gengliangwang commented on a change in pull request #26750: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

URL: https://github.com/apache/spark/pull/26750#discussion_r355691922
 
 

 ##########
 File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableProvider.java
 ##########
 @@ -36,26 +38,47 @@
 public interface TableProvider {
 
   /**
-   * Return a {@link Table} instance to do read/write with user-specified options.
+   * Return a {@link Table} instance with specified table properties to do read/write.
+   * Implementations should infer the table schema and partitioning.
+   *
+   * @param properties The specified table properties. It's case preserving (contains exactly what
+   *                   users specified) and implementations are free to use it case sensitively or
+   *                   insensitively. It should be able to identify a table, e.g. file path, Kafka
+   *                   topic name, etc.
+   */
+  Table getTable(Map<String, String> properties);
+
+  /**
+   * Return a {@link Table} instance with specified table schema and properties to do read/write.
+   * Implementations should infer the table partitioning.
+   *
+   * @param schema The specified schema.
+   * @param properties The specified table properties. It's case preserving (contains exactly what
+   *                   users specified) and implementations are free to use it case sensitively or
+   *                   insensitively. It should be able to identify a table, e.g. file path, Kafka
+   *                   topic name, etc.
    *
-   * @param options the user-specified options that can identify a table, e.g. file path, Kafka
-   *                topic name, etc. It's an immutable case-insensitive string-to-string map.
+   * @throws IllegalArgumentException if the specified schema does not match the actual table
+   *                                  schema.
    */
-  Table getTable(CaseInsensitiveStringMap options);
+  Table getTable(StructType schema, Map<String, String> properties);
 
   /**
-   * Return a {@link Table} instance to do read/write with user-specified schema and options.
-   * <p>
-   * By default this method throws {@link UnsupportedOperationException}, implementations should
-   * override this method to handle user-specified schema.
-   * </p>
-   * @param options the user-specified options that can identify a table, e.g. file path, Kafka
-   *                topic name, etc. It's an immutable case-insensitive string-to-string map.
-   * @param schema the user-specified schema.
-   * @throws UnsupportedOperationException
+   * Return a {@link Table} instance with specified table schema, partitioning and properties to do
+   * read/write.
+   *
+   * @param schema The specified schema.
+   * @param partitioning The specified partitioning.
+   * @param properties The specified table properties. It's case preserving (contains exactly what
+   *                   users specified) and implementations are free to use it case sensitively or
+   *                   insensitively. It should be able to identify a table, e.g. file path, Kafka
+   *                   topic name, etc.
+   *
+   * @throws IllegalArgumentException if the specified schema/partitioning does not match the actual
+   *                                  table schema/partitioning.
    */
-  default Table getTable(CaseInsensitiveStringMap options, StructType schema) {
-    throw new UnsupportedOperationException(
-      this.getClass().getSimpleName() + " source does not support user-specified schema");
-  }
+  Table getTable(
 
 Review comment:
   I am a bit curious about the parameter order in these 3 methods:
   ```
   getTable(properties)
   getTable(schema, properties)
   getTable(schema, partitioning, properties)
   ```
  Is this ordering intentional? Why not keep `properties` first in all three overloads:
   ```
   getTable(properties)
   getTable(properties, schema)
   getTable(properties, schema, partitioning)
   ```
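
To make the second ordering concrete, here is a minimal, self-contained sketch. It is not the PR's code: `SketchTable`, `PropertiesFirstProvider`, `RecordingProvider` and `OrderingDemo` are hypothetical names, and schema and partitioning are modeled as a plain `String` and `List<String>` only so the example compiles without Spark (the real partitioning parameter type is not visible in the hunk above). It just illustrates how each overload would append one parameter after `properties`.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Stand-in for Spark's Table. Schema and partitioning are plain Java types here,
// an assumption made only so the sketch compiles without Spark on the classpath.
final class SketchTable {
  private final Map<String, String> properties;
  private final String schema;             // null: left for the implementation to infer
  private final List<String> partitioning; // null: left for the implementation to infer

  SketchTable(Map<String, String> properties, String schema, List<String> partitioning) {
    this.properties = properties;
    this.schema = schema;
    this.partitioning = partitioning;
  }

  // Summarizes which pieces of metadata the caller supplied.
  String describe() {
    return "properties=" + properties
        + ", schema=" + (schema == null ? "inferred" : schema)
        + ", partitioning=" + (partitioning == null ? "inferred" : partitioning.toString());
  }
}

// Hypothetical properties-first variant: each overload appends one parameter.
interface PropertiesFirstProvider {
  SketchTable getTable(Map<String, String> properties);

  SketchTable getTable(Map<String, String> properties, String schema);

  SketchTable getTable(Map<String, String> properties, String schema, List<String> partitioning);
}

// Toy implementation that only records what it was given.
final class RecordingProvider implements PropertiesFirstProvider {
  @Override
  public SketchTable getTable(Map<String, String> properties) {
    return new SketchTable(properties, null, null);
  }

  @Override
  public SketchTable getTable(Map<String, String> properties, String schema) {
    return new SketchTable(properties, schema, null);
  }

  @Override
  public SketchTable getTable(
      Map<String, String> properties, String schema, List<String> partitioning) {
    return new SketchTable(properties, schema, partitioning);
  }
}

final class OrderingDemo {
  public static void main(String[] args) {
    PropertiesFirstProvider provider = new RecordingProvider();
    Map<String, String> props = Collections.singletonMap("path", "/tmp/t");

    // The call sites grow left to right: properties, then schema, then partitioning.
    System.out.println(provider.getTable(props).describe());
    System.out.println(provider.getTable(props, "id INT").describe());
    System.out.println(
        provider.getTable(props, "id INT", Collections.singletonList("dt")).describe());
  }
}
```

With this ordering the argument that identifies the table stays in the same position at every call site. Whether that outweighs the reasons for the order chosen in the PR is the open question.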
