Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/04/11 07:45:50 UTC

[GitHub] [flink] bowenli86 commented on issue #8007: [FLINK-11474][table] Add ReadableCatalog, ReadableWritableCatalog, and other …

bowenli86 commented on issue #8007: [FLINK-11474][table] Add ReadableCatalog, ReadableWritableCatalog, and other …
URL: https://github.com/apache/flink/pull/8007#issuecomment-482005765
 
 
   Hi all,
   
   I believe @sunjincheng121 and @hequn8128 brought up valuable suggestions for avoiding confusing API names. Xuefu also made very good points about API design and implementation, and that javadoc should be the source of truth for understanding the APIs.
   
   Previously I may have been overly influenced by Hive's design, given that I've been working heavily on the Flink-Hive integration. @sunjincheng121's concerns, if I understand correctly, come from the fact that these APIs will be used not only by SQL users but also by Table API users, who may not have a Hive background and are thus more easily confused. So I tried to step out of the Hive context and inspect these APIs from the perspective of their usage, referencing MySQL, Postgres, Oracle, SQL Server, and Hive. Here are my thoughts:
   
   On the reading side, a view is always treated as a logical table. In queries (SELECT in standard SQL DML), a view is a table: the 'FROM' clause is always "FROM x" rather than "FROM `TABLE/VIEW` x", and it is the planner's responsibility to process views specially. The same holds for meta commands when no extra params are given: "DESCRIBE" doesn't distinguish between them. Listing tables usually comes in two syntaxes, "SHOW TABLES" and "SELECT * FROM meta", and both return tables and views alike; listing only views takes a different command or extra params, like "SHOW VIEWS" or "SELECT * FROM meta WHERE type='view'".
   
   On the writing side, a view is treated differently from a table, given that their representations are a bit different (though they share some common fields). DDL, especially CREATE and ALTER, always requires specifying either `TABLE` or `VIEW`, as in "CREATE/ALTER `TABLE/VIEW` x". "DROP/RENAME" don't touch the fields inside a table or view, so their implementations behind the scenes are usually the same, and some databases therefore choose not to require the `TABLE/VIEW` keyword, but I think that really depends on the developers. Since our devs feel strongly that this causes confusion, we can require the keywords in our APIs and in Flink SQL.
   
   I think we should avoid a design in which a SQL statement is translated into multiple catalog API calls or requires unnecessary extra processing. With that in mind, and given the conclusions above (please correct me if anything above is wrong), I propose the following solution: `ReadableCatalog` APIs should treat views as tables by default when no extra params are specified, so `getTable()` and `listTables()` operate on both tables and views, and we will have individual APIs such as `listPhysicalTables()`, `listViews()`, and potentially `listMaterializedViews()` in the future. `ReadableWritableCatalog` APIs should treat views and tables differently, and thus have separate create/alter/drop/rename APIs for views and tables, e.g. `dropTable()` and `dropView()`, even though the two will very likely share the same code. We will also add clear javadoc and Flink documentation for all catalog APIs in a separate PR. This way, we can eliminate the confusion and still maintain a one-to-one mapping between SQL statements and catalog APIs.
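   
   To make the split concrete, here is a rough, simplified sketch of how the two interfaces could look. It is only an illustration, not the code in this PR: the `String` names, the `Entry` and `Kind` types, the exceptions thrown, and the `InMemoryCatalog` class are all assumptions made up for the example. Only the method names quoted above (`getTable()`, `listTables()`, `listPhysicalTables()`, `listViews()`, `dropTable()`, `dropView()`) come from the proposal, and alter/rename are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; signatures are simplified assumptions,
// not the actual Flink catalog interfaces.
final class CatalogSketch {

    enum Kind { TABLE, VIEW }

    /** Hypothetical stand-in for a catalog entry. */
    static final class Entry {
        final String name;
        final Kind kind;
        Entry(String name, Kind kind) { this.name = name; this.kind = kind; }
    }

    /** Read side: a view is a table unless the caller asks for one kind only. */
    interface ReadableCatalog {
        Entry getTable(String name);        // resolves both tables and views
        List<String> listTables();          // tables and views
        List<String> listPhysicalTables();  // tables only
        List<String> listViews();           // views only
    }

    /** Write side: one method per DDL statement, split by TABLE / VIEW. */
    interface ReadableWritableCatalog extends ReadableCatalog {
        void createTable(String name);
        void createView(String name);
        void dropTable(String name);
        void dropView(String name);
    }

    /** Toy implementation; the table and view variants share the same code. */
    static final class InMemoryCatalog implements ReadableWritableCatalog {
        private final Map<String, Entry> entries = new LinkedHashMap<>();

        @Override public Entry getTable(String name) {
            Entry e = entries.get(name);
            if (e == null) throw new IllegalArgumentException("Not found: " + name);
            return e;
        }
        @Override public List<String> listTables() { return list(null); }
        @Override public List<String> listPhysicalTables() { return list(Kind.TABLE); }
        @Override public List<String> listViews() { return list(Kind.VIEW); }

        @Override public void createTable(String name) { create(name, Kind.TABLE); }
        @Override public void createView(String name) { create(name, Kind.VIEW); }
        @Override public void dropTable(String name) { drop(name, Kind.TABLE); }
        @Override public void dropView(String name) { drop(name, Kind.VIEW); }

        // A null filter means "no extra params": return every entry.
        private List<String> list(Kind filter) {
            List<String> names = new ArrayList<>();
            for (Entry e : entries.values()) {
                if (filter == null || e.kind == filter) names.add(e.name);
            }
            return names;
        }
        private void create(String name, Kind kind) {
            if (entries.containsKey(name)) {
                throw new IllegalStateException("Already exists: " + name);
            }
            entries.put(name, new Entry(name, kind));
        }
        // Dropping with the wrong keyword fails instead of silently succeeding.
        private void drop(String name, Kind expected) {
            Entry e = entries.get(name);
            if (e == null || e.kind != expected) {
                throw new IllegalArgumentException("No " + expected + " named " + name);
            }
            entries.remove(name);
        }
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        catalog.createTable("orders");
        catalog.createView("recent_orders");
        System.out.println(catalog.listTables());         // [orders, recent_orders]
        System.out.println(catalog.listPhysicalTables()); // [orders]
        System.out.println(catalog.listViews());          // [recent_orders]
        catalog.dropView("recent_orders");
        System.out.println(catalog.listTables());         // [orders]
    }
}
```

   In this sketch each DDL statement maps to exactly one catalog call, and "DROP TABLE" on a view is rejected rather than accepted, which is the keyword requirement discussed above. Whether a mismatch should throw or be handled differently is left open here.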
   
   What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services