Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/18 22:36:56 UTC
[GitHub] [arrow] jduo commented on pull request #9368: [WIP] [POC] Flight SQL

jduo commented on pull request #9368:
URL: https://github.com/apache/arrow/pull/9368#issuecomment-843612756


   > Thank you for the PR. Since there isn't a full implementation example yet for all the features, I think it would be easier to review if we cut the FlightSql proto down to what is implemented and focused on each use case in a separate PR. Does it make sense to handle read, write and metadata operations separately?
   > 
   > > Like other Flight APIs, flight-sql does not provide implementation details that dictate how a client and server communicate with each other; it simply provides the SQL semantics and applies them to the Flight API.
   > 
   > IMO, we should be trying to specify more about how client and server interact with each other in terms of error handling and retries. These are important aspects, and without them I think some of the utility of standardization will be diminished.
   
   
   
   > > Thinking about this a bit more I think an alternative to a lot of the metadata querying is to define a minimal sql table schema and querying capabilities for it. I'll expand more in a little bit
   > 
   > I did a rough sketch of how the metadata layer could be modeled as pseudo-tables, as comments on the proto. I think this addresses some unease I had with the existing model:
   > 
   > 1. It uses one consistent data model for tabular data (the arrow model).
   > 2. It makes query semantics clearer instead of trying to model them in an ad-hoc manner through protobufs
   > 
   > This comes at the cost of slightly more complexity for server implementations, and potential confusion about the limitations of specific tables. This might be limited to some extent if we had a different GetMetadata action, so implementations know to expect a query against a specific pseudo-table. I expect we would also provide library wrappers over JDBC to ease implementation.
   
   To clarify, are you suggesting something similar to INFORMATION_SCHEMA?
   I am very much in favour of returning the data more like a standard query result. The existing protobuf definitions don't allow vendors to supply extra metadata with their catalog results. (ODBC and JDBC let you add extra columns to getTables() calls).
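
To make the point about extra columns concrete, here is a minimal sketch in plain Python (not pyarrow, and not the Flight SQL API). The column names, the sample rows and both helper functions are hypothetical, chosen only for illustration. It shows a catalog listing returned as an ordinary tabular result: a server appends vendor columns, and a client that only knows the required columns still reads the result by name.

```python
# Hypothetical required columns; the proto in this PR does not define this layout.
REQUIRED_COLUMNS = ("catalog_name", "schema_name", "table_name", "table_type")

# Made-up catalog content, used only for this illustration.
BASE_ROWS = [
    ("main", "public", "orders", "TABLE"),
    ("main", "public", "order_totals", "VIEW"),
]


def get_tables_result(vendor_extras=None):
    """Return (columns, rows) for a catalog listing as a plain tabular result.

    vendor_extras maps an extra column name to one value per row, so a
    vendor can attach metadata without changing the required columns.
    """
    vendor_extras = vendor_extras or {}
    columns = REQUIRED_COLUMNS + tuple(vendor_extras)
    rows = [
        base + tuple(values[i] for values in vendor_extras.values())
        for i, base in enumerate(BASE_ROWS)
    ]
    return columns, rows


def read_required(columns, rows):
    """Select only the required columns by name, ignoring vendor extras."""
    indexes = [columns.index(name) for name in REQUIRED_COLUMNS]
    return [tuple(row[i] for i in indexes) for row in rows]


columns, rows = get_tables_result({"storage_engine": ["columnar", None]})
print(columns)
print(read_required(columns, rows))
```

A fixed protobuf message would need a schema change for each such extra field, whereas a tabular result only grows a column.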
   
   I am hesitant about using pseudo-tables -- backend databases may not have catalog information in a form that can be transformed by the rest of their SQL engine. It opens up complexity from an implementor's perspective and I'm not sure how beneficial it is from a user perspective.
   
   I'm also OK with making the catalog information accessible through a limited query syntax rather than pseudo-tables (e.g. SHOW TABLES WHERE ...).
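
As a rough sketch of what a limited query syntax could look like on the server side, the Python below accepts only a single hypothetical statement form, SHOW TABLES with an optional WHERE column LIKE 'pattern' clause, and filters an in-memory catalog. The grammar, the function name and the row layout are assumptions made for illustration; nothing here is part of the proposed proto.

```python
import fnmatch
import re

# Hypothetical grammar: SHOW TABLES [WHERE <column> LIKE '<pattern>']
_SHOW_TABLES = re.compile(
    r"^\s*SHOW\s+TABLES(?:\s+WHERE\s+(\w+)\s+LIKE\s+'([^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def show_tables(statement, catalog_rows):
    """Filter catalog rows (dicts) with a deliberately tiny statement syntax.

    Anything outside the single supported form is rejected, so the backend
    never has to run catalog data through a general SQL engine.
    """
    match = _SHOW_TABLES.match(statement)
    if match is None:
        raise ValueError(f"unsupported catalog statement: {statement!r}")
    column, pattern = match.groups()
    if column is None:
        return list(catalog_rows)
    # Map SQL LIKE wildcards (% and _) onto fnmatch wildcards (* and ?).
    # Simplification: a literal *, ? or [ in the pattern is not escaped.
    glob = pattern.replace("%", "*").replace("_", "?")
    selected = []
    for row in catalog_rows:
        if column not in row:
            raise ValueError(f"unknown catalog column: {column!r}")
        if fnmatch.fnmatchcase(str(row[column]), glob):
            selected.append(row)
    return selected


catalog = [
    {"table_name": "orders", "table_type": "TABLE"},
    {"table_name": "order_totals", "table_type": "VIEW"},
    {"table_name": "users", "table_type": "TABLE"},
]
print(show_tables("SHOW TABLES WHERE table_name LIKE 'order%'", catalog))
```

The trade-off is that every supported filter has to be added to the grammar explicitly, but the implementor only needs a catalog lookup and not a queryable pseudo-table.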


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org