You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@arrow.apache.org by Gavin Ray <ra...@gmail.com> on 2022/12/01 00:17:03 UTC

Re: DISCUSS: [FlightSQL] Catalog support

Just to chime in on this, one thing I'm curious about is whether there
will be support for user-defined catalog/schema hierarchy depth?

This comment that James made does seem reasonable to me
> scheme://<host>:<port>/path-1/path-2/.../path-n

Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)

At Hasura, what we do is have an alias "FullyQualifiedName" which is
just "Array<String>"
and the identifier to some element in a data source is always fully-qualified:

https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema

["postgres_1", "db1", "schema2", "my_table", "col_a"]
["mongo", "db1",  "collection_a", "field_a"]
["csv_adapter", "myfile.csv", "col_x"]

On Wed, Nov 30, 2022 at 6:31 PM James Duong
<ja...@bitquilltech.com.invalid> wrote:
>
> Our current convention of sending connection properties as headers with
> every request has the benefit of making statefulness optional, but has the
> drawback of sending redundant, unused properties on requests after the
> first, which increases the payload size unnecessarily.
>
> I'd suggest we define session management features explicitly in Flight
> (while being optional). The suggestion is to make this part of Flight as an
> optional feature, rather than Flight SQL due to its applicability outside
> of just database access.
>
> Creating a session:
> - The Flight client supplies a New-Session header which has key-value pairs
> for initial session options. This header can be applied to any RPC call,
> but logically should be the first one the client makes.
> - The server should send a Set-Cookie header back containing some
> server-side representation of the session that the client can use in
> subsequent requests.
> - The path specified in the URI is sent as a "Catalog" session option.
>
> Modifying session options:
> - A separate RPC call that takes in a Stream<string, string> representing
> each session option that is being modified and returns a stream of statuses
> to indicate if the setting change was accepted.
> - This RPC call is only valid when the Cookie header is used.
> - It is up to the server to define if a failed session property change is
> fatal or if other properties can continue to be set.
>
> Closing a session:
> - A separate RPC call that tells the server to drop the session specified
> by the Cookie header.
>
> Notes:
> A Flight SQL client would check if session management RPCs are supported
> through a new GetSqlInfo property. A Flight client doesn't have a way to do
> this generically, but there could be an application-specific RPC or header
> that reports this metadata.
>
> The O/JDBC and ADBC drivers would need to be updated to programmatically
> check for session management RPCs. If unsupported, then use the old
> behavior of sending all properties as headers with each request. If
> supported, make use of the New-Session header and drop the session when
> closing the client-side connection.
>
> It's a bit asymmetric that creating a new session is done by applying a
> header, but closing a session is an RPC call. This was so that session
> creation doesn't introduce another round trip before the first real data
> request. If there's a way to batch RPC calls it might be better to make
> session creation an RPC call.
>
> On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>
> > It sounds reasonable - then there are three points:
> >
> > - A standard URI scheme for Flight SQL that can be used by multiple client
> > APIs (JDBC, ADBC, etc.)
> > - A standard scheme for session data (likely header/cookie-based)
> > - A mapping from URI parameters and fields to session data
> >
> >
> >
> > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
> > > Just following up on this and if there are any thoughts.
> > >
> > > The purpose would be to standardize how we specify access to some named
> > > logical grouping of data. This would make it easy to model catalog/schema
> > > semantics in Flight SQL.
> > >
> > > Having this be part of the connection URI makes it similar to specifying
> > a
> > > resource in an HTTP URL (ie an endpoint) which should make it easy for
> > end
> > > users to work with and modify.
> > >
> > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <ja...@bitquilltech.com>
> > wrote:
> > >
> > >> As for surfacing catalogs itself, perhaps we allow the URI take in a
> > path
> > >> and treat that as a way of specifying a multi-level resource that which
> > the
> > >> FlightClient is connecting to:
> > >>
> > >> eg a connection URI of the form:
> > >> scheme://<host>:<port>/path-1/path-2/.../path-n
> > >>
> > >> The FlightClient could send this path as either a header or a session
> > >> property (with a neutral name like 'resource-path'). Flight SQL
> > Producers
> > >> could interpret this as a catalog or schema.
> > >> eg
> > >> grpc://<host>:<port>/catalog/schema
> > >>
> > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro> wrote:
> > >>
> > >>> Sounds good to me.
> > >>>
> > >>> > Are you interested in writing up a (sketch of a) proposal?
> > >>>
> > >>> Yep, can do - I'm OoO over the next couple of weeks so might be a bit
> > >>> intermittent.
> > >>>
> > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org> wrote:
> > >>>
> > >>> > Hey James H.,
> > >>> >
> > >>> > That would make sense to me. So it sounds like we'd want
> > >>> >
> > >>> > - Formal specification of using cookies/headers to mark a 'session'
> > (I
> > >>> > guess this will be a little inconsistent with transactions, though)
> > >>> > - Adding RPCs to query session values
> > >>> > - Adding RPCs to set session values
> > >>> > - Listing standard values and types
> > >>> >
> > >>> > Some things may require more consideration, e.g. transaction
> > isolation
> > >>> > might be better off as part of the transaction RPCs than an ambient
> > >>> > property. Are you interested in writing up a (sketch of a) proposal?
> > >>> >
> > >>> > -David
> > >>> >
> > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
> > >>> > > Similarly, we're also currently considering how best to implement
> > >>> some of
> > >>> > > the SQL standard session variables in our Flight SQL server -
> > things
> > >>> like
> > >>> > > current transaction isolation level, access mode, time zone etc,
> > which
> > >>> > seem
> > >>> > > to have similar properties to the (traditional) connection's
> > current
> > >>> > > catalog. We'd (perhaps naively) looked at solutions involving the
> > >>> Flight
> > >>> > > client's `ClientCookieMiddleware`, but might these have
> > standardised
> > >>> > > support within Flight SQL itself eventually?
> > >>> > >
> > >>> > > Cheers,
> > >>> > >
> > >>> > > James
> > >>> > >
> > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org> wrote:
> > >>> > >
> > >>> > >> I think having better support for this makes sense, but perhaps we
> > >>> can
> > >>> > >> find a way to make it not tied to the connection itself? For
> > >>> instance,
> > >>> > in
> > >>> > >> the same way transactions were implemented (as a handle). Or
> > rather,
> > >>> > >> instead of adding connection statefulness to Flight RPC, I'd
> > rather
> > >>> try
> > >>> > to
> > >>> > >> work within the gRPC/RPC paradigm.
> > >>> > >>
> > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
> > >>> > >> > Databases such as SQL Server and PostgreSQL have the concept of
> > >>> > catalogs
> > >>> > >> as
> > >>> > >> > containers of database schemas. Users can usually specify an
> > >>> initial
> > >>> > >> > catalog during the connection process, list catalogs, and
> > sometimes
> > >>> > >> change
> > >>> > >> > catalogs during the session.
> > >>> > >> >
> > >>> > >> > Currently catalog support in Flight SQL is somewhat limited. The
> > >>> > protocol
> > >>> > >> > provides a way to list catalogs as well as metadata in
> > SqlTypeInfo
> > >>> for
> > >>> > >> > reporting how catalogs are supported from a syntax perspective.
> > >>> > >> >
> > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
> > >>> > Additionally,
> > >>> > >> > Flight SQL doesn't really provide the concept of "initial"
> > >>> connection
> > >>> > >> > properties (such as a starting catalog) since Flight itself is
> > >>> > stateless
> > >>> > >> > from a connection perspective.
> > >>> > >> >
> > >>> > >> > To support catalogs properly, I'd imagine we need to make some
> > >>> > changes to
> > >>> > >> > the Flight SQL protocol:
> > >>> > >> > - Introduce the concept of connection-time properties (perhaps
> > an
> > >>> > >> optional
> > >>> > >> > RPC for Flight SQL applications that need this)
> > >>> > >> > - Related to the above, expand the connection URL and Java
> > builder
> > >>> to
> > >>> > >> allow
> > >>> > >> > arbitrary application-specific properties.
> > >>> > >> > - Add optional RPCs for changing the catalog and relevant error
> > >>> codes
> > >>> > if
> > >>> > >> > this is not permitted.
> > >>> > >> >
> > >>> > >> >
> > >>> > >> > --
> > >>> > >> >
> > >>> > >> > *James Duong*
> > >>> > >> > Lead Software Developer
> > >>> > >> > Bit Quill Technologies Inc.
> > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > >>> > >> > https://www.bitquilltech.com
> > >>> > >> >
> > >>> > >> > This email message is for the sole use of the intended
> > recipient(s)
> > >>> > and
> > >>> > >> may
> > >>> > >> > contain confidential and privileged information.  Any
> > unauthorized
> > >>> > >> review,
> > >>> > >> > use, disclosure, or distribution is prohibited.  If you are not
> > the
> > >>> > >> > intended recipient, please contact the sender by reply email and
> > >>> > destroy
> > >>> > >> > all copies of the original message.  Thank you.
> > >>> > >>
> > >>> > >
> > >>> > >
> > >>> > > --
> > >>> > > *James Henderson*
> > >>> > > XTDB Developer at *JUXT*
> > >>> > > Email jms@juxt.pro
> > >>> > > Website https://juxt.pro
> > >>> > >
> > >>> > > [image: photo]
> > >>> >
> > >>>
> > >>>
> > >>> --
> > >>> *James Henderson*
> > >>> XTDB Developer at *JUXT*
> > >>> Email jms@juxt.pro
> > >>> Website https://juxt.pro
> > >>>
> > >>> [image: photo]
> > >>>
> > >>
> > >>
> > >> --
> > >>
> > >> *James Duong*
> > >> Lead Software Developer
> > >> Bit Quill Technologies Inc.
> > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > >> https://www.bitquilltech.com
> > >>
> > >> This email message is for the sole use of the intended recipient(s) and
> > >> may contain confidential and privileged information.  Any unauthorized
> > >> review, use, disclosure, or distribution is prohibited.  If you are not
> > the
> > >> intended recipient, please contact the sender by reply email and destroy
> > >> all copies of the original message.  Thank you.
> > >>
> > >
> > >
> > > --
> > >
> > > *James Duong*
> > > Lead Software Developer
> > > Bit Quill Technologies Inc.
> > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > > https://www.bitquilltech.com
> > >
> > > This email message is for the sole use of the intended recipient(s) and
> > may
> > > contain confidential and privileged information.  Any unauthorized
> > review,
> > > use, disclosure, or distribution is prohibited.  If you are not the
> > > intended recipient, please contact the sender by reply email and destroy
> > > all copies of the original message.  Thank you.
> >
>
>
> --
>
> *James Duong*
> Lead Software Developer
> Bit Quill Technologies Inc.
> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> https://www.bitquilltech.com
>
> This email message is for the sole use of the intended recipient(s) and may
> contain confidential and privileged information.  Any unauthorized review,
> use, disclosure, or distribution is prohibited.  If you are not the
> intended recipient, please contact the sender by reply email and destroy
> all copies of the original message.  Thank you.

Re: DISCUSS: [FlightSQL] Catalog support

Posted by David Li <li...@apache.org>.
The proposal for session support/explicit catalogs is ready for review [1]. Absent any objections I will start the vote this week, but comments would be appreciate as I'd like to avoid lots of revisions during the vote itself.

[1]: https://github.com/apache/arrow/pull/34817

On Wed, Nov 15, 2023, at 15:31, David Li wrote:
> Hi all,
>
> Paul and company have been working on this feature for Flight SQL via 
> support for explicit sessions in Flight SQL. Feedback would be much 
> appreciated on the PR, especially before tackling the second 
> implementation and eventual vote [1]. The current PR implements 
> client/server support for C++.
>
> [1]: https://github.com/apache/arrow/pull/34817
>
> On Wed, Feb 15, 2023, at 01:10, Sutou Kouhei wrote:
>> Hi James,
>>
>> Thanks for sharing your plan! I will wait for an update.
>>
>> -- 
>> kou
>>
>> In 
>>  
>> <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
>>   "Re: DISCUSS: [FlightSQL] Catalog support" on Tue, 14 Feb 2023 
>> 04:53:07 +0000,
>>   James Duong <Ja...@improving.com.INVALID> wrote:
>>
>>> Hi Sutou,
>>> 
>>> I saw your PostgreSQL project and thought it was quite interesting, especially given the number of PostgreSQL-compatible databases.
>>> 
>>> Paul Nienaber will be picking up implementation of the catalog feature going forward and can provide an update
>>> 
>>> Get Outlook for Android<https://aka.ms/AAb9ysg>
>>> 
>>> ________________________________
>>> From: Sutou Kouhei <ko...@clear-code.com>
>>> Sent: Thursday, February 9, 2023, 22:25
>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>>> 
>>> Hi James
>>> 
>>> Is there any progress of this?
>>> 
>>> I'm developing a Flight SQL adapter for PostgreSQL:
>>> https://github.com/apache/arrow-flight-sql-postgresql
>>> 
>>> I want to implement session feature for it because opening
>>> a session in PostgreSQL is expensive. PostgreSQL uses one
>>> process per session. If we open and close a session for
>>> each Flight SQL call, we need to start one process for each
>>> Flight SQL call.
>>> 
>>> I noticed that the current Flight SQL specification doesn't
>>> provide the standard session support. So I'm interesting in
>>> this discussion.
>>> 
>>> Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13
>>> 
>>> 
>>> Thanks,
>>> --
>>> kou
>>> 
>>> In
>>>  <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
>>>   "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 +0000,
>>>   James Duong <Ja...@improving.com.INVALID> wrote:
>>> 
>>>> Hi David,
>>>>
>>>> I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.
>>>>
>>>> I (or another developer) will send an update once those features are ready for demo.
>>>> ________________________________
>>>> From: David Li <li...@apache.org>
>>>> Sent: December 12, 2022 10:07 AM
>>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>>>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>>>>
>>>> Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?
>>>>
>>>> The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.
>>>>
>>>> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>>>>>> Andrew, do we need to look into adding more metadata to indicate
>>>>> different query languages? (It's quite a shame that we named this Flight
>>>>> SQL at this point...)
>>>>>
>>>>> TDLR is I don't think trying to explicitly support languages other than SQL
>>>>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
>>>>> which mostly assume SQL, are one of the key features of FlightSQL, and they
>>>>> are likely not as useful for non SQL. I can see the argument to support for
>>>>> substrait plans, and it will be interesting to see what use cases benefit
>>>>> from that support.
>>>>>
>>>>> What would make our life easier would be some standard way to pass
>>>>> application specific key/value pairs from the JDBC driver to a flight SQL
>>>>> backend with each request (perhaps via gRPC headers). This would allow
>>>>> passing configuration parameters that were not envisioned in the spec, from
>>>>> end user (of the JDBC driver) all the way to our backend.
>>>>>
>>>>> Thanks again for driving this forward,
>>>>> Andrew
>>>>>
>>>>> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>>>>>
>>>>>> Hey James, thanks for putting this up.
>>>>>>
>>>>>> Inline:
>>>>>>
>>>>>> > The suggestion is to make this part of Flight as an
>>>>>> > optional feature, rather than Flight SQL due to its applicability outside
>>>>>> > of just database access.
>>>>>>
>>>>>> Which uses do you see? I see statefulness as a general antipattern here,
>>>>>> so I'm wary of introducing it beyond where we need it.
>>>>>>
>>>>>> > - The Flight client supplies a New-Session header which has key-value
>>>>>> pairs
>>>>>> > for initial session options. This header can be applied to any RPC call,
>>>>>> > but logically should be the first one the client makes.
>>>>>>
>>>>>> Handshake already effectively serves as this RPC - maybe we could extend
>>>>>> it? (I also see Handshake as an antipattern because it's a stateful auth
>>>>>> mechanism.)
>>>>>>
>>>>>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>>>>>> server a way to track the persistence of a particular client connection.)
>>>>>>
>>>>>> > It's a bit asymmetric that creating a new session is done by applying a
>>>>>> > header, but closing a session is an RPC call. This was so that session
>>>>>> > creation doesn't introduce another round trip before the first real data
>>>>>> > request. If there's a way to batch RPC calls it might be better to make
>>>>>> > session creation an RPC call.
>>>>>>
>>>>>> Is this a worrisome amount of overhead?
>>>>>>
>>>>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>>>>>> client generally share the same TCP connection (modulo load balancing
>>>>>> behavior, but presumably that is not enabled if you want persistent
>>>>>> sessions).
>>>>>>
>>>>>> On the implementation side, I'd like to avoid baking this in too deeply if
>>>>>> at all possible. Ideally it'd be implemented entirely as middleware,
>>>>>> possibly making use of an interface so applications can override the
>>>>>> session storage (hashtable, Redis, etcd, etc.)
>>>>>>
>>>>>> > Just to chime in on this, one thing I'm curious about is whether there
>>>>>> > will be support for user-defined catalog/schema hierarchy depth?
>>>>>>
>>>>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>>>>>> handle this case - maybe we can handle this by adding a property for the
>>>>>> delimiter to SqlInfo?
>>>>>>
>>>>>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>>>>>
>>>>>> Andrew, do we need to look into adding more metadata to indicate different
>>>>>> query languages? (It's quite a shame that we named this Flight SQL at this
>>>>>> point...)
>>>>>>
>>>>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>>>>>> > Sorry for the late reply -- thank you James and David for this
>>>>>> discussion.
>>>>>> >
>>>>>> > I agree that adding Catalog support would be a valuable addition to
>>>>>> Flight
>>>>>> > SQL, and it recently came up as we begin to implement Flight SQL in
>>>>>> > InfluxDB IOx [1].
>>>>>> >
>>>>>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>>>>>> > client APIs (JDBC, ADBC, etc.)
>>>>>> >
>>>>>> > I agree this would be very valuable, along with a standard way (ideally
>>>>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>>>>>> > requests.
>>>>>> >
>>>>>> >> I'd suggest we define session management features explicitly in Flight
>>>>>> > (while being optional).
>>>>>> >
>>>>>> > I agree it is critical that server-side state is not required to
>>>>>> implement
>>>>>> > FlightSQL. Stateful connections would likely complicate deploying
>>>>>> FlightSQL
>>>>>> > in distributed systems. I suggest it should be possible to implement any
>>>>>> > session management features by sending the entire session state with the
>>>>>> > request, if desired.
>>>>>> >
>>>>>> > I don't have a strong opinion about the merits of including explicit
>>>>>> > session management features in FlightSQL. It seems to me that keeping the
>>>>>> > API surface of FlightSQL minimal and implementation flexibility maximal
>>>>>> > should be the default. However, if JDBC/ODBC driver compatibility would
>>>>>> be
>>>>>> > improved with explicit state management APIs, then adding them to
>>>>>> FlightSQL
>>>>>> > seems like a good idea to me.
>>>>>> >
>>>>>> > Thanks again -- it is amazing to hit some issue in design and then find
>>>>>> out
>>>>>> > the Arrow community is already hard at work on a solution.
>>>>>> >
>>>>>> > Andrew
>>>>>> >
>>>>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>>>>>> >
>>>>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>>>>>> >
>>>>>> >> Just to chime in on this, one thing I'm curious about is whether there
>>>>>> >> will be support for user-defined catalog/schema hierarchy depth?
>>>>>> >>
>>>>>> >> This comment that James made does seem reasonable to me
>>>>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>>>>> >>
>>>>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>>>>> >>
>>>>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>>>>>> >> just "Array<String>"
>>>>>> >> and the identifier to some element in a data source is always
>>>>>> >> fully-qualified:
>>>>>> >>
>>>>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>>>>> >>
>>>>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>>>>>> >> ["mongo", "db1",  "collection_a", "field_a"]
>>>>>> >> ["csv_adapter", "myfile.csv", "col_x"]
>>>>>> >>
>>>>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>>>>>> >> <ja...@bitquilltech.com.invalid> wrote:
>>>>>> >> >
>>>>>> >> > Our current convention of sending connection properties as headers
>>>>>> with
>>>>>> >> > every request has the benefit of making statefulness optional, but has
>>>>>> >> the
>>>>>> >> > drawback of sending redundant, unused properties on requests after the
>>>>>> >> > first, which increases the payload size unnecessarily.
>>>>>> >> >
>>>>>> >> > I'd suggest we define session management features explicitly in Flight
>>>>>> >> > (while being optional). The suggestion is to make this part of Flight
>>>>>> as
>>>>>> >> an
>>>>>> >> > optional feature, rather than Flight SQL due to its applicability
>>>>>> outside
>>>>>> >> > of just database access.
>>>>>> >> >
>>>>>> >> > Creating a session:
>>>>>> >> > - The Flight client supplies a New-Session header which has key-value
>>>>>> >> pairs
>>>>>> >> > for initial session options. This header can be applied to any RPC
>>>>>> call,
>>>>>> >> > but logically should be the first one the client makes.
>>>>>> >> > - The server should send a Set-Cookie header back containing some
>>>>>> >> > server-side representation of the session that the client can use in
>>>>>> >> > subsequent requests.
>>>>>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>>>>>> >> >
>>>>>> >> > Modifying session options:
>>>>>> >> > - A separate RPC call that takes in a Stream<string, string>
>>>>>> representing
>>>>>> >> > each session option that is being modified and returns a stream of
>>>>>> >> statuses
>>>>>> >> > to indicate if the setting change was accepted.
>>>>>> >> > - This RPC call is only valid when the Cookie header is used.
>>>>>> >> > - It is up to the server to define if a failed session property
>>>>>> change is
>>>>>> >> > fatal or if other properties can continue to be set.
>>>>>> >> >
>>>>>> >> > Closing a session:
>>>>>> >> > - A separate RPC call that tells the server to drop the session
>>>>>> specified
>>>>>> >> > by the Cookie header.
>>>>>> >> >
>>>>>> >> > Notes:
>>>>>> >> > A Flight SQL client would check if session management RPCs are
>>>>>> supported
>>>>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>>>>>> to
>>>>>> >> do
>>>>>> >> > this generically, but there could be an application-specific RPC or
>>>>>> >> header
>>>>>> >> > that reports this metadata.
>>>>>> >> >
>>>>>> >> > The O/JDBC and ADBC drivers would need to be updated to
>>>>>> programmatically
>>>>>> >> > check for session management RPCs. If unsupported, then use the old
>>>>>> >> > behavior of sending all properties as headers with each request. If
>>>>>> >> > supported, make use of the New-Session header and drop the session
>>>>>> when
>>>>>> >> > closing the client-side connection.
>>>>>> >> >
>>>>>> >> > It's a bit asymmetric that creating a new session is done by applying
>>>>>> a
>>>>>> >> > header, but closing a session is an RPC call. This was so that session
>>>>>> >> > creation doesn't introduce another round trip before the first real
>>>>>> data
>>>>>> >> > request. If there's a way to batch RPC calls it might be better to
>>>>>> make
>>>>>> >> > session creation an RPC call.
>>>>>> >> >
>>>>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>>>>>> >> >
>>>>>> >> > > It sounds reasonable - then there are three points:
>>>>>> >> > >
>>>>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>>>>>> >> client
>>>>>> >> > > APIs (JDBC, ADBC, etc.)
>>>>>> >> > > - A standard scheme for session data (likely header/cookie-based)
>>>>>> >> > > - A mapping from URI parameters and fields to session data
>>>>>> >> > >
>>>>>> >> > >
>>>>>> >> > >
>>>>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>>>>>> >> > > > Just following up on this and if there are any thoughts.
>>>>>> >> > > >
>>>>>> >> > > > The purpose would be to standardize how we specify access to some
>>>>>> >> named
>>>>>> >> > > > logical grouping of data. This would make it easy to model
>>>>>> >> catalog/schema
>>>>>> >> > > > semantics in Flight SQL.
>>>>>> >> > > >
>>>>>> >> > > > Having this be part of the connection URI makes it similar to
>>>>>> >> specifying
>>>>>> >> > > a
>>>>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>>>>>> >> for
>>>>>> >> > > end
>>>>>> >> > > > users to work with and modify.
>>>>>> >> > > >
>>>>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>>>>>> jamesd@bitquilltech.com
>>>>>> >> >
>>>>>> >> > > wrote:
>>>>>> >> > > >
>>>>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>>>>>> in a
>>>>>> >> > > path
>>>>>> >> > > >> and treat that as a way of specifying a multi-level resource that
>>>>>> >> which
>>>>>> >> > > the
>>>>>> >> > > >> FlightClient is connecting to:
>>>>>> >> > > >>
>>>>>> >> > > >> eg a connection URI of the form:
>>>>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>>>>>> >> > > >>
>>>>>> >> > > >> The FlightClient could send this path as either a header or a
>>>>>> >> session
>>>>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>>>>>> >> > > Producers
>>>>>> >> > > >> could interpret this as a catalog or schema.
>>>>>> >> > > >> eg
>>>>>> >> > > >> grpc://<host>:<port>/catalog/schema
>>>>>> >> > > >>
>>>>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>>>>>> >> wrote:
>>>>>> >> > > >>
>>>>>> >> > > >>> Sounds good to me.
>>>>>> >> > > >>>
>>>>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>>>>>> >> > > >>>
>>>>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>>>>>> a
>>>>>> >> bit
>>>>>> >> > > >>> intermittent.
>>>>>> >> > > >>>
>>>>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>>>>>> >> wrote:
>>>>>> >> > > >>>
>>>>>> >> > > >>> > Hey James H.,
>>>>>> >> > > >>> >
>>>>>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>>>>>> >> > > >>> >
>>>>>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>>>>>> >> 'session'
>>>>>> >> > > (I
>>>>>> >> > > >>> > guess this will be a little inconsistent with transactions,
>>>>>> >> though)
>>>>>> >> > > >>> > - Adding RPCs to query session values
>>>>>> >> > > >>> > - Adding RPCs to set session values
>>>>>> >> > > >>> > - Listing standard values and types
>>>>>> >> > > >>> >
>>>>>> >> > > >>> > Some things may require more consideration, e.g. transaction
>>>>>> >> > > isolation
>>>>>> >> > > >>> > might be better off as part of the transaction RPCs than an
>>>>>> >> ambient
>>>>>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>>>>>> >> proposal?
>>>>>> >> > > >>> >
>>>>>> >> > > >>> > -David
>>>>>> >> > > >>> >
>>>>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>>>>>> >> > > >>> > > Similarly, we're also currently considering how best to
>>>>>> >> implement
>>>>>> >> > > >>> some of
>>>>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>>>>>> -
>>>>>> >> > > things
>>>>>> >> > > >>> like
>>>>>> >> > > >>> > > current transaction isolation level, access mode, time zone
>>>>>> >> etc,
>>>>>> >> > > which
>>>>>> >> > > >>> > seem
>>>>>> >> > > >>> > > to have similar properties to the (traditional) connection's
>>>>>> >> > > current
>>>>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>>>>>> involving
>>>>>> >> the
>>>>>> >> > > >>> Flight
>>>>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>>>>>> >> > > standardised
>>>>>> >> > > >>> > > support within Flight SQL itself eventually?
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > > Cheers,
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > > James
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>>>>>> >> wrote:
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > >> I think having better support for this makes sense, but
>>>>>> >> perhaps we
>>>>>> >> > > >>> can
>>>>>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>>>>>> For
>>>>>> >> > > >>> instance,
>>>>>> >> > > >>> > in
>>>>>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>>>>>> Or
>>>>>> >> > > rather,
>>>>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>>>>>> I'd
>>>>>> >> > > rather
>>>>>> >> > > >>> try
>>>>>> >> > > >>> > to
>>>>>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>>>>>> >> > > >>> > >>
>>>>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>>>>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>>>>>> >> concept of
>>>>>> >> > > >>> > catalogs
>>>>>> >> > > >>> > >> as
>>>>>> >> > > >>> > >> > containers of database schemas. Users can usually
>>>>>> specify an
>>>>>> >> > > >>> initial
>>>>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>>>>>> >> > > sometimes
>>>>>> >> > > >>> > >> change
>>>>>> >> > > >>> > >> > catalogs during the session.
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>>>>>> >> limited. The
>>>>>> >> > > >>> > protocol
>>>>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>>>>>> >> > > SqlTypeInfo
>>>>>> >> > > >>> for
>>>>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>>>>>> >> perspective.
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>>>>>> >> > > >>> > Additionally,
>>>>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>>>>>> "initial"
>>>>>> >> > > >>> connection
>>>>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>>>>>> itself
>>>>>> >> is
>>>>>> >> > > >>> > stateless
>>>>>> >> > > >>> > >> > from a connection perspective.
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>>>>>> >> some
>>>>>> >> > > >>> > changes to
>>>>>> >> > > >>> > >> > the Flight SQL protocol:
>>>>>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>>>>>> >> (perhaps
>>>>>> >> > > an
>>>>>> >> > > >>> > >> optional
>>>>>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>>>>>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>>>>>> Java
>>>>>> >> > > builder
>>>>>> >> > > >>> to
>>>>>> >> > > >>> > >> allow
>>>>>> >> > > >>> > >> > arbitrary application-specific properties.
>>>>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>>>>>> >> error
>>>>>> >> > > >>> codes
>>>>>> >> > > >>> > if
>>>>>> >> > > >>> > >> > this is not permitted.
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > --
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > *James Duong*
>>>>>> >> > > >>> > >> > Lead Software Developer
>>>>>> >> > > >>> > >> > Bit Quill Technologies Inc.
>>>>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>>> >> > > >>> > >> > https://www.bitquilltech.com
>>>>>> >> > > >>> > >> >
>>>>>> >> > > >>> > >> > This email message is for the sole use of the intended
>>>>>> >> > > recipient(s)
>>>>>> >> > > >>> > and
>>>>>> >> > > >>> > >> may
>>>>>> >> > > >>> > >> > contain confidential and privileged information.  Any
>>>>>> >> > > unauthorized
>>>>>> >> > > >>> > >> review,
>>>>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>>>>>> are
>>>>>> >> not
>>>>>> >> > > the
>>>>>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>>>>>> >> email and
>>>>>> >> > > >>> > destroy
>>>>>> >> > > >>> > >> > all copies of the original message.  Thank you.
>>>>>> >> > > >>> > >>
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > > --
>>>>>> >> > > >>> > > *James Henderson*
>>>>>> >> > > >>> > > XTDB Developer at *JUXT*
>>>>>> >> > > >>> > > Email jms@juxt.pro
>>>>>> >> > > >>> > > Website https://juxt.pro
>>>>>> >> > > >>> > >
>>>>>> >> > > >>> > > [image: photo]
>>>>>> >> > > >>> >
>>>>>> >> > > >>>
>>>>>> >> > > >>>
>>>>>> >> > > >>> --
>>>>>> >> > > >>> *James Henderson*
>>>>>> >> > > >>> XTDB Developer at *JUXT*
>>>>>> >> > > >>> Email jms@juxt.pro
>>>>>> >> > > >>> Website https://juxt.pro
>>>>>> >> > > >>>
>>>>>> >> > > >>> [image: photo]
>>>>>> >> > > >>>
>>>>>> >> > > >>
>>>>>> >> > > >>
>>>>>> >> > > >> --
>>>>>> >> > > >>
>>>>>> >> > > >> *James Duong*
>>>>>> >> > > >> Lead Software Developer
>>>>>> >> > > >> Bit Quill Technologies Inc.
>>>>>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>>> >> > > >> https://www.bitquilltech.com
>>>>>> >> > > >>
>>>>>> >> > > >> This email message is for the sole use of the intended
>>>>>> recipient(s)
>>>>>> >> and
>>>>>> >> > > >> may contain confidential and privileged information.  Any
>>>>>> >> unauthorized
>>>>>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>>>>>> are
>>>>>> >> not
>>>>>> >> > > the
>>>>>> >> > > >> intended recipient, please contact the sender by reply email and
>>>>>> >> destroy
>>>>>> >> > > >> all copies of the original message.  Thank you.
>>>>>> >> > > >>
>>>>>> >> > > >
>>>>>> >> > > >
>>>>>> >> > > > --
>>>>>> >> > > >
>>>>>> >> > > > *James Duong*
>>>>>> >> > > > Lead Software Developer
>>>>>> >> > > > Bit Quill Technologies Inc.
>>>>>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>>> >> > > > https://www.bitquilltech.com
>>>>>> >> > > >
>>>>>> >> > > > This email message is for the sole use of the intended
>>>>>> recipient(s)
>>>>>> >> and
>>>>>> >> > > may
>>>>>> >> > > > contain confidential and privileged information.  Any unauthorized
>>>>>> >> > > review,
>>>>>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>>>>>> the
>>>>>> >> > > > intended recipient, please contact the sender by reply email and
>>>>>> >> destroy
>>>>>> >> > > > all copies of the original message.  Thank you.
>>>>>> >> > >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> > --
>>>>>> >> >
>>>>>> >> > *James Duong*
>>>>>> >> > Lead Software Developer
>>>>>> >> > Bit Quill Technologies Inc.
>>>>>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>>> >> > https://www.bitquilltech.com
>>>>>> >> >
>>>>>> >> > This email message is for the sole use of the intended recipient(s)
>>>>>> and
>>>>>> >> may
>>>>>> >> > contain confidential and privileged information.  Any unauthorized
>>>>>> >> review,
>>>>>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>>>>>> >> > intended recipient, please contact the sender by reply email and
>>>>>> destroy
>>>>>> >> > all copies of the original message.  Thank you.
>>>>>> >>
>>>>>>
>>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by David Li <li...@apache.org>.
Hi all,

Paul and company have been working on this feature for Flight SQL via support for explicit sessions in Flight SQL. Feedback would be much appreciated on the PR, especially before tackling the second implementation and eventual vote [1]. The current PR implements client/server support for C++.

[1]: https://github.com/apache/arrow/pull/34817

On Wed, Feb 15, 2023, at 01:10, Sutou Kouhei wrote:
> Hi James,
>
> Thanks for sharing your plan! I will wait for an update.
>
> -- 
> kou
>
> In 
>  
> <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
>   "Re: DISCUSS: [FlightSQL] Catalog support" on Tue, 14 Feb 2023 
> 04:53:07 +0000,
>   James Duong <Ja...@improving.com.INVALID> wrote:
>
>> Hi Sutou,
>> 
>> I saw your PostgreSQL project and thought it was quite interesting, especially given the number of PostgreSQL-compatible databases.
>> 
>> Paul Nienaber will be picking up implementation of the catalog feature going forward and can provide an update
>> 
>> Get Outlook for Android<https://aka.ms/AAb9ysg>
>> 
>> ________________________________
>> From: Sutou Kouhei <ko...@clear-code.com>
>> Sent: Thursday, February 9, 2023, 22:25
>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>> 
>> Hi James
>> 
>> Is there any progress of this?
>> 
>> I'm developing a Flight SQL adapter for PostgreSQL:
>> https://github.com/apache/arrow-flight-sql-postgresql
>> 
>> I want to implement session feature for it because opening
>> a session in PostgreSQL is expensive. PostgreSQL uses one
>> process per session. If we open and close a session for
>> each Flight SQL call, we need to start one process for each
>> Flight SQL call.
>> 
>> I noticed that the current Flight SQL specification doesn't
>> provide the standard session support. So I'm interesting in
>> this discussion.
>> 
>> Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13
>> 
>> 
>> Thanks,
>> --
>> kou
>> 
>> In
>>  <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
>>   "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 +0000,
>>   James Duong <Ja...@improving.com.INVALID> wrote:
>> 
>>> Hi David,
>>>
>>> I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.
>>>
>>> I (or another developer) will send an update once those features are ready for demo.
>>> ________________________________
>>> From: David Li <li...@apache.org>
>>> Sent: December 12, 2022 10:07 AM
>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>>>
>>> Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?
>>>
>>> The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.
>>>
>>> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>>>>> Andrew, do we need to look into adding more metadata to indicate
>>>> different query languages? (It's quite a shame that we named this Flight
>>>> SQL at this point...)
>>>>
>>>> TDLR is I don't think trying to explicitly support languages other than SQL
>>>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
>>>> which mostly assume SQL, are one of the key features of FlightSQL, and they
>>>> are likely not as useful for non SQL. I can see the argument to support for
>>>> substrait plans, and it will be interesting to see what use cases benefit
>>>> from that support.
>>>>
>>>> What would make our life easier would be some standard way to pass
>>>> application specific key/value pairs from the JDBC driver to a flight SQL
>>>> backend with each request (perhaps via gRPC headers). This would allow
>>>> passing configuration parameters that were not envisioned in the spec, from
>>>> end user (of the JDBC driver) all the way to our backend.
>>>>
>>>> Thanks again for driving this forward,
>>>> Andrew
>>>>
>>>> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>>>>
>>>>> Hey James, thanks for putting this up.
>>>>>
>>>>> Inline:
>>>>>
>>>>> > The suggestion is to make this part of Flight as an
>>>>> > optional feature, rather than Flight SQL due to its applicability outside
>>>>> > of just database access.
>>>>>
>>>>> Which uses do you see? I see statefulness as a general antipattern here,
>>>>> so I'm wary of introducing it beyond where we need it.
>>>>>
>>>>> > - The Flight client supplies a New-Session header which has key-value
>>>>> pairs
>>>>> > for initial session options. This header can be applied to any RPC call,
>>>>> > but logically should be the first one the client makes.
>>>>>
>>>>> Handshake already effectively serves as this RPC - maybe we could extend
>>>>> it? (I also see Handshake as an antipattern because it's a stateful auth
>>>>> mechanism.)
>>>>>
>>>>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>>>>> server a way to track the persistence of a particular client connection.)
>>>>>
>>>>> > It's a bit asymmetric that creating a new session is done by applying a
>>>>> > header, but closing a session is an RPC call. This was so that session
>>>>> > creation doesn't introduce another round trip before the first real data
>>>>> > request. If there's a way to batch RPC calls it might be better to make
>>>>> > session creation an RPC call.
>>>>>
>>>>> Is this a worrisome amount of overhead?
>>>>>
>>>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>>>>> client generally share the same TCP connection (modulo load balancing
>>>>> behavior, but presumably that is not enabled if you want persistent
>>>>> sessions).
>>>>>
>>>>> On the implementation side, I'd like to avoid baking this in too deeply if
>>>>> at all possible. Ideally it'd be implemented entirely as middleware,
>>>>> possibly making use of an interface so applications can override the
>>>>> session storage (hashtable, Redis, etcd, etc.)
>>>>>
>>>>> > Just to chime in on this, one thing I'm curious about is whether there
>>>>> > will be support for user-defined catalog/schema hierarchy depth?
>>>>>
>>>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>>>>> handle this case - maybe we can handle this by adding a property for the
>>>>> delimiter to SqlInfo?
>>>>>
>>>>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>>>>
>>>>> Andrew, do we need to look into adding more metadata to indicate different
>>>>> query languages? (It's quite a shame that we named this Flight SQL at this
>>>>> point...)
>>>>>
>>>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>>>>> > Sorry for the late reply -- thank you James and David for this
>>>>> discussion.
>>>>> >
>>>>> > I agree that adding Catalog support would be a valuable addition to
>>>>> Flight
>>>>> > SQL, and it recently came up as we begin to implement Flight SQL in
>>>>> > InfluxDB IOx [1].
>>>>> >
>>>>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>>>>> > client APIs (JDBC, ADBC, etc.)
>>>>> >
>>>>> > I agree this would be very valuable, along with a standard way (ideally
>>>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>>>>> > requests.
>>>>> >
>>>>> >> I'd suggest we define session management features explicitly in Flight
>>>>> > (while being optional).
>>>>> >
>>>>> > I agree it is critical that server-side state is not required to
>>>>> implement
>>>>> > FlightSQL. Stateful connections would likely complicate deploying
>>>>> FlightSQL
>>>>> > in distributed systems. I suggest it should be possible to implement any
>>>>> > session management features by sending the entire session state with the
>>>>> > request, if desired.
>>>>> >
>>>>> > I don't have a strong opinion about the merits of including explicit
>>>>> > session management features in FlightSQL. It seems to me that keeping the
>>>>> > API surface of FlightSQL minimal and implementation flexibility maximal
>>>>> > should be the default. However, if JDBC/ODBC driver compatibility would
>>>>> be
>>>>> > improved with explicit state management APIs, then adding them to
>>>>> FlightSQL
>>>>> > seems like a good idea to me.
>>>>> >
>>>>> > Thanks again -- it is amazing to hit some issue in design and then find
>>>>> out
>>>>> > the Arrow community is already hard at work on a solution.
>>>>> >
>>>>> > Andrew
>>>>> >
>>>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>>>>> >
>>>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>>>>> >
>>>>> >> Just to chime in on this, one thing I'm curious about is whether there
>>>>> >> will be support for user-defined catalog/schema hierarchy depth?
>>>>> >>
>>>>> >> This comment that James made does seem reasonable to me
>>>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>>>> >>
>>>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>>>> >>
>>>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>>>>> >> just "Array<String>"
>>>>> >> and the identifier to some element in a data source is always
>>>>> >> fully-qualified:
>>>>> >>
>>>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>>>> >>
>>>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>>>>> >> ["mongo", "db1",  "collection_a", "field_a"]
>>>>> >> ["csv_adapter", "myfile.csv", "col_x"]
>>>>> >>
>>>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>>>>> >> <ja...@bitquilltech.com.invalid> wrote:
>>>>> >> >
>>>>> >> > Our current convention of sending connection properties as headers
>>>>> with
>>>>> >> > every request has the benefit of making statefulness optional, but has
>>>>> >> the
>>>>> >> > drawback of sending redundant, unused properties on requests after the
>>>>> >> > first, which increases the payload size unnecessarily.
>>>>> >> >
>>>>> >> > I'd suggest we define session management features explicitly in Flight
>>>>> >> > (while being optional). The suggestion is to make this part of Flight
>>>>> as
>>>>> >> an
>>>>> >> > optional feature, rather than Flight SQL due to its applicability
>>>>> outside
>>>>> >> > of just database access.
>>>>> >> >
>>>>> >> > Creating a session:
>>>>> >> > - The Flight client supplies a New-Session header which has key-value
>>>>> >> pairs
>>>>> >> > for initial session options. This header can be applied to any RPC
>>>>> call,
>>>>> >> > but logically should be the first one the client makes.
>>>>> >> > - The server should send a Set-Cookie header back containing some
>>>>> >> > server-side representation of the session that the client can use in
>>>>> >> > subsequent requests.
>>>>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>>>>> >> >
>>>>> >> > Modifying session options:
>>>>> >> > - A separate RPC call that takes in a Stream<string, string>
>>>>> representing
>>>>> >> > each session option that is being modified and returns a stream of
>>>>> >> statuses
>>>>> >> > to indicate if the setting change was accepted.
>>>>> >> > - This RPC call is only valid when the Cookie header is used.
>>>>> >> > - It is up to the server to define if a failed session property
>>>>> change is
>>>>> >> > fatal or if other properties can continue to be set.
>>>>> >> >
>>>>> >> > Closing a session:
>>>>> >> > - A separate RPC call that tells the server to drop the session
>>>>> specified
>>>>> >> > by the Cookie header.
>>>>> >> >
>>>>> >> > Notes:
>>>>> >> > A Flight SQL client would check if session management RPCs are
>>>>> supported
>>>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>>>>> to
>>>>> >> do
>>>>> >> > this generically, but there could be an application-specific RPC or
>>>>> >> header
>>>>> >> > that reports this metadata.
>>>>> >> >
>>>>> >> > The O/JDBC and ADBC drivers would need to be updated to
>>>>> programmatically
>>>>> >> > check for session management RPCs. If unsupported, then use the old
>>>>> >> > behavior of sending all properties as headers with each request. If
>>>>> >> > supported, make use of the New-Session header and drop the session
>>>>> when
>>>>> >> > closing the client-side connection.
>>>>> >> >
>>>>> >> > It's a bit asymmetric that creating a new session is done by applying
>>>>> a
>>>>> >> > header, but closing a session is an RPC call. This was so that session
>>>>> >> > creation doesn't introduce another round trip before the first real
>>>>> data
>>>>> >> > request. If there's a way to batch RPC calls it might be better to
>>>>> make
>>>>> >> > session creation an RPC call.
>>>>> >> >
>>>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>>>>> >> >
>>>>> >> > > It sounds reasonable - then there are three points:
>>>>> >> > >
>>>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>>>>> >> client
>>>>> >> > > APIs (JDBC, ADBC, etc.)
>>>>> >> > > - A standard scheme for session data (likely header/cookie-based)
>>>>> >> > > - A mapping from URI parameters and fields to session data
>>>>> >> > >
>>>>> >> > >
>>>>> >> > >
>>>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>>>>> >> > > > Just following up on this and if there are any thoughts.
>>>>> >> > > >
>>>>> >> > > > The purpose would be to standardize how we specify access to some
>>>>> >> named
>>>>> >> > > > logical grouping of data. This would make it easy to model
>>>>> >> catalog/schema
>>>>> >> > > > semantics in Flight SQL.
>>>>> >> > > >
>>>>> >> > > > Having this be part of the connection URI makes it similar to
>>>>> >> specifying
>>>>> >> > > a
>>>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>>>>> >> for
>>>>> >> > > end
>>>>> >> > > > users to work with and modify.
>>>>> >> > > >
>>>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>>>>> jamesd@bitquilltech.com
>>>>> >> >
>>>>> >> > > wrote:
>>>>> >> > > >
>>>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>>>>> in a
>>>>> >> > > path
>>>>> >> > > >> and treat that as a way of specifying a multi-level resource that
>>>>> >> which
>>>>> >> > > the
>>>>> >> > > >> FlightClient is connecting to:
>>>>> >> > > >>
>>>>> >> > > >> eg a connection URI of the form:
>>>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>>>>> >> > > >>
>>>>> >> > > >> The FlightClient could send this path as either a header or a
>>>>> >> session
>>>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>>>>> >> > > Producers
>>>>> >> > > >> could interpret this as a catalog or schema.
>>>>> >> > > >> eg
>>>>> >> > > >> grpc://<host>:<port>/catalog/schema
>>>>> >> > > >>
>>>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>>>>> >> wrote:
>>>>> >> > > >>
>>>>> >> > > >>> Sounds good to me.
>>>>> >> > > >>>
>>>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>>>>> >> > > >>>
>>>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>>>>> a
>>>>> >> bit
>>>>> >> > > >>> intermittent.
>>>>> >> > > >>>
>>>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>>>>> >> wrote:
>>>>> >> > > >>>
>>>>> >> > > >>> > Hey James H.,
>>>>> >> > > >>> >
>>>>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>>>>> >> > > >>> >
>>>>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>>>>> >> 'session'
>>>>> >> > > (I
>>>>> >> > > >>> > guess this will be a little inconsistent with transactions,
>>>>> >> though)
>>>>> >> > > >>> > - Adding RPCs to query session values
>>>>> >> > > >>> > - Adding RPCs to set session values
>>>>> >> > > >>> > - Listing standard values and types
>>>>> >> > > >>> >
>>>>> >> > > >>> > Some things may require more consideration, e.g. transaction
>>>>> >> > > isolation
>>>>> >> > > >>> > might be better off as part of the transaction RPCs than an
>>>>> >> ambient
>>>>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>>>>> >> proposal?
>>>>> >> > > >>> >
>>>>> >> > > >>> > -David
>>>>> >> > > >>> >
>>>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>>>>> >> > > >>> > > Similarly, we're also currently considering how best to
>>>>> >> implement
>>>>> >> > > >>> some of
>>>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>>>>> -
>>>>> >> > > things
>>>>> >> > > >>> like
>>>>> >> > > >>> > > current transaction isolation level, access mode, time zone
>>>>> >> etc,
>>>>> >> > > which
>>>>> >> > > >>> > seem
>>>>> >> > > >>> > > to have similar properties to the (traditional) connection's
>>>>> >> > > current
>>>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>>>>> involving
>>>>> >> the
>>>>> >> > > >>> Flight
>>>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>>>>> >> > > standardised
>>>>> >> > > >>> > > support within Flight SQL itself eventually?
>>>>> >> > > >>> > >
>>>>> >> > > >>> > > Cheers,
>>>>> >> > > >>> > >
>>>>> >> > > >>> > > James
>>>>> >> > > >>> > >
>>>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>>>>> >> wrote:
>>>>> >> > > >>> > >
>>>>> >> > > >>> > >> I think having better support for this makes sense, but
>>>>> >> perhaps we
>>>>> >> > > >>> can
>>>>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>>>>> For
>>>>> >> > > >>> instance,
>>>>> >> > > >>> > in
>>>>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>>>>> Or
>>>>> >> > > rather,
>>>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>>>>> I'd
>>>>> >> > > rather
>>>>> >> > > >>> try
>>>>> >> > > >>> > to
>>>>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>>>>> >> > > >>> > >>
>>>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>>>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>>>>> >> concept of
>>>>> >> > > >>> > catalogs
>>>>> >> > > >>> > >> as
>>>>> >> > > >>> > >> > containers of database schemas. Users can usually
>>>>> specify an
>>>>> >> > > >>> initial
>>>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>>>>> >> > > sometimes
>>>>> >> > > >>> > >> change
>>>>> >> > > >>> > >> > catalogs during the session.
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>>>>> >> limited. The
>>>>> >> > > >>> > protocol
>>>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>>>>> >> > > SqlTypeInfo
>>>>> >> > > >>> for
>>>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>>>>> >> perspective.
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>>>>> >> > > >>> > Additionally,
>>>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>>>>> "initial"
>>>>> >> > > >>> connection
>>>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>>>>> itself
>>>>> >> is
>>>>> >> > > >>> > stateless
>>>>> >> > > >>> > >> > from a connection perspective.
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>>>>> >> some
>>>>> >> > > >>> > changes to
>>>>> >> > > >>> > >> > the Flight SQL protocol:
>>>>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>>>>> >> (perhaps
>>>>> >> > > an
>>>>> >> > > >>> > >> optional
>>>>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>>>>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>>>>> Java
>>>>> >> > > builder
>>>>> >> > > >>> to
>>>>> >> > > >>> > >> allow
>>>>> >> > > >>> > >> > arbitrary application-specific properties.
>>>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>>>>> >> error
>>>>> >> > > >>> codes
>>>>> >> > > >>> > if
>>>>> >> > > >>> > >> > this is not permitted.
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > --
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > *James Duong*
>>>>> >> > > >>> > >> > Lead Software Developer
>>>>> >> > > >>> > >> > Bit Quill Technologies Inc.
>>>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>> >> > > >>> > >> > https://www.bitquilltech.com
>>>>> >> > > >>> > >> >
>>>>> >> > > >>> > >> > This email message is for the sole use of the intended
>>>>> >> > > recipient(s)
>>>>> >> > > >>> > and
>>>>> >> > > >>> > >> may
>>>>> >> > > >>> > >> > contain confidential and privileged information.  Any
>>>>> >> > > unauthorized
>>>>> >> > > >>> > >> review,
>>>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>>>>> are
>>>>> >> not
>>>>> >> > > the
>>>>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>>>>> >> email and
>>>>> >> > > >>> > destroy
>>>>> >> > > >>> > >> > all copies of the original message.  Thank you.
>>>>> >> > > >>> > >>
>>>>> >> > > >>> > >
>>>>> >> > > >>> > >
>>>>> >> > > >>> > > --
>>>>> >> > > >>> > > *James Henderson*
>>>>> >> > > >>> > > XTDB Developer at *JUXT*
>>>>> >> > > >>> > > Email jms@juxt.pro
>>>>> >> > > >>> > > Website https://juxt.pro
>>>>> >> > > >>> > >
>>>>> >> > > >>> > > [image: photo]
>>>>> >> > > >>> >
>>>>> >> > > >>>
>>>>> >> > > >>>
>>>>> >> > > >>> --
>>>>> >> > > >>> *James Henderson*
>>>>> >> > > >>> XTDB Developer at *JUXT*
>>>>> >> > > >>> Email jms@juxt.pro
>>>>> >> > > >>> Website https://juxt.pro
>>>>> >> > > >>>
>>>>> >> > > >>> [image: photo]
>>>>> >> > > >>>
>>>>> >> > > >>
>>>>> >> > > >>
>>>>> >> > > >> --
>>>>> >> > > >>
>>>>> >> > > >> *James Duong*
>>>>> >> > > >> Lead Software Developer
>>>>> >> > > >> Bit Quill Technologies Inc.
>>>>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>> >> > > >> https://www.bitquilltech.com
>>>>> >> > > >>
>>>>> >> > > >> This email message is for the sole use of the intended
>>>>> recipient(s)
>>>>> >> and
>>>>> >> > > >> may contain confidential and privileged information.  Any
>>>>> >> unauthorized
>>>>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>>>>> are
>>>>> >> not
>>>>> >> > > the
>>>>> >> > > >> intended recipient, please contact the sender by reply email and
>>>>> >> destroy
>>>>> >> > > >> all copies of the original message.  Thank you.
>>>>> >> > > >>
>>>>> >> > > >
>>>>> >> > > >
>>>>> >> > > > --
>>>>> >> > > >
>>>>> >> > > > *James Duong*
>>>>> >> > > > Lead Software Developer
>>>>> >> > > > Bit Quill Technologies Inc.
>>>>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>> >> > > > https://www.bitquilltech.com
>>>>> >> > > >
>>>>> >> > > > This email message is for the sole use of the intended
>>>>> recipient(s)
>>>>> >> and
>>>>> >> > > may
>>>>> >> > > > contain confidential and privileged information.  Any unauthorized
>>>>> >> > > review,
>>>>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>>>>> the
>>>>> >> > > > intended recipient, please contact the sender by reply email and
>>>>> >> destroy
>>>>> >> > > > all copies of the original message.  Thank you.
>>>>> >> > >
>>>>> >> >
>>>>> >> >
>>>>> >> > --
>>>>> >> >
>>>>> >> > *James Duong*
>>>>> >> > Lead Software Developer
>>>>> >> > Bit Quill Technologies Inc.
>>>>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>>> >> > https://www.bitquilltech.com
>>>>> >> >
>>>>> >> > This email message is for the sole use of the intended recipient(s)
>>>>> and
>>>>> >> may
>>>>> >> > contain confidential and privileged information.  Any unauthorized
>>>>> >> review,
>>>>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>>>>> >> > intended recipient, please contact the sender by reply email and
>>>>> destroy
>>>>> >> > all copies of the original message.  Thank you.
>>>>> >>
>>>>>
>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by Sutou Kouhei <ko...@clear-code.com>.
Hi James,

Thanks for sharing your plan! I will wait for an update.

-- 
kou

In 
 <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
  "Re: DISCUSS: [FlightSQL] Catalog support" on Tue, 14 Feb 2023 04:53:07 +0000,
  James Duong <Ja...@improving.com.INVALID> wrote:

> Hi Sutou,
> 
> I saw your PostgreSQL project and thought it was quite interesting, especially given the number of PostgreSQL-compatible databases.
> 
> Paul Nienaber will be picking up implementation of the catalog feature going forward and can provide an update
> 
> Get Outlook for Android<https://aka.ms/AAb9ysg>
> 
> ________________________________
> From: Sutou Kouhei <ko...@clear-code.com>
> Sent: Thursday, February 9, 2023, 22:25
> To: dev@arrow.apache.org <de...@arrow.apache.org>
> Subject: Re: DISCUSS: [FlightSQL] Catalog support
> 
> Hi James
> 
> Is there any progress of this?
> 
> I'm developing a Flight SQL adapter for PostgreSQL:
> https://github.com/apache/arrow-flight-sql-postgresql
> 
> I want to implement session feature for it because opening
> a session in PostgreSQL is expensive. PostgreSQL uses one
> process per session. If we open and close a session for
> each Flight SQL call, we need to start one process for each
> Flight SQL call.
> 
> I noticed that the current Flight SQL specification doesn't
> provide the standard session support. So I'm interesting in
> this discussion.
> 
> Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13
> 
> 
> Thanks,
> --
> kou
> 
> In
>  <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
>   "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 +0000,
>   James Duong <Ja...@improving.com.INVALID> wrote:
> 
>> Hi David,
>>
>> I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.
>>
>> I (or another developer) will send an update once those features are ready for demo.
>> ________________________________
>> From: David Li <li...@apache.org>
>> Sent: December 12, 2022 10:07 AM
>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>>
>> Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?
>>
>> The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.
>>
>> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>>>> Andrew, do we need to look into adding more metadata to indicate
>>> different query languages? (It's quite a shame that we named this Flight
>>> SQL at this point...)
>>>
>>> TDLR is I don't think trying to explicitly support languages other than SQL
>>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
>>> which mostly assume SQL, are one of the key features of FlightSQL, and they
>>> are likely not as useful for non SQL. I can see the argument to support for
>>> substrait plans, and it will be interesting to see what use cases benefit
>>> from that support.
>>>
>>> What would make our life easier would be some standard way to pass
>>> application specific key/value pairs from the JDBC driver to a flight SQL
>>> backend with each request (perhaps via gRPC headers). This would allow
>>> passing configuration parameters that were not envisioned in the spec, from
>>> end user (of the JDBC driver) all the way to our backend.
>>>
>>> Thanks again for driving this forward,
>>> Andrew
>>>
>>> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>>>
>>>> Hey James, thanks for putting this up.
>>>>
>>>> Inline:
>>>>
>>>> > The suggestion is to make this part of Flight as an
>>>> > optional feature, rather than Flight SQL due to its applicability outside
>>>> > of just database access.
>>>>
>>>> Which uses do you see? I see statefulness as a general antipattern here,
>>>> so I'm wary of introducing it beyond where we need it.
>>>>
>>>> > - The Flight client supplies a New-Session header which has key-value
>>>> pairs
>>>> > for initial session options. This header can be applied to any RPC call,
>>>> > but logically should be the first one the client makes.
>>>>
>>>> Handshake already effectively serves as this RPC - maybe we could extend
>>>> it? (I also see Handshake as an antipattern because it's a stateful auth
>>>> mechanism.)
>>>>
>>>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>>>> server a way to track the persistence of a particular client connection.)
>>>>
>>>> > It's a bit asymmetric that creating a new session is done by applying a
>>>> > header, but closing a session is an RPC call. This was so that session
>>>> > creation doesn't introduce another round trip before the first real data
>>>> > request. If there's a way to batch RPC calls it might be better to make
>>>> > session creation an RPC call.
>>>>
>>>> Is this a worrisome amount of overhead?
>>>>
>>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>>>> client generally share the same TCP connection (modulo load balancing
>>>> behavior, but presumably that is not enabled if you want persistent
>>>> sessions).
>>>>
>>>> On the implementation side, I'd like to avoid baking this in too deeply if
>>>> at all possible. Ideally it'd be implemented entirely as middleware,
>>>> possibly making use of an interface so applications can override the
>>>> session storage (hashtable, Redis, etcd, etc.)
>>>>
>>>> > Just to chime in on this, one thing I'm curious about is whether there
>>>> > will be support for user-defined catalog/schema hierarchy depth?
>>>>
>>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>>>> handle this case - maybe we can handle this by adding a property for the
>>>> delimiter to SqlInfo?
>>>>
>>>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>>>
>>>> Andrew, do we need to look into adding more metadata to indicate different
>>>> query languages? (It's quite a shame that we named this Flight SQL at this
>>>> point...)
>>>>
>>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>>>> > Sorry for the late reply -- thank you James and David for this
>>>> discussion.
>>>> >
>>>> > I agree that adding Catalog support would be a valuable addition to
>>>> Flight
>>>> > SQL, and it recently came up as we begin to implement Flight SQL in
>>>> > InfluxDB IOx [1].
>>>> >
>>>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>>>> > client APIs (JDBC, ADBC, etc.)
>>>> >
>>>> > I agree this would be very valuable, along with a standard way (ideally
>>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>>>> > requests.
>>>> >
>>>> >> I'd suggest we define session management features explicitly in Flight
>>>> > (while being optional).
>>>> >
>>>> > I agree it is critical that server-side state is not required to
>>>> implement
>>>> > FlightSQL. Stateful connections would likely complicate deploying
>>>> FlightSQL
>>>> > in distributed systems. I suggest it should be possible to implement any
>>>> > session management features by sending the entire session state with the
>>>> > request, if desired.
>>>> >
>>>> > I don't have a strong opinion about the merits of including explicit
>>>> > session management features in FlightSQL. It seems to me that keeping the
>>>> > API surface of FlightSQL minimal and implementation flexibility maximal
>>>> > should be the default. However, if JDBC/ODBC driver compatibility would
>>>> be
>>>> > improved with explicit state management APIs, then adding them to
>>>> FlightSQL
>>>> > seems like a good idea to me.
>>>> >
>>>> > Thanks again -- it is amazing to hit some issue in design and then find
>>>> out
>>>> > the Arrow community is already hard at work on a solution.
>>>> >
>>>> > Andrew
>>>> >
>>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>>>> >
>>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>>>> >
>>>> >> Just to chime in on this, one thing I'm curious about is whether there
>>>> >> will be support for user-defined catalog/schema hierarchy depth?
>>>> >>
>>>> >> This comment that James made does seem reasonable to me
>>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>>> >>
>>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>>> >>
>>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>>>> >> just "Array<String>"
>>>> >> and the identifier to some element in a data source is always
>>>> >> fully-qualified:
>>>> >>
>>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>>> >>
>>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>>>> >> ["mongo", "db1",  "collection_a", "field_a"]
>>>> >> ["csv_adapter", "myfile.csv", "col_x"]
>>>> >>
>>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>>>> >> <ja...@bitquilltech.com.invalid> wrote:
>>>> >> >
>>>> >> > Our current convention of sending connection properties as headers
>>>> with
>>>> >> > every request has the benefit of making statefulness optional, but has
>>>> >> the
>>>> >> > drawback of sending redundant, unused properties on requests after the
>>>> >> > first, which increases the payload size unnecessarily.
>>>> >> >
>>>> >> > I'd suggest we define session management features explicitly in Flight
>>>> >> > (while being optional). The suggestion is to make this part of Flight
>>>> as
>>>> >> an
>>>> >> > optional feature, rather than Flight SQL due to its applicability
>>>> outside
>>>> >> > of just database access.
>>>> >> >
>>>> >> > Creating a session:
>>>> >> > - The Flight client supplies a New-Session header which has key-value
>>>> >> pairs
>>>> >> > for initial session options. This header can be applied to any RPC
>>>> call,
>>>> >> > but logically should be the first one the client makes.
>>>> >> > - The server should send a Set-Cookie header back containing some
>>>> >> > server-side representation of the session that the client can use in
>>>> >> > subsequent requests.
>>>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>>>> >> >
>>>> >> > Modifying session options:
>>>> >> > - A separate RPC call that takes in a Stream<string, string>
>>>> representing
>>>> >> > each session option that is being modified and returns a stream of
>>>> >> statuses
>>>> >> > to indicate if the setting change was accepted.
>>>> >> > - This RPC call is only valid when the Cookie header is used.
>>>> >> > - It is up to the server to define if a failed session property
>>>> change is
>>>> >> > fatal or if other properties can continue to be set.
>>>> >> >
>>>> >> > Closing a session:
>>>> >> > - A separate RPC call that tells the server to drop the session
>>>> specified
>>>> >> > by the Cookie header.
>>>> >> >
>>>> >> > Notes:
>>>> >> > A Flight SQL client would check if session management RPCs are
>>>> supported
>>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>>>> to
>>>> >> do
>>>> >> > this generically, but there could be an application-specific RPC or
>>>> >> header
>>>> >> > that reports this metadata.
>>>> >> >
>>>> >> > The O/JDBC and ADBC drivers would need to be updated to
>>>> programmatically
>>>> >> > check for session management RPCs. If unsupported, then use the old
>>>> >> > behavior of sending all properties as headers with each request. If
>>>> >> > supported, make use of the New-Session header and drop the session
>>>> when
>>>> >> > closing the client-side connection.
>>>> >> >
>>>> >> > It's a bit asymmetric that creating a new session is done by applying
>>>> a
>>>> >> > header, but closing a session is an RPC call. This was so that session
>>>> >> > creation doesn't introduce another round trip before the first real
>>>> data
>>>> >> > request. If there's a way to batch RPC calls it might be better to
>>>> make
>>>> >> > session creation an RPC call.
>>>> >> >
>>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>>>> >> >
>>>> >> > > It sounds reasonable - then there are three points:
>>>> >> > >
>>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>>>> >> client
>>>> >> > > APIs (JDBC, ADBC, etc.)
>>>> >> > > - A standard scheme for session data (likely header/cookie-based)
>>>> >> > > - A mapping from URI parameters and fields to session data
>>>> >> > >
>>>> >> > >
>>>> >> > >
>>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>>>> >> > > > Just following up on this and if there are any thoughts.
>>>> >> > > >
>>>> >> > > > The purpose would be to standardize how we specify access to some
>>>> >> named
>>>> >> > > > logical grouping of data. This would make it easy to model
>>>> >> catalog/schema
>>>> >> > > > semantics in Flight SQL.
>>>> >> > > >
>>>> >> > > > Having this be part of the connection URI makes it similar to
>>>> >> specifying
>>>> >> > > a
>>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>>>> >> for
>>>> >> > > end
>>>> >> > > > users to work with and modify.
>>>> >> > > >
>>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>>>> jamesd@bitquilltech.com
>>>> >> >
>>>> >> > > wrote:
>>>> >> > > >
>>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>>>> in a
>>>> >> > > path
>>>> >> > > >> and treat that as a way of specifying a multi-level resource that
>>>> >> which
>>>> >> > > the
>>>> >> > > >> FlightClient is connecting to:
>>>> >> > > >>
>>>> >> > > >> eg a connection URI of the form:
>>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>>>> >> > > >>
>>>> >> > > >> The FlightClient could send this path as either a header or a
>>>> >> session
>>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>>>> >> > > Producers
>>>> >> > > >> could interpret this as a catalog or schema.
>>>> >> > > >> eg
>>>> >> > > >> grpc://<host>:<port>/catalog/schema
>>>> >> > > >>
>>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>>>> >> wrote:
>>>> >> > > >>
>>>> >> > > >>> Sounds good to me.
>>>> >> > > >>>
>>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>>>> >> > > >>>
>>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>>>> a
>>>> >> bit
>>>> >> > > >>> intermittent.
>>>> >> > > >>>
>>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>>>> >> wrote:
>>>> >> > > >>>
>>>> >> > > >>> > Hey James H.,
>>>> >> > > >>> >
>>>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>>>> >> > > >>> >
>>>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>>>> >> 'session'
>>>> >> > > (I
>>>> >> > > >>> > guess this will be a little inconsistent with transactions,
>>>> >> though)
>>>> >> > > >>> > - Adding RPCs to query session values
>>>> >> > > >>> > - Adding RPCs to set session values
>>>> >> > > >>> > - Listing standard values and types
>>>> >> > > >>> >
>>>> >> > > >>> > Some things may require more consideration, e.g. transaction
>>>> >> > > isolation
>>>> >> > > >>> > might be better off as part of the transaction RPCs than an
>>>> >> ambient
>>>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>>>> >> proposal?
>>>> >> > > >>> >
>>>> >> > > >>> > -David
>>>> >> > > >>> >
>>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>>>> >> > > >>> > > Similarly, we're also currently considering how best to
>>>> >> implement
>>>> >> > > >>> some of
>>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>>>> -
>>>> >> > > things
>>>> >> > > >>> like
>>>> >> > > >>> > > current transaction isolation level, access mode, time zone
>>>> >> etc,
>>>> >> > > which
>>>> >> > > >>> > seem
>>>> >> > > >>> > > to have similar properties to the (traditional) connection's
>>>> >> > > current
>>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>>>> involving
>>>> >> the
>>>> >> > > >>> Flight
>>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>>>> >> > > standardised
>>>> >> > > >>> > > support within Flight SQL itself eventually?
>>>> >> > > >>> > >
>>>> >> > > >>> > > Cheers,
>>>> >> > > >>> > >
>>>> >> > > >>> > > James
>>>> >> > > >>> > >
>>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>>>> >> wrote:
>>>> >> > > >>> > >
>>>> >> > > >>> > >> I think having better support for this makes sense, but
>>>> >> perhaps we
>>>> >> > > >>> can
>>>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>>>> For
>>>> >> > > >>> instance,
>>>> >> > > >>> > in
>>>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>>>> Or
>>>> >> > > rather,
>>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>>>> I'd
>>>> >> > > rather
>>>> >> > > >>> try
>>>> >> > > >>> > to
>>>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>>>> >> > > >>> > >>
>>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>>>> >> concept of
>>>> >> > > >>> > catalogs
>>>> >> > > >>> > >> as
>>>> >> > > >>> > >> > containers of database schemas. Users can usually
>>>> specify an
>>>> >> > > >>> initial
>>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>>>> >> > > sometimes
>>>> >> > > >>> > >> change
>>>> >> > > >>> > >> > catalogs during the session.
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>>>> >> limited. The
>>>> >> > > >>> > protocol
>>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>>>> >> > > SqlTypeInfo
>>>> >> > > >>> for
>>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>>>> >> perspective.
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>>>> >> > > >>> > Additionally,
>>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>>>> "initial"
>>>> >> > > >>> connection
>>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>>>> itself
>>>> >> is
>>>> >> > > >>> > stateless
>>>> >> > > >>> > >> > from a connection perspective.
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>>>> >> some
>>>> >> > > >>> > changes to
>>>> >> > > >>> > >> > the Flight SQL protocol:
>>>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>>>> >> (perhaps
>>>> >> > > an
>>>> >> > > >>> > >> optional
>>>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>>>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>>>> Java
>>>> >> > > builder
>>>> >> > > >>> to
>>>> >> > > >>> > >> allow
>>>> >> > > >>> > >> > arbitrary application-specific properties.
>>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>>>> >> error
>>>> >> > > >>> codes
>>>> >> > > >>> > if
>>>> >> > > >>> > >> > this is not permitted.
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > --
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > *James Duong*
>>>> >> > > >>> > >> > Lead Software Developer
>>>> >> > > >>> > >> > Bit Quill Technologies Inc.
>>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>> >> > > >>> > >> > https://www.bitquilltech.com
>>>> >> > > >>> > >> >
>>>> >> > > >>> > >> > This email message is for the sole use of the intended
>>>> >> > > recipient(s)
>>>> >> > > >>> > and
>>>> >> > > >>> > >> may
>>>> >> > > >>> > >> > contain confidential and privileged information.  Any
>>>> >> > > unauthorized
>>>> >> > > >>> > >> review,
>>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>>>> are
>>>> >> not
>>>> >> > > the
>>>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>>>> >> email and
>>>> >> > > >>> > destroy
>>>> >> > > >>> > >> > all copies of the original message.  Thank you.
>>>> >> > > >>> > >>
>>>> >> > > >>> > >
>>>> >> > > >>> > >
>>>> >> > > >>> > > --
>>>> >> > > >>> > > *James Henderson*
>>>> >> > > >>> > > XTDB Developer at *JUXT*
>>>> >> > > >>> > > Email jms@juxt.pro
>>>> >> > > >>> > > Website https://juxt.pro
>>>> >> > > >>> > >
>>>> >> > > >>> > > [image: photo]
>>>> >> > > >>> >
>>>> >> > > >>>
>>>> >> > > >>>
>>>> >> > > >>> --
>>>> >> > > >>> *James Henderson*
>>>> >> > > >>> XTDB Developer at *JUXT*
>>>> >> > > >>> Email jms@juxt.pro
>>>> >> > > >>> Website https://juxt.pro
>>>> >> > > >>>
>>>> >> > > >>> [image: photo]
>>>> >> > > >>>
>>>> >> > > >>
>>>> >> > > >>
>>>> >> > > >> --
>>>> >> > > >>
>>>> >> > > >> *James Duong*
>>>> >> > > >> Lead Software Developer
>>>> >> > > >> Bit Quill Technologies Inc.
>>>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>> >> > > >> https://www.bitquilltech.com
>>>> >> > > >>
>>>> >> > > >> This email message is for the sole use of the intended
>>>> recipient(s)
>>>> >> and
>>>> >> > > >> may contain confidential and privileged information.  Any
>>>> >> unauthorized
>>>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>>>> are
>>>> >> not
>>>> >> > > the
>>>> >> > > >> intended recipient, please contact the sender by reply email and
>>>> >> destroy
>>>> >> > > >> all copies of the original message.  Thank you.
>>>> >> > > >>
>>>> >> > > >
>>>> >> > > >
>>>> >> > > > --
>>>> >> > > >
>>>> >> > > > *James Duong*
>>>> >> > > > Lead Software Developer
>>>> >> > > > Bit Quill Technologies Inc.
>>>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>> >> > > > https://www.bitquilltech.com
>>>> >> > > >
>>>> >> > > > This email message is for the sole use of the intended
>>>> recipient(s)
>>>> >> and
>>>> >> > > may
>>>> >> > > > contain confidential and privileged information.  Any unauthorized
>>>> >> > > review,
>>>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>>>> the
>>>> >> > > > intended recipient, please contact the sender by reply email and
>>>> >> destroy
>>>> >> > > > all copies of the original message.  Thank you.
>>>> >> > >
>>>> >> >
>>>> >> >
>>>> >> > --
>>>> >> >
>>>> >> > *James Duong*
>>>> >> > Lead Software Developer
>>>> >> > Bit Quill Technologies Inc.
>>>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>>> >> > https://www.bitquilltech.com
>>>> >> >
>>>> >> > This email message is for the sole use of the intended recipient(s)
>>>> and
>>>> >> may
>>>> >> > contain confidential and privileged information.  Any unauthorized
>>>> >> review,
>>>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>>>> >> > intended recipient, please contact the sender by reply email and
>>>> destroy
>>>> >> > all copies of the original message.  Thank you.
>>>> >>
>>>>
> 

Re: DISCUSS: [FlightSQL] Catalog support

Posted by James Duong <Ja...@improving.com.INVALID>.
Hi Sutou,

I saw your PostgreSQL project and thought it was quite interesting, especially given the number of PostgreSQL-compatible databases.

Paul Nienaber will be picking up implementation of the catalog feature going forward and can provide an update

Get Outlook for Android<https://aka.ms/AAb9ysg>

________________________________
From: Sutou Kouhei <ko...@clear-code.com>
Sent: Thursday, February 9, 2023, 22:25
To: dev@arrow.apache.org <de...@arrow.apache.org>
Subject: Re: DISCUSS: [FlightSQL] Catalog support

Hi James

Is there any progress of this?

I'm developing a Flight SQL adapter for PostgreSQL:
https://github.com/apache/arrow-flight-sql-postgresql

I want to implement session feature for it because opening
a session in PostgreSQL is expensive. PostgreSQL uses one
process per session. If we open and close a session for
each Flight SQL call, we need to start one process for each
Flight SQL call.

I noticed that the current Flight SQL specification doesn't
provide the standard session support. So I'm interesting in
this discussion.

Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13


Thanks,
--
kou

In
 <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
  "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 +0000,
  James Duong <Ja...@improving.com.INVALID> wrote:

> Hi David,
>
> I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.
>
> I (or another developer) will send an update once those features are ready for demo.
> ________________________________
> From: David Li <li...@apache.org>
> Sent: December 12, 2022 10:07 AM
> To: dev@arrow.apache.org <de...@arrow.apache.org>
> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>
> Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?
>
> The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.
>
> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>>> Andrew, do we need to look into adding more metadata to indicate
>> different query languages? (It's quite a shame that we named this Flight
>> SQL at this point...)
>>
>> TDLR is I don't think trying to explicitly support languages other than SQL
>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
>> which mostly assume SQL, are one of the key features of FlightSQL, and they
>> are likely not as useful for non SQL. I can see the argument to support for
>> substrait plans, and it will be interesting to see what use cases benefit
>> from that support.
>>
>> What would make our life easier would be some standard way to pass
>> application specific key/value pairs from the JDBC driver to a flight SQL
>> backend with each request (perhaps via gRPC headers). This would allow
>> passing configuration parameters that were not envisioned in the spec, from
>> end user (of the JDBC driver) all the way to our backend.
>>
>> Thanks again for driving this forward,
>> Andrew
>>
>> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>>
>>> Hey James, thanks for putting this up.
>>>
>>> Inline:
>>>
>>> > The suggestion is to make this part of Flight as an
>>> > optional feature, rather than Flight SQL due to its applicability outside
>>> > of just database access.
>>>
>>> Which uses do you see? I see statefulness as a general antipattern here,
>>> so I'm wary of introducing it beyond where we need it.
>>>
>>> > - The Flight client supplies a New-Session header which has key-value
>>> pairs
>>> > for initial session options. This header can be applied to any RPC call,
>>> > but logically should be the first one the client makes.
>>>
>>> Handshake already effectively serves as this RPC - maybe we could extend
>>> it? (I also see Handshake as an antipattern because it's a stateful auth
>>> mechanism.)
>>>
>>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>>> server a way to track the persistence of a particular client connection.)
>>>
>>> > It's a bit asymmetric that creating a new session is done by applying a
>>> > header, but closing a session is an RPC call. This was so that session
>>> > creation doesn't introduce another round trip before the first real data
>>> > request. If there's a way to batch RPC calls it might be better to make
>>> > session creation an RPC call.
>>>
>>> Is this a worrisome amount of overhead?
>>>
>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>>> client generally share the same TCP connection (modulo load balancing
>>> behavior, but presumably that is not enabled if you want persistent
>>> sessions).
>>>
>>> On the implementation side, I'd like to avoid baking this in too deeply if
>>> at all possible. Ideally it'd be implemented entirely as middleware,
>>> possibly making use of an interface so applications can override the
>>> session storage (hashtable, Redis, etcd, etc.)
>>>
>>> > Just to chime in on this, one thing I'm curious about is whether there
>>> > will be support for user-defined catalog/schema hierarchy depth?
>>>
>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>>> handle this case - maybe we can handle this by adding a property for the
>>> delimiter to SqlInfo?
>>>
>>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>>
>>> Andrew, do we need to look into adding more metadata to indicate different
>>> query languages? (It's quite a shame that we named this Flight SQL at this
>>> point...)
>>>
>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>>> > Sorry for the late reply -- thank you James and David for this
>>> discussion.
>>> >
>>> > I agree that adding Catalog support would be a valuable addition to
>>> Flight
>>> > SQL, and it recently came up as we begin to implement Flight SQL in
>>> > InfluxDB IOx [1].
>>> >
>>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>>> > client APIs (JDBC, ADBC, etc.)
>>> >
>>> > I agree this would be very valuable, along with a standard way (ideally
>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>>> > requests.
>>> >
>>> >> I'd suggest we define session management features explicitly in Flight
>>> > (while being optional).
>>> >
>>> > I agree it is critical that server-side state is not required to
>>> implement
>>> > FlightSQL. Stateful connections would likely complicate deploying
>>> FlightSQL
>>> > in distributed systems. I suggest it should be possible to implement any
>>> > session management features by sending the entire session state with the
>>> > request, if desired.
>>> >
>>> > I don't have a strong opinion about the merits of including explicit
>>> > session management features in FlightSQL. It seems to me that keeping the
>>> > API surface of FlightSQL minimal and implementation flexibility maximal
>>> > should be the default. However, if JDBC/ODBC driver compatibility would
>>> be
>>> > improved with explicit state management APIs, then adding them to
>>> FlightSQL
>>> > seems like a good idea to me.
>>> >
>>> > Thanks again -- it is amazing to hit some issue in design and then find
>>> out
>>> > the Arrow community is already hard at work on a solution.
>>> >
>>> > Andrew
>>> >
>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>>> >
>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>>> >
>>> >> Just to chime in on this, one thing I'm curious about is whether there
>>> >> will be support for user-defined catalog/schema hierarchy depth?
>>> >>
>>> >> This comment that James made does seem reasonable to me
>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>> >>
>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>> >>
>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>>> >> just "Array<String>"
>>> >> and the identifier to some element in a data source is always
>>> >> fully-qualified:
>>> >>
>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>> >>
>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>>> >> ["mongo", "db1",  "collection_a", "field_a"]
>>> >> ["csv_adapter", "myfile.csv", "col_x"]
>>> >>
>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>>> >> <ja...@bitquilltech.com.invalid> wrote:
>>> >> >
>>> >> > Our current convention of sending connection properties as headers
>>> with
>>> >> > every request has the benefit of making statefulness optional, but has
>>> >> the
>>> >> > drawback of sending redundant, unused properties on requests after the
>>> >> > first, which increases the payload size unnecessarily.
>>> >> >
>>> >> > I'd suggest we define session management features explicitly in Flight
>>> >> > (while being optional). The suggestion is to make this part of Flight
>>> as
>>> >> an
>>> >> > optional feature, rather than Flight SQL due to its applicability
>>> outside
>>> >> > of just database access.
>>> >> >
>>> >> > Creating a session:
>>> >> > - The Flight client supplies a New-Session header which has key-value
>>> >> pairs
>>> >> > for initial session options. This header can be applied to any RPC
>>> call,
>>> >> > but logically should be the first one the client makes.
>>> >> > - The server should send a Set-Cookie header back containing some
>>> >> > server-side representation of the session that the client can use in
>>> >> > subsequent requests.
>>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>>> >> >
>>> >> > Modifying session options:
>>> >> > - A separate RPC call that takes in a Stream<string, string>
>>> representing
>>> >> > each session option that is being modified and returns a stream of
>>> >> statuses
>>> >> > to indicate if the setting change was accepted.
>>> >> > - This RPC call is only valid when the Cookie header is used.
>>> >> > - It is up to the server to define if a failed session property
>>> change is
>>> >> > fatal or if other properties can continue to be set.
>>> >> >
>>> >> > Closing a session:
>>> >> > - A separate RPC call that tells the server to drop the session
>>> specified
>>> >> > by the Cookie header.
>>> >> >
>>> >> > Notes:
>>> >> > A Flight SQL client would check if session management RPCs are
>>> supported
>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>>> to
>>> >> do
>>> >> > this generically, but there could be an application-specific RPC or
>>> >> header
>>> >> > that reports this metadata.
>>> >> >
>>> >> > The O/JDBC and ADBC drivers would need to be updated to
>>> programmatically
>>> >> > check for session management RPCs. If unsupported, then use the old
>>> >> > behavior of sending all properties as headers with each request. If
>>> >> > supported, make use of the New-Session header and drop the session
>>> when
>>> >> > closing the client-side connection.
>>> >> >
>>> >> > It's a bit asymmetric that creating a new session is done by applying
>>> a
>>> >> > header, but closing a session is an RPC call. This was so that session
>>> >> > creation doesn't introduce another round trip before the first real
>>> data
>>> >> > request. If there's a way to batch RPC calls it might be better to
>>> make
>>> >> > session creation an RPC call.
>>> >> >
>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>>> >> >
>>> >> > > It sounds reasonable - then there are three points:
>>> >> > >
>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>>> >> client
>>> >> > > APIs (JDBC, ADBC, etc.)
>>> >> > > - A standard scheme for session data (likely header/cookie-based)
>>> >> > > - A mapping from URI parameters and fields to session data
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>>> >> > > > Just following up on this and if there are any thoughts.
>>> >> > > >
>>> >> > > > The purpose would be to standardize how we specify access to some
>>> >> named
>>> >> > > > logical grouping of data. This would make it easy to model
>>> >> catalog/schema
>>> >> > > > semantics in Flight SQL.
>>> >> > > >
>>> >> > > > Having this be part of the connection URI makes it similar to
>>> >> specifying
>>> >> > > a
>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>>> >> for
>>> >> > > end
>>> >> > > > users to work with and modify.
>>> >> > > >
>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>>> jamesd@bitquilltech.com
>>> >> >
>>> >> > > wrote:
>>> >> > > >
>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>>> in a
>>> >> > > path
>>> >> > > >> and treat that as a way of specifying a multi-level resource that
>>> >> which
>>> >> > > the
>>> >> > > >> FlightClient is connecting to:
>>> >> > > >>
>>> >> > > >> eg a connection URI of the form:
>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>>> >> > > >>
>>> >> > > >> The FlightClient could send this path as either a header or a
>>> >> session
>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>>> >> > > Producers
>>> >> > > >> could interpret this as a catalog or schema.
>>> >> > > >> eg
>>> >> > > >> grpc://<host>:<port>/catalog/schema
>>> >> > > >>
>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>>> >> wrote:
>>> >> > > >>
>>> >> > > >>> Sounds good to me.
>>> >> > > >>>
>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>>> >> > > >>>
>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>>> a
>>> >> bit
>>> >> > > >>> intermittent.
>>> >> > > >>>
>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>>> >> wrote:
>>> >> > > >>>
>>> >> > > >>> > Hey James H.,
>>> >> > > >>> >
>>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>>> >> > > >>> >
>>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>>> >> 'session'
>>> >> > > (I
>>> >> > > >>> > guess this will be a little inconsistent with transactions,
>>> >> though)
>>> >> > > >>> > - Adding RPCs to query session values
>>> >> > > >>> > - Adding RPCs to set session values
>>> >> > > >>> > - Listing standard values and types
>>> >> > > >>> >
>>> >> > > >>> > Some things may require more consideration, e.g. transaction
>>> >> > > isolation
>>> >> > > >>> > might be better off as part of the transaction RPCs than an
>>> >> ambient
>>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>>> >> proposal?
>>> >> > > >>> >
>>> >> > > >>> > -David
>>> >> > > >>> >
>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>>> >> > > >>> > > Similarly, we're also currently considering how best to
>>> >> implement
>>> >> > > >>> some of
>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>>> -
>>> >> > > things
>>> >> > > >>> like
>>> >> > > >>> > > current transaction isolation level, access mode, time zone
>>> >> etc,
>>> >> > > which
>>> >> > > >>> > seem
>>> >> > > >>> > > to have similar properties to the (traditional) connection's
>>> >> > > current
>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>>> involving
>>> >> the
>>> >> > > >>> Flight
>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>>> >> > > standardised
>>> >> > > >>> > > support within Flight SQL itself eventually?
>>> >> > > >>> > >
>>> >> > > >>> > > Cheers,
>>> >> > > >>> > >
>>> >> > > >>> > > James
>>> >> > > >>> > >
>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>>> >> wrote:
>>> >> > > >>> > >
>>> >> > > >>> > >> I think having better support for this makes sense, but
>>> >> perhaps we
>>> >> > > >>> can
>>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>>> For
>>> >> > > >>> instance,
>>> >> > > >>> > in
>>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>>> Or
>>> >> > > rather,
>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>>> I'd
>>> >> > > rather
>>> >> > > >>> try
>>> >> > > >>> > to
>>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>>> >> > > >>> > >>
>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>>> >> concept of
>>> >> > > >>> > catalogs
>>> >> > > >>> > >> as
>>> >> > > >>> > >> > containers of database schemas. Users can usually
>>> specify an
>>> >> > > >>> initial
>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>>> >> > > sometimes
>>> >> > > >>> > >> change
>>> >> > > >>> > >> > catalogs during the session.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>>> >> limited. The
>>> >> > > >>> > protocol
>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>>> >> > > SqlTypeInfo
>>> >> > > >>> for
>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>>> >> perspective.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>>> >> > > >>> > Additionally,
>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>>> "initial"
>>> >> > > >>> connection
>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>>> itself
>>> >> is
>>> >> > > >>> > stateless
>>> >> > > >>> > >> > from a connection perspective.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>>> >> some
>>> >> > > >>> > changes to
>>> >> > > >>> > >> > the Flight SQL protocol:
>>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>>> >> (perhaps
>>> >> > > an
>>> >> > > >>> > >> optional
>>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>>> Java
>>> >> > > builder
>>> >> > > >>> to
>>> >> > > >>> > >> allow
>>> >> > > >>> > >> > arbitrary application-specific properties.
>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>>> >> error
>>> >> > > >>> codes
>>> >> > > >>> > if
>>> >> > > >>> > >> > this is not permitted.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > --
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > *James Duong*
>>> >> > > >>> > >> > Lead Software Developer
>>> >> > > >>> > >> > Bit Quill Technologies Inc.
>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > >>> > >> > https://www.bitquilltech.com
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > This email message is for the sole use of the intended
>>> >> > > recipient(s)
>>> >> > > >>> > and
>>> >> > > >>> > >> may
>>> >> > > >>> > >> > contain confidential and privileged information.  Any
>>> >> > > unauthorized
>>> >> > > >>> > >> review,
>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>>> are
>>> >> not
>>> >> > > the
>>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>>> >> email and
>>> >> > > >>> > destroy
>>> >> > > >>> > >> > all copies of the original message.  Thank you.
>>> >> > > >>> > >>
>>> >> > > >>> > >
>>> >> > > >>> > >
>>> >> > > >>> > > --
>>> >> > > >>> > > *James Henderson*
>>> >> > > >>> > > XTDB Developer at *JUXT*
>>> >> > > >>> > > Email jms@juxt.pro
>>> >> > > >>> > > Website https://juxt.pro
>>> >> > > >>> > >
>>> >> > > >>> > > [image: photo]
>>> >> > > >>> >
>>> >> > > >>>
>>> >> > > >>>
>>> >> > > >>> --
>>> >> > > >>> *James Henderson*
>>> >> > > >>> XTDB Developer at *JUXT*
>>> >> > > >>> Email jms@juxt.pro
>>> >> > > >>> Website https://juxt.pro
>>> >> > > >>>
>>> >> > > >>> [image: photo]
>>> >> > > >>>
>>> >> > > >>
>>> >> > > >>
>>> >> > > >> --
>>> >> > > >>
>>> >> > > >> *James Duong*
>>> >> > > >> Lead Software Developer
>>> >> > > >> Bit Quill Technologies Inc.
>>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > >> https://www.bitquilltech.com
>>> >> > > >>
>>> >> > > >> This email message is for the sole use of the intended
>>> recipient(s)
>>> >> and
>>> >> > > >> may contain confidential and privileged information.  Any
>>> >> unauthorized
>>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>>> are
>>> >> not
>>> >> > > the
>>> >> > > >> intended recipient, please contact the sender by reply email and
>>> >> destroy
>>> >> > > >> all copies of the original message.  Thank you.
>>> >> > > >>
>>> >> > > >
>>> >> > > >
>>> >> > > > --
>>> >> > > >
>>> >> > > > *James Duong*
>>> >> > > > Lead Software Developer
>>> >> > > > Bit Quill Technologies Inc.
>>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > > https://www.bitquilltech.com
>>> >> > > >
>>> >> > > > This email message is for the sole use of the intended
>>> recipient(s)
>>> >> and
>>> >> > > may
>>> >> > > > contain confidential and privileged information.  Any unauthorized
>>> >> > > review,
>>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>>> the
>>> >> > > > intended recipient, please contact the sender by reply email and
>>> >> destroy
>>> >> > > > all copies of the original message.  Thank you.
>>> >> > >
>>> >> >
>>> >> >
>>> >> > --
>>> >> >
>>> >> > *James Duong*
>>> >> > Lead Software Developer
>>> >> > Bit Quill Technologies Inc.
>>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > https://www.bitquilltech.com
>>> >> >
>>> >> > This email message is for the sole use of the intended recipient(s)
>>> and
>>> >> may
>>> >> > contain confidential and privileged information.  Any unauthorized
>>> >> review,
>>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>>> >> > intended recipient, please contact the sender by reply email and
>>> destroy
>>> >> > all copies of the original message.  Thank you.
>>> >>
>>>


Re: DISCUSS: [FlightSQL] Catalog support

Posted by Sutou Kouhei <ko...@clear-code.com>.
Hi James

Is there any progress of this?

I'm developing a Flight SQL adapter for PostgreSQL:
https://github.com/apache/arrow-flight-sql-postgresql

I want to implement session feature for it because opening
a session in PostgreSQL is expensive. PostgreSQL uses one
process per session. If we open and close a session for
each Flight SQL call, we need to start one process for each
Flight SQL call.

I noticed that the current Flight SQL specification doesn't
provide the standard session support. So I'm interesting in
this discussion.

Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13


Thanks,
-- 
kou

In 
 <CH...@CH2PR13MB3831.namprd13.prod.outlook.com>
  "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 +0000,
  James Duong <Ja...@improving.com.INVALID> wrote:

> Hi David,
> 
> I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.
> 
> I (or another developer) will send an update once those features are ready for demo.
> ________________________________
> From: David Li <li...@apache.org>
> Sent: December 12, 2022 10:07 AM
> To: dev@arrow.apache.org <de...@arrow.apache.org>
> Subject: Re: DISCUSS: [FlightSQL] Catalog support
> 
> Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?
> 
> The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.
> 
> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>>> Andrew, do we need to look into adding more metadata to indicate
>> different query languages? (It's quite a shame that we named this Flight
>> SQL at this point...)
>>
>> TDLR is I don't think trying to explicitly support languages other than SQL
>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
>> which mostly assume SQL, are one of the key features of FlightSQL, and they
>> are likely not as useful for non SQL. I can see the argument to support for
>> substrait plans, and it will be interesting to see what use cases benefit
>> from that support.
>>
>> What would make our life easier would be some standard way to pass
>> application specific key/value pairs from the JDBC driver to a flight SQL
>> backend with each request (perhaps via gRPC headers). This would allow
>> passing configuration parameters that were not envisioned in the spec, from
>> end user (of the JDBC driver) all the way to our backend.
>>
>> Thanks again for driving this forward,
>> Andrew
>>
>> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>>
>>> Hey James, thanks for putting this up.
>>>
>>> Inline:
>>>
>>> > The suggestion is to make this part of Flight as an
>>> > optional feature, rather than Flight SQL due to its applicability outside
>>> > of just database access.
>>>
>>> Which uses do you see? I see statefulness as a general antipattern here,
>>> so I'm wary of introducing it beyond where we need it.
>>>
>>> > - The Flight client supplies a New-Session header which has key-value
>>> pairs
>>> > for initial session options. This header can be applied to any RPC call,
>>> > but logically should be the first one the client makes.
>>>
>>> Handshake already effectively serves as this RPC - maybe we could extend
>>> it? (I also see Handshake as an antipattern because it's a stateful auth
>>> mechanism.)
>>>
>>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>>> server a way to track the persistence of a particular client connection.)
>>>
>>> > It's a bit asymmetric that creating a new session is done by applying a
>>> > header, but closing a session is an RPC call. This was so that session
>>> > creation doesn't introduce another round trip before the first real data
>>> > request. If there's a way to batch RPC calls it might be better to make
>>> > session creation an RPC call.
>>>
>>> Is this a worrisome amount of overhead?
>>>
>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>>> client generally share the same TCP connection (modulo load balancing
>>> behavior, but presumably that is not enabled if you want persistent
>>> sessions).
>>>
>>> On the implementation side, I'd like to avoid baking this in too deeply if
>>> at all possible. Ideally it'd be implemented entirely as middleware,
>>> possibly making use of an interface so applications can override the
>>> session storage (hashtable, Redis, etcd, etc.)
>>>
>>> > Just to chime in on this, one thing I'm curious about is whether there
>>> > will be support for user-defined catalog/schema hierarchy depth?
>>>
>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>>> handle this case - maybe we can handle this by adding a property for the
>>> delimiter to SqlInfo?
>>>
>>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>>
>>> Andrew, do we need to look into adding more metadata to indicate different
>>> query languages? (It's quite a shame that we named this Flight SQL at this
>>> point...)
>>>
>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>>> > Sorry for the late reply -- thank you James and David for this
>>> discussion.
>>> >
>>> > I agree that adding Catalog support would be a valuable addition to
>>> Flight
>>> > SQL, and it recently came up as we begin to implement Flight SQL in
>>> > InfluxDB IOx [1].
>>> >
>>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>>> > client APIs (JDBC, ADBC, etc.)
>>> >
>>> > I agree this would be very valuable, along with a standard way (ideally
>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>>> > requests.
>>> >
>>> >> I'd suggest we define session management features explicitly in Flight
>>> > (while being optional).
>>> >
>>> > I agree it is critical that server-side state is not required to
>>> implement
>>> > FlightSQL. Stateful connections would likely complicate deploying
>>> FlightSQL
>>> > in distributed systems. I suggest it should be possible to implement any
>>> > session management features by sending the entire session state with the
>>> > request, if desired.
>>> >
>>> > I don't have a strong opinion about the merits of including explicit
>>> > session management features in FlightSQL. It seems to me that keeping the
>>> > API surface of FlightSQL minimal and implementation flexibility maximal
>>> > should be the default. However, if JDBC/ODBC driver compatibility would
>>> be
>>> > improved with explicit state management APIs, then adding them to
>>> FlightSQL
>>> > seems like a good idea to me.
>>> >
>>> > Thanks again -- it is amazing to hit some issue in design and then find
>>> out
>>> > the Arrow community is already hard at work on a solution.
>>> >
>>> > Andrew
>>> >
>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>>> >
>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>>> >
>>> >> Just to chime in on this, one thing I'm curious about is whether there
>>> >> will be support for user-defined catalog/schema hierarchy depth?
>>> >>
>>> >> This comment that James made does seem reasonable to me
>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>> >>
>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>> >>
>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>>> >> just "Array<String>"
>>> >> and the identifier to some element in a data source is always
>>> >> fully-qualified:
>>> >>
>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>> >>
>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>>> >> ["mongo", "db1",  "collection_a", "field_a"]
>>> >> ["csv_adapter", "myfile.csv", "col_x"]
>>> >>
>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>>> >> <ja...@bitquilltech.com.invalid> wrote:
>>> >> >
>>> >> > Our current convention of sending connection properties as headers
>>> with
>>> >> > every request has the benefit of making statefulness optional, but has
>>> >> the
>>> >> > drawback of sending redundant, unused properties on requests after the
>>> >> > first, which increases the payload size unnecessarily.
>>> >> >
>>> >> > I'd suggest we define session management features explicitly in Flight
>>> >> > (while being optional). The suggestion is to make this part of Flight
>>> as
>>> >> an
>>> >> > optional feature, rather than Flight SQL due to its applicability
>>> outside
>>> >> > of just database access.
>>> >> >
>>> >> > Creating a session:
>>> >> > - The Flight client supplies a New-Session header which has key-value
>>> >> pairs
>>> >> > for initial session options. This header can be applied to any RPC
>>> call,
>>> >> > but logically should be the first one the client makes.
>>> >> > - The server should send a Set-Cookie header back containing some
>>> >> > server-side representation of the session that the client can use in
>>> >> > subsequent requests.
>>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>>> >> >
>>> >> > Modifying session options:
>>> >> > - A separate RPC call that takes in a Stream<string, string>
>>> representing
>>> >> > each session option that is being modified and returns a stream of
>>> >> statuses
>>> >> > to indicate if the setting change was accepted.
>>> >> > - This RPC call is only valid when the Cookie header is used.
>>> >> > - It is up to the server to define if a failed session property
>>> change is
>>> >> > fatal or if other properties can continue to be set.
>>> >> >
>>> >> > Closing a session:
>>> >> > - A separate RPC call that tells the server to drop the session
>>> specified
>>> >> > by the Cookie header.
>>> >> >
>>> >> > Notes:
>>> >> > A Flight SQL client would check if session management RPCs are
>>> supported
>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>>> to
>>> >> do
>>> >> > this generically, but there could be an application-specific RPC or
>>> >> header
>>> >> > that reports this metadata.
>>> >> >
>>> >> > The O/JDBC and ADBC drivers would need to be updated to
>>> programmatically
>>> >> > check for session management RPCs. If unsupported, then use the old
>>> >> > behavior of sending all properties as headers with each request. If
>>> >> > supported, make use of the New-Session header and drop the session
>>> when
>>> >> > closing the client-side connection.
>>> >> >
>>> >> > It's a bit asymmetric that creating a new session is done by applying
>>> a
>>> >> > header, but closing a session is an RPC call. This was so that session
>>> >> > creation doesn't introduce another round trip before the first real
>>> data
>>> >> > request. If there's a way to batch RPC calls it might be better to
>>> make
>>> >> > session creation an RPC call.
>>> >> >
>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>>> >> >
>>> >> > > It sounds reasonable - then there are three points:
>>> >> > >
>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>>> >> client
>>> >> > > APIs (JDBC, ADBC, etc.)
>>> >> > > - A standard scheme for session data (likely header/cookie-based)
>>> >> > > - A mapping from URI parameters and fields to session data
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>>> >> > > > Just following up on this and if there are any thoughts.
>>> >> > > >
>>> >> > > > The purpose would be to standardize how we specify access to some
>>> >> named
>>> >> > > > logical grouping of data. This would make it easy to model
>>> >> catalog/schema
>>> >> > > > semantics in Flight SQL.
>>> >> > > >
>>> >> > > > Having this be part of the connection URI makes it similar to
>>> >> specifying
>>> >> > > a
>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>>> >> for
>>> >> > > end
>>> >> > > > users to work with and modify.
>>> >> > > >
>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>>> jamesd@bitquilltech.com
>>> >> >
>>> >> > > wrote:
>>> >> > > >
>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>>> in a
>>> >> > > path
>>> >> > > >> and treat that as a way of specifying a multi-level resource that
>>> >> which
>>> >> > > the
>>> >> > > >> FlightClient is connecting to:
>>> >> > > >>
>>> >> > > >> eg a connection URI of the form:
>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>>> >> > > >>
>>> >> > > >> The FlightClient could send this path as either a header or a
>>> >> session
>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>>> >> > > Producers
>>> >> > > >> could interpret this as a catalog or schema.
>>> >> > > >> eg
>>> >> > > >> grpc://<host>:<port>/catalog/schema
>>> >> > > >>
>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>>> >> wrote:
>>> >> > > >>
>>> >> > > >>> Sounds good to me.
>>> >> > > >>>
>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>>> >> > > >>>
>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>>> a
>>> >> bit
>>> >> > > >>> intermittent.
>>> >> > > >>>
>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>>> >> wrote:
>>> >> > > >>>
>>> >> > > >>> > Hey James H.,
>>> >> > > >>> >
>>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>>> >> > > >>> >
>>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>>> >> 'session'
>>> >> > > (I
>>> >> > > >>> > guess this will be a little inconsistent with transactions,
>>> >> though)
>>> >> > > >>> > - Adding RPCs to query session values
>>> >> > > >>> > - Adding RPCs to set session values
>>> >> > > >>> > - Listing standard values and types
>>> >> > > >>> >
>>> >> > > >>> > Some things may require more consideration, e.g. transaction
>>> >> > > isolation
>>> >> > > >>> > might be better off as part of the transaction RPCs than an
>>> >> ambient
>>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>>> >> proposal?
>>> >> > > >>> >
>>> >> > > >>> > -David
>>> >> > > >>> >
>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>>> >> > > >>> > > Similarly, we're also currently considering how best to
>>> >> implement
>>> >> > > >>> some of
>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>>> -
>>> >> > > things
>>> >> > > >>> like
>>> >> > > >>> > > current transaction isolation level, access mode, time zone
>>> >> etc,
>>> >> > > which
>>> >> > > >>> > seem
>>> >> > > >>> > > to have similar properties to the (traditional) connection's
>>> >> > > current
>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>>> involving
>>> >> the
>>> >> > > >>> Flight
>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>>> >> > > standardised
>>> >> > > >>> > > support within Flight SQL itself eventually?
>>> >> > > >>> > >
>>> >> > > >>> > > Cheers,
>>> >> > > >>> > >
>>> >> > > >>> > > James
>>> >> > > >>> > >
>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>>> >> wrote:
>>> >> > > >>> > >
>>> >> > > >>> > >> I think having better support for this makes sense, but
>>> >> perhaps we
>>> >> > > >>> can
>>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>>> For
>>> >> > > >>> instance,
>>> >> > > >>> > in
>>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>>> Or
>>> >> > > rather,
>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>>> I'd
>>> >> > > rather
>>> >> > > >>> try
>>> >> > > >>> > to
>>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>>> >> > > >>> > >>
>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>>> >> concept of
>>> >> > > >>> > catalogs
>>> >> > > >>> > >> as
>>> >> > > >>> > >> > containers of database schemas. Users can usually
>>> specify an
>>> >> > > >>> initial
>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>>> >> > > sometimes
>>> >> > > >>> > >> change
>>> >> > > >>> > >> > catalogs during the session.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>>> >> limited. The
>>> >> > > >>> > protocol
>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>>> >> > > SqlTypeInfo
>>> >> > > >>> for
>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>>> >> perspective.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>>> >> > > >>> > Additionally,
>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>>> "initial"
>>> >> > > >>> connection
>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>>> itself
>>> >> is
>>> >> > > >>> > stateless
>>> >> > > >>> > >> > from a connection perspective.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>>> >> some
>>> >> > > >>> > changes to
>>> >> > > >>> > >> > the Flight SQL protocol:
>>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>>> >> (perhaps
>>> >> > > an
>>> >> > > >>> > >> optional
>>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>>> Java
>>> >> > > builder
>>> >> > > >>> to
>>> >> > > >>> > >> allow
>>> >> > > >>> > >> > arbitrary application-specific properties.
>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>>> >> error
>>> >> > > >>> codes
>>> >> > > >>> > if
>>> >> > > >>> > >> > this is not permitted.
>>> >> > > >>> > >> >
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > --
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > *James Duong*
>>> >> > > >>> > >> > Lead Software Developer
>>> >> > > >>> > >> > Bit Quill Technologies Inc.
>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > >>> > >> > https://www.bitquilltech.com
>>> >> > > >>> > >> >
>>> >> > > >>> > >> > This email message is for the sole use of the intended
>>> >> > > recipient(s)
>>> >> > > >>> > and
>>> >> > > >>> > >> may
>>> >> > > >>> > >> > contain confidential and privileged information.  Any
>>> >> > > unauthorized
>>> >> > > >>> > >> review,
>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>>> are
>>> >> not
>>> >> > > the
>>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>>> >> email and
>>> >> > > >>> > destroy
>>> >> > > >>> > >> > all copies of the original message.  Thank you.
>>> >> > > >>> > >>
>>> >> > > >>> > >
>>> >> > > >>> > >
>>> >> > > >>> > > --
>>> >> > > >>> > > *James Henderson*
>>> >> > > >>> > > XTDB Developer at *JUXT*
>>> >> > > >>> > > Email jms@juxt.pro
>>> >> > > >>> > > Website https://juxt.pro
>>> >> > > >>> > >
>>> >> > > >>> > > [image: photo]
>>> >> > > >>> >
>>> >> > > >>>
>>> >> > > >>>
>>> >> > > >>> --
>>> >> > > >>> *James Henderson*
>>> >> > > >>> XTDB Developer at *JUXT*
>>> >> > > >>> Email jms@juxt.pro
>>> >> > > >>> Website https://juxt.pro
>>> >> > > >>>
>>> >> > > >>> [image: photo]
>>> >> > > >>>
>>> >> > > >>
>>> >> > > >>
>>> >> > > >> --
>>> >> > > >>
>>> >> > > >> *James Duong*
>>> >> > > >> Lead Software Developer
>>> >> > > >> Bit Quill Technologies Inc.
>>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > >> https://www.bitquilltech.com
>>> >> > > >>
>>> >> > > >> This email message is for the sole use of the intended
>>> recipient(s)
>>> >> and
>>> >> > > >> may contain confidential and privileged information.  Any
>>> >> unauthorized
>>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>>> are
>>> >> not
>>> >> > > the
>>> >> > > >> intended recipient, please contact the sender by reply email and
>>> >> destroy
>>> >> > > >> all copies of the original message.  Thank you.
>>> >> > > >>
>>> >> > > >
>>> >> > > >
>>> >> > > > --
>>> >> > > >
>>> >> > > > *James Duong*
>>> >> > > > Lead Software Developer
>>> >> > > > Bit Quill Technologies Inc.
>>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > > > https://www.bitquilltech.com
>>> >> > > >
>>> >> > > > This email message is for the sole use of the intended
>>> recipient(s)
>>> >> and
>>> >> > > may
>>> >> > > > contain confidential and privileged information.  Any unauthorized
>>> >> > > review,
>>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>>> the
>>> >> > > > intended recipient, please contact the sender by reply email and
>>> >> destroy
>>> >> > > > all copies of the original message.  Thank you.
>>> >> > >
>>> >> >
>>> >> >
>>> >> > --
>>> >> >
>>> >> > *James Duong*
>>> >> > Lead Software Developer
>>> >> > Bit Quill Technologies Inc.
>>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>>> >> > https://www.bitquilltech.com
>>> >> >
>>> >> > This email message is for the sole use of the intended recipient(s)
>>> and
>>> >> may
>>> >> > contain confidential and privileged information.  Any unauthorized
>>> >> review,
>>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>>> >> > intended recipient, please contact the sender by reply email and
>>> destroy
>>> >> > all copies of the original message.  Thank you.
>>> >>
>>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by Andrew Lamb <al...@influxdata.com>.
As an update to this thread, we found we could support our version of
catalogs using JDBC URL parameters, which are passed to the service as gRPC
headers by the JDBC driver. I made a PR [1] proposing updates the docs to
make this feature more discoverable


Any URI parameters that are not handled by the driver are passed to
the Flight SQL service as gRPC headers. For example, the following URI ::

  jdbc:arrow-flight-sql://localhost:12345/?useEncryption=0&database=mydb

This will connect without authentication or encryption, to a Flight
SQL service running on ``localhost`` on port 12345. Each request will
also include a `database=mydb` gRPC header.

Andrew

[1] https://github.com/apache/arrow/pull/34026

On Tue, Dec 13, 2022 at 9:02 AM Andrew Lamb <al...@influxdata.com> wrote:

> Thank you both for pushing this forward
>
> On Mon, Dec 12, 2022 at 1:14 PM James Duong
> <Ja...@improving.com.invalid> wrote:
>
>> Hi David,
>>
>> I've written up the URI parsing in C++ and started adding session
>> management messages. I'm also planning on having the
>> ClientCookieMiddlewareFactory be able to report if sessions are enabled on
>> the server.
>>
>> I (or another developer) will send an update once those features are
>> ready for demo.
>> ________________________________
>> From: David Li <li...@apache.org>
>> Sent: December 12, 2022 10:07 AM
>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>>
>> Following up here, James are you interested in putting up a draft PR for
>> the Flight SQL URI format and for session management?
>>
>> The Flight SQL URI format would then also cover Andrew's use case. And if
>> someone wants to draw up a PR to the JDBC driver to enable arbitrary
>> properties, I can review that too.
>>
>> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>> >> Andrew, do we need to look into adding more metadata to indicate
>> > different query languages? (It's quite a shame that we named this Flight
>> > SQL at this point...)
>> >
>> > TDLR is I don't think trying to explicitly support languages other than
>> SQL
>> > in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC
>> drivers,
>> > which mostly assume SQL, are one of the key features of FlightSQL, and
>> they
>> > are likely not as useful for non SQL. I can see the argument to support
>> for
>> > substrait plans, and it will be interesting to see what use cases
>> benefit
>> > from that support.
>> >
>> > What would make our life easier would be some standard way to pass
>> > application specific key/value pairs from the JDBC driver to a flight
>> SQL
>> > backend with each request (perhaps via gRPC headers). This would allow
>> > passing configuration parameters that were not envisioned in the spec,
>> from
>> > end user (of the JDBC driver) all the way to our backend.
>> >
>> > Thanks again for driving this forward,
>> > Andrew
>> >
>> > On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>> >
>> >> Hey James, thanks for putting this up.
>> >>
>> >> Inline:
>> >>
>> >> > The suggestion is to make this part of Flight as an
>> >> > optional feature, rather than Flight SQL due to its applicability
>> outside
>> >> > of just database access.
>> >>
>> >> Which uses do you see? I see statefulness as a general antipattern
>> here,
>> >> so I'm wary of introducing it beyond where we need it.
>> >>
>> >> > - The Flight client supplies a New-Session header which has key-value
>> >> pairs
>> >> > for initial session options. This header can be applied to any RPC
>> call,
>> >> > but logically should be the first one the client makes.
>> >>
>> >> Handshake already effectively serves as this RPC - maybe we could
>> extend
>> >> it? (I also see Handshake as an antipattern because it's a stateful
>> auth
>> >> mechanism.)
>> >>
>> >> Should the session timeout/be on a lease? (gRPC doesn't really give the
>> >> server a way to track the persistence of a particular client
>> connection.)
>> >>
>> >> > It's a bit asymmetric that creating a new session is done by
>> applying a
>> >> > header, but closing a session is an RPC call. This was so that
>> session
>> >> > creation doesn't introduce another round trip before the first real
>> data
>> >> > request. If there's a way to batch RPC calls it might be better to
>> make
>> >> > session creation an RPC call.
>> >>
>> >> Is this a worrisome amount of overhead?
>> >>
>> >> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>> >> client generally share the same TCP connection (modulo load balancing
>> >> behavior, but presumably that is not enabled if you want persistent
>> >> sessions).
>> >>
>> >> On the implementation side, I'd like to avoid baking this in too
>> deeply if
>> >> at all possible. Ideally it'd be implemented entirely as middleware,
>> >> possibly making use of an interface so applications can override the
>> >> session storage (hashtable, Redis, etcd, etc.)
>> >>
>> >> > Just to chime in on this, one thing I'm curious about is whether
>> there
>> >> > will be support for user-defined catalog/schema hierarchy depth?
>> >>
>> >> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>> >> handle this case - maybe we can handle this by adding a property for
>> the
>> >> delimiter to SqlInfo?
>> >>
>> >> > https://github.com/influxdata/influxdb_iox/issues/6102
>> >>
>> >> Andrew, do we need to look into adding more metadata to indicate
>> different
>> >> query languages? (It's quite a shame that we named this Flight SQL at
>> this
>> >> point...)
>> >>
>> >> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>> >> > Sorry for the late reply -- thank you James and David for this
>> >> discussion.
>> >> >
>> >> > I agree that adding Catalog support would be a valuable addition to
>> >> Flight
>> >> > SQL, and it recently came up as we begin to implement Flight SQL in
>> >> > InfluxDB IOx [1].
>> >> >
>> >> >> - A standard URI scheme for Flight SQL that can be used by multiple
>> >> > client APIs (JDBC, ADBC, etc.)
>> >> >
>> >> > I agree this would be very valuable, along with a standard way
>> (ideally
>> >> > with HTTP headers) to send this information as part of the FlightSQL
>> gRPC
>> >> > requests.
>> >> >
>> >> >> I'd suggest we define session management features explicitly in
>> Flight
>> >> > (while being optional).
>> >> >
>> >> > I agree it is critical that server-side state is not required to
>> >> implement
>> >> > FlightSQL. Stateful connections would likely complicate deploying
>> >> FlightSQL
>> >> > in distributed systems. I suggest it should be possible to implement
>> any
>> >> > session management features by sending the entire session state with
>> the
>> >> > request, if desired.
>> >> >
>> >> > I don't have a strong opinion about the merits of including explicit
>> >> > session management features in FlightSQL. It seems to me that
>> keeping the
>> >> > API surface of FlightSQL minimal and implementation flexibility
>> maximal
>> >> > should be the default. However, if JDBC/ODBC driver compatibility
>> would
>> >> be
>> >> > improved with explicit state management APIs, then adding them to
>> >> FlightSQL
>> >> > seems like a good idea to me.
>> >> >
>> >> > Thanks again -- it is amazing to hit some issue in design and then
>> find
>> >> out
>> >> > the Arrow community is already hard at work on a solution.
>> >> >
>> >> > Andrew
>> >> >
>> >> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>> >> >
>> >> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com>
>> wrote:
>> >> >
>> >> >> Just to chime in on this, one thing I'm curious about is whether
>> there
>> >> >> will be support for user-defined catalog/schema hierarchy depth?
>> >> >>
>> >> >> This comment that James made does seem reasonable to me
>> >> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>> >> >>
>> >> >> Trino/Presto does a similar thing
>> (jdbc:trino://localhost:8080/tpch/sf1)
>> >> >>
>> >> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>> >> >> just "Array<String>"
>> >> >> and the identifier to some element in a data source is always
>> >> >> fully-qualified:
>> >> >>
>> >> >>
>> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>> >> >>
>> >> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>> >> >> ["mongo", "db1",  "collection_a", "field_a"]
>> >> >> ["csv_adapter", "myfile.csv", "col_x"]
>> >> >>
>> >> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>> >> >> <ja...@bitquilltech.com.invalid> wrote:
>> >> >> >
>> >> >> > Our current convention of sending connection properties as headers
>> >> with
>> >> >> > every request has the benefit of making statefulness optional,
>> but has
>> >> >> the
>> >> >> > drawback of sending redundant, unused properties on requests
>> after the
>> >> >> > first, which increases the payload size unnecessarily.
>> >> >> >
>> >> >> > I'd suggest we define session management features explicitly in
>> Flight
>> >> >> > (while being optional). The suggestion is to make this part of
>> Flight
>> >> as
>> >> >> an
>> >> >> > optional feature, rather than Flight SQL due to its applicability
>> >> outside
>> >> >> > of just database access.
>> >> >> >
>> >> >> > Creating a session:
>> >> >> > - The Flight client supplies a New-Session header which has
>> key-value
>> >> >> pairs
>> >> >> > for initial session options. This header can be applied to any RPC
>> >> call,
>> >> >> > but logically should be the first one the client makes.
>> >> >> > - The server should send a Set-Cookie header back containing some
>> >> >> > server-side representation of the session that the client can use
>> in
>> >> >> > subsequent requests.
>> >> >> > - The path specified in the URI is sent as a "Catalog" session
>> option.
>> >> >> >
>> >> >> > Modifying session options:
>> >> >> > - A separate RPC call that takes in a Stream<string, string>
>> >> representing
>> >> >> > each session option that is being modified and returns a stream of
>> >> >> statuses
>> >> >> > to indicate if the setting change was accepted.
>> >> >> > - This RPC call is only valid when the Cookie header is used.
>> >> >> > - It is up to the server to define if a failed session property
>> >> change is
>> >> >> > fatal or if other properties can continue to be set.
>> >> >> >
>> >> >> > Closing a session:
>> >> >> > - A separate RPC call that tells the server to drop the session
>> >> specified
>> >> >> > by the Cookie header.
>> >> >> >
>> >> >> > Notes:
>> >> >> > A Flight SQL client would check if session management RPCs are
>> >> supported
>> >> >> > through a new GetSqlInfo property. A Flight client doesn't have a
>> way
>> >> to
>> >> >> do
>> >> >> > this generically, but there could be an application-specific RPC
>> or
>> >> >> header
>> >> >> > that reports this metadata.
>> >> >> >
>> >> >> > The O/JDBC and ADBC drivers would need to be updated to
>> >> programmatically
>> >> >> > check for session management RPCs. If unsupported, then use the
>> old
>> >> >> > behavior of sending all properties as headers with each request.
>> If
>> >> >> > supported, make use of the New-Session header and drop the session
>> >> when
>> >> >> > closing the client-side connection.
>> >> >> >
>> >> >> > It's a bit asymmetric that creating a new session is done by
>> applying
>> >> a
>> >> >> > header, but closing a session is an RPC call. This was so that
>> session
>> >> >> > creation doesn't introduce another round trip before the first
>> real
>> >> data
>> >> >> > request. If there's a way to batch RPC calls it might be better to
>> >> make
>> >> >> > session creation an RPC call.
>> >> >> >
>> >> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org>
>> wrote:
>> >> >> >
>> >> >> > > It sounds reasonable - then there are three points:
>> >> >> > >
>> >> >> > > - A standard URI scheme for Flight SQL that can be used by
>> multiple
>> >> >> client
>> >> >> > > APIs (JDBC, ADBC, etc.)
>> >> >> > > - A standard scheme for session data (likely
>> header/cookie-based)
>> >> >> > > - A mapping from URI parameters and fields to session data
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>> >> >> > > > Just following up on this and if there are any thoughts.
>> >> >> > > >
>> >> >> > > > The purpose would be to standardize how we specify access to
>> some
>> >> >> named
>> >> >> > > > logical grouping of data. This would make it easy to model
>> >> >> catalog/schema
>> >> >> > > > semantics in Flight SQL.
>> >> >> > > >
>> >> >> > > > Having this be part of the connection URI makes it similar to
>> >> >> specifying
>> >> >> > > a
>> >> >> > > > resource in an HTTP URL (ie an endpoint) which should make it
>> easy
>> >> >> for
>> >> >> > > end
>> >> >> > > > users to work with and modify.
>> >> >> > > >
>> >> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>> >> jamesd@bitquilltech.com
>> >> >> >
>> >> >> > > wrote:
>> >> >> > > >
>> >> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI
>> take
>> >> in a
>> >> >> > > path
>> >> >> > > >> and treat that as a way of specifying a multi-level resource
>> that
>> >> >> which
>> >> >> > > the
>> >> >> > > >> FlightClient is connecting to:
>> >> >> > > >>
>> >> >> > > >> eg a connection URI of the form:
>> >> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>> >> >> > > >>
>> >> >> > > >> The FlightClient could send this path as either a header or a
>> >> >> session
>> >> >> > > >> property (with a neutral name like 'resource-path'). Flight
>> SQL
>> >> >> > > Producers
>> >> >> > > >> could interpret this as a catalog or schema.
>> >> >> > > >> eg
>> >> >> > > >> grpc://<host>:<port>/catalog/schema
>> >> >> > > >>
>> >> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <
>> jms@juxt.pro>
>> >> >> wrote:
>> >> >> > > >>
>> >> >> > > >>> Sounds good to me.
>> >> >> > > >>>
>> >> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>> >> >> > > >>>
>> >> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so
>> might be
>> >> a
>> >> >> bit
>> >> >> > > >>> intermittent.
>> >> >> > > >>>
>> >> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <lidavidm@apache.org
>> >
>> >> >> wrote:
>> >> >> > > >>>
>> >> >> > > >>> > Hey James H.,
>> >> >> > > >>> >
>> >> >> > > >>> > That would make sense to me. So it sounds like we'd want
>> >> >> > > >>> >
>> >> >> > > >>> > - Formal specification of using cookies/headers to mark a
>> >> >> 'session'
>> >> >> > > (I
>> >> >> > > >>> > guess this will be a little inconsistent with
>> transactions,
>> >> >> though)
>> >> >> > > >>> > - Adding RPCs to query session values
>> >> >> > > >>> > - Adding RPCs to set session values
>> >> >> > > >>> > - Listing standard values and types
>> >> >> > > >>> >
>> >> >> > > >>> > Some things may require more consideration, e.g.
>> transaction
>> >> >> > > isolation
>> >> >> > > >>> > might be better off as part of the transaction RPCs than
>> an
>> >> >> ambient
>> >> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>> >> >> proposal?
>> >> >> > > >>> >
>> >> >> > > >>> > -David
>> >> >> > > >>> >
>> >> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>> >> >> > > >>> > > Similarly, we're also currently considering how best to
>> >> >> implement
>> >> >> > > >>> some of
>> >> >> > > >>> > > the SQL standard session variables in our Flight SQL
>> server
>> >> -
>> >> >> > > things
>> >> >> > > >>> like
>> >> >> > > >>> > > current transaction isolation level, access mode, time
>> zone
>> >> >> etc,
>> >> >> > > which
>> >> >> > > >>> > seem
>> >> >> > > >>> > > to have similar properties to the (traditional)
>> connection's
>> >> >> > > current
>> >> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>> >> involving
>> >> >> the
>> >> >> > > >>> Flight
>> >> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>> >> >> > > standardised
>> >> >> > > >>> > > support within Flight SQL itself eventually?
>> >> >> > > >>> > >
>> >> >> > > >>> > > Cheers,
>> >> >> > > >>> > >
>> >> >> > > >>> > > James
>> >> >> > > >>> > >
>> >> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <
>> lidavidm@apache.org>
>> >> >> wrote:
>> >> >> > > >>> > >
>> >> >> > > >>> > >> I think having better support for this makes sense, but
>> >> >> perhaps we
>> >> >> > > >>> can
>> >> >> > > >>> > >> find a way to make it not tied to the connection
>> itself?
>> >> For
>> >> >> > > >>> instance,
>> >> >> > > >>> > in
>> >> >> > > >>> > >> the same way transactions were implemented (as a
>> handle).
>> >> Or
>> >> >> > > rather,
>> >> >> > > >>> > >> instead of adding connection statefulness to Flight
>> RPC,
>> >> I'd
>> >> >> > > rather
>> >> >> > > >>> try
>> >> >> > > >>> > to
>> >> >> > > >>> > >> work within the gRPC/RPC paradigm.
>> >> >> > > >>> > >>
>> >> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>> >> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>> >> >> concept of
>> >> >> > > >>> > catalogs
>> >> >> > > >>> > >> as
>> >> >> > > >>> > >> > containers of database schemas. Users can usually
>> >> specify an
>> >> >> > > >>> initial
>> >> >> > > >>> > >> > catalog during the connection process, list
>> catalogs, and
>> >> >> > > sometimes
>> >> >> > > >>> > >> change
>> >> >> > > >>> > >> > catalogs during the session.
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>> >> >> limited. The
>> >> >> > > >>> > protocol
>> >> >> > > >>> > >> > provides a way to list catalogs as well as metadata
>> in
>> >> >> > > SqlTypeInfo
>> >> >> > > >>> for
>> >> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>> >> >> perspective.
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the
>> catalog.
>> >> >> > > >>> > Additionally,
>> >> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>> >> "initial"
>> >> >> > > >>> connection
>> >> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>> >> itself
>> >> >> is
>> >> >> > > >>> > stateless
>> >> >> > > >>> > >> > from a connection perspective.
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to
>> make
>> >> >> some
>> >> >> > > >>> > changes to
>> >> >> > > >>> > >> > the Flight SQL protocol:
>> >> >> > > >>> > >> > - Introduce the concept of connection-time properties
>> >> >> (perhaps
>> >> >> > > an
>> >> >> > > >>> > >> optional
>> >> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>> >> >> > > >>> > >> > - Related to the above, expand the connection URL and
>> >> Java
>> >> >> > > builder
>> >> >> > > >>> to
>> >> >> > > >>> > >> allow
>> >> >> > > >>> > >> > arbitrary application-specific properties.
>> >> >> > > >>> > >> > - Add optional RPCs for changing the catalog and
>> relevant
>> >> >> error
>> >> >> > > >>> codes
>> >> >> > > >>> > if
>> >> >> > > >>> > >> > this is not permitted.
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > --
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > *James Duong*
>> >> >> > > >>> > >> > Lead Software Developer
>> >> >> > > >>> > >> > Bit Quill Technologies Inc.
>> >> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> >> > > >>> > >> > https://www.bitquilltech.com
>> >> >> > > >>> > >> >
>> >> >> > > >>> > >> > This email message is for the sole use of the
>> intended
>> >> >> > > recipient(s)
>> >> >> > > >>> > and
>> >> >> > > >>> > >> may
>> >> >> > > >>> > >> > contain confidential and privileged information.  Any
>> >> >> > > unauthorized
>> >> >> > > >>> > >> review,
>> >> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If
>> you
>> >> are
>> >> >> not
>> >> >> > > the
>> >> >> > > >>> > >> > intended recipient, please contact the sender by
>> reply
>> >> >> email and
>> >> >> > > >>> > destroy
>> >> >> > > >>> > >> > all copies of the original message.  Thank you.
>> >> >> > > >>> > >>
>> >> >> > > >>> > >
>> >> >> > > >>> > >
>> >> >> > > >>> > > --
>> >> >> > > >>> > > *James Henderson*
>> >> >> > > >>> > > XTDB Developer at *JUXT*
>> >> >> > > >>> > > Email jms@juxt.pro
>> >> >> > > >>> > > Website https://juxt.pro
>> >> >> > > >>> > >
>> >> >> > > >>> > > [image: photo]
>> >> >> > > >>> >
>> >> >> > > >>>
>> >> >> > > >>>
>> >> >> > > >>> --
>> >> >> > > >>> *James Henderson*
>> >> >> > > >>> XTDB Developer at *JUXT*
>> >> >> > > >>> Email jms@juxt.pro
>> >> >> > > >>> Website https://juxt.pro
>> >> >> > > >>>
>> >> >> > > >>> [image: photo]
>> >> >> > > >>>
>> >> >> > > >>
>> >> >> > > >>
>> >> >> > > >> --
>> >> >> > > >>
>> >> >> > > >> *James Duong*
>> >> >> > > >> Lead Software Developer
>> >> >> > > >> Bit Quill Technologies Inc.
>> >> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> >> > > >> https://www.bitquilltech.com
>> >> >> > > >>
>> >> >> > > >> This email message is for the sole use of the intended
>> >> recipient(s)
>> >> >> and
>> >> >> > > >> may contain confidential and privileged information.  Any
>> >> >> unauthorized
>> >> >> > > >> review, use, disclosure, or distribution is prohibited.  If
>> you
>> >> are
>> >> >> not
>> >> >> > > the
>> >> >> > > >> intended recipient, please contact the sender by reply email
>> and
>> >> >> destroy
>> >> >> > > >> all copies of the original message.  Thank you.
>> >> >> > > >>
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > --
>> >> >> > > >
>> >> >> > > > *James Duong*
>> >> >> > > > Lead Software Developer
>> >> >> > > > Bit Quill Technologies Inc.
>> >> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> >> > > > https://www.bitquilltech.com
>> >> >> > > >
>> >> >> > > > This email message is for the sole use of the intended
>> >> recipient(s)
>> >> >> and
>> >> >> > > may
>> >> >> > > > contain confidential and privileged information.  Any
>> unauthorized
>> >> >> > > review,
>> >> >> > > > use, disclosure, or distribution is prohibited.  If you are
>> not
>> >> the
>> >> >> > > > intended recipient, please contact the sender by reply email
>> and
>> >> >> destroy
>> >> >> > > > all copies of the original message.  Thank you.
>> >> >> > >
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > *James Duong*
>> >> >> > Lead Software Developer
>> >> >> > Bit Quill Technologies Inc.
>> >> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> >> > https://www.bitquilltech.com
>> >> >> >
>> >> >> > This email message is for the sole use of the intended
>> recipient(s)
>> >> and
>> >> >> may
>> >> >> > contain confidential and privileged information.  Any unauthorized
>> >> >> review,
>> >> >> > use, disclosure, or distribution is prohibited.  If you are not
>> the
>> >> >> > intended recipient, please contact the sender by reply email and
>> >> destroy
>> >> >> > all copies of the original message.  Thank you.
>> >> >>
>> >>
>>
>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by Andrew Lamb <al...@influxdata.com>.
Thank you both for pushing this forward

On Mon, Dec 12, 2022 at 1:14 PM James Duong
<Ja...@improving.com.invalid> wrote:

> Hi David,
>
> I've written up the URI parsing in C++ and started adding session
> management messages. I'm also planning on having the
> ClientCookieMiddlewareFactory be able to report if sessions are enabled on
> the server.
>
> I (or another developer) will send an update once those features are ready
> for demo.
> ________________________________
> From: David Li <li...@apache.org>
> Sent: December 12, 2022 10:07 AM
> To: dev@arrow.apache.org <de...@arrow.apache.org>
> Subject: Re: DISCUSS: [FlightSQL] Catalog support
>
> Following up here, James are you interested in putting up a draft PR for
> the Flight SQL URI format and for session management?
>
> The Flight SQL URI format would then also cover Andrew's use case. And if
> someone wants to draw up a PR to the JDBC driver to enable arbitrary
> properties, I can review that too.
>
> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
> >> Andrew, do we need to look into adding more metadata to indicate
> > different query languages? (It's quite a shame that we named this Flight
> > SQL at this point...)
> >
> > TDLR is I don't think trying to explicitly support languages other than
> SQL
> > in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC
> drivers,
> > which mostly assume SQL, are one of the key features of FlightSQL, and
> they
> > are likely not as useful for non SQL. I can see the argument to support
> for
> > substrait plans, and it will be interesting to see what use cases benefit
> > from that support.
> >
> > What would make our life easier would be some standard way to pass
> > application specific key/value pairs from the JDBC driver to a flight SQL
> > backend with each request (perhaps via gRPC headers). This would allow
> > passing configuration parameters that were not envisioned in the spec,
> from
> > end user (of the JDBC driver) all the way to our backend.
> >
> > Thanks again for driving this forward,
> > Andrew
> >
> > On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
> >
> >> Hey James, thanks for putting this up.
> >>
> >> Inline:
> >>
> >> > The suggestion is to make this part of Flight as an
> >> > optional feature, rather than Flight SQL due to its applicability
> outside
> >> > of just database access.
> >>
> >> Which uses do you see? I see statefulness as a general antipattern here,
> >> so I'm wary of introducing it beyond where we need it.
> >>
> >> > - The Flight client supplies a New-Session header which has key-value
> >> pairs
> >> > for initial session options. This header can be applied to any RPC
> call,
> >> > but logically should be the first one the client makes.
> >>
> >> Handshake already effectively serves as this RPC - maybe we could extend
> >> it? (I also see Handshake as an antipattern because it's a stateful auth
> >> mechanism.)
> >>
> >> Should the session timeout/be on a lease? (gRPC doesn't really give the
> >> server a way to track the persistence of a particular client
> connection.)
> >>
> >> > It's a bit asymmetric that creating a new session is done by applying
> a
> >> > header, but closing a session is an RPC call. This was so that session
> >> > creation doesn't introduce another round trip before the first real
> data
> >> > request. If there's a way to batch RPC calls it might be better to
> make
> >> > session creation an RPC call.
> >>
> >> Is this a worrisome amount of overhead?
> >>
> >> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
> >> client generally share the same TCP connection (modulo load balancing
> >> behavior, but presumably that is not enabled if you want persistent
> >> sessions).
> >>
> >> On the implementation side, I'd like to avoid baking this in too deeply
> if
> >> at all possible. Ideally it'd be implemented entirely as middleware,
> >> possibly making use of an interface so applications can override the
> >> session storage (hashtable, Redis, etcd, etc.)
> >>
> >> > Just to chime in on this, one thing I'm curious about is whether there
> >> > will be support for user-defined catalog/schema hierarchy depth?
> >>
> >> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
> >> handle this case - maybe we can handle this by adding a property for the
> >> delimiter to SqlInfo?
> >>
> >> > https://github.com/influxdata/influxdb_iox/issues/6102
> >>
> >> Andrew, do we need to look into adding more metadata to indicate
> different
> >> query languages? (It's quite a shame that we named this Flight SQL at
> this
> >> point...)
> >>
> >> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
> >> > Sorry for the late reply -- thank you James and David for this
> >> discussion.
> >> >
> >> > I agree that adding Catalog support would be a valuable addition to
> >> Flight
> >> > SQL, and it recently came up as we begin to implement Flight SQL in
> >> > InfluxDB IOx [1].
> >> >
> >> >> - A standard URI scheme for Flight SQL that can be used by multiple
> >> > client APIs (JDBC, ADBC, etc.)
> >> >
> >> > I agree this would be very valuable, along with a standard way
> (ideally
> >> > with HTTP headers) to send this information as part of the FlightSQL
> gRPC
> >> > requests.
> >> >
> >> >> I'd suggest we define session management features explicitly in
> Flight
> >> > (while being optional).
> >> >
> >> > I agree it is critical that server-side state is not required to
> >> implement
> >> > FlightSQL. Stateful connections would likely complicate deploying
> >> FlightSQL
> >> > in distributed systems. I suggest it should be possible to implement
> any
> >> > session management features by sending the entire session state with
> the
> >> > request, if desired.
> >> >
> >> > I don't have a strong opinion about the merits of including explicit
> >> > session management features in FlightSQL. It seems to me that keeping
> the
> >> > API surface of FlightSQL minimal and implementation flexibility
> maximal
> >> > should be the default. However, if JDBC/ODBC driver compatibility
> would
> >> be
> >> > improved with explicit state management APIs, then adding them to
> >> FlightSQL
> >> > seems like a good idea to me.
> >> >
> >> > Thanks again -- it is amazing to hit some issue in design and then
> find
> >> out
> >> > the Arrow community is already hard at work on a solution.
> >> >
> >> > Andrew
> >> >
> >> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
> >> >
> >> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com>
> wrote:
> >> >
> >> >> Just to chime in on this, one thing I'm curious about is whether
> there
> >> >> will be support for user-defined catalog/schema hierarchy depth?
> >> >>
> >> >> This comment that James made does seem reasonable to me
> >> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
> >> >>
> >> >> Trino/Presto does a similar thing
> (jdbc:trino://localhost:8080/tpch/sf1)
> >> >>
> >> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
> >> >> just "Array<String>"
> >> >> and the identifier to some element in a data source is always
> >> >> fully-qualified:
> >> >>
> >> >>
> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
> >> >>
> >> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
> >> >> ["mongo", "db1",  "collection_a", "field_a"]
> >> >> ["csv_adapter", "myfile.csv", "col_x"]
> >> >>
> >> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
> >> >> <ja...@bitquilltech.com.invalid> wrote:
> >> >> >
> >> >> > Our current convention of sending connection properties as headers
> >> with
> >> >> > every request has the benefit of making statefulness optional, but
> has
> >> >> the
> >> >> > drawback of sending redundant, unused properties on requests after
> the
> >> >> > first, which increases the payload size unnecessarily.
> >> >> >
> >> >> > I'd suggest we define session management features explicitly in
> Flight
> >> >> > (while being optional). The suggestion is to make this part of
> Flight
> >> as
> >> >> an
> >> >> > optional feature, rather than Flight SQL due to its applicability
> >> outside
> >> >> > of just database access.
> >> >> >
> >> >> > Creating a session:
> >> >> > - The Flight client supplies a New-Session header which has
> key-value
> >> >> pairs
> >> >> > for initial session options. This header can be applied to any RPC
> >> call,
> >> >> > but logically should be the first one the client makes.
> >> >> > - The server should send a Set-Cookie header back containing some
> >> >> > server-side representation of the session that the client can use
> in
> >> >> > subsequent requests.
> >> >> > - The path specified in the URI is sent as a "Catalog" session
> option.
> >> >> >
> >> >> > Modifying session options:
> >> >> > - A separate RPC call that takes in a Stream<string, string>
> >> representing
> >> >> > each session option that is being modified and returns a stream of
> >> >> statuses
> >> >> > to indicate if the setting change was accepted.
> >> >> > - This RPC call is only valid when the Cookie header is used.
> >> >> > - It is up to the server to define if a failed session property
> >> change is
> >> >> > fatal or if other properties can continue to be set.
> >> >> >
> >> >> > Closing a session:
> >> >> > - A separate RPC call that tells the server to drop the session
> >> specified
> >> >> > by the Cookie header.
> >> >> >
> >> >> > Notes:
> >> >> > A Flight SQL client would check if session management RPCs are
> >> supported
> >> >> > through a new GetSqlInfo property. A Flight client doesn't have a
> way
> >> to
> >> >> do
> >> >> > this generically, but there could be an application-specific RPC or
> >> >> header
> >> >> > that reports this metadata.
> >> >> >
> >> >> > The O/JDBC and ADBC drivers would need to be updated to
> >> programmatically
> >> >> > check for session management RPCs. If unsupported, then use the old
> >> >> > behavior of sending all properties as headers with each request. If
> >> >> > supported, make use of the New-Session header and drop the session
> >> when
> >> >> > closing the client-side connection.
> >> >> >
> >> >> > It's a bit asymmetric that creating a new session is done by
> applying
> >> a
> >> >> > header, but closing a session is an RPC call. This was so that
> session
> >> >> > creation doesn't introduce another round trip before the first real
> >> data
> >> >> > request. If there's a way to batch RPC calls it might be better to
> >> make
> >> >> > session creation an RPC call.
> >> >> >
> >> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org>
> wrote:
> >> >> >
> >> >> > > It sounds reasonable - then there are three points:
> >> >> > >
> >> >> > > - A standard URI scheme for Flight SQL that can be used by
> multiple
> >> >> client
> >> >> > > APIs (JDBC, ADBC, etc.)
> >> >> > > - A standard scheme for session data (likely header/cookie-based)
> >> >> > > - A mapping from URI parameters and fields to session data
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
> >> >> > > > Just following up on this and if there are any thoughts.
> >> >> > > >
> >> >> > > > The purpose would be to standardize how we specify access to
> some
> >> >> named
> >> >> > > > logical grouping of data. This would make it easy to model
> >> >> catalog/schema
> >> >> > > > semantics in Flight SQL.
> >> >> > > >
> >> >> > > > Having this be part of the connection URI makes it similar to
> >> >> specifying
> >> >> > > a
> >> >> > > > resource in an HTTP URL (ie an endpoint) which should make it
> easy
> >> >> for
> >> >> > > end
> >> >> > > > users to work with and modify.
> >> >> > > >
> >> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
> >> jamesd@bitquilltech.com
> >> >> >
> >> >> > > wrote:
> >> >> > > >
> >> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI
> take
> >> in a
> >> >> > > path
> >> >> > > >> and treat that as a way of specifying a multi-level resource
> that
> >> >> which
> >> >> > > the
> >> >> > > >> FlightClient is connecting to:
> >> >> > > >>
> >> >> > > >> eg a connection URI of the form:
> >> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
> >> >> > > >>
> >> >> > > >> The FlightClient could send this path as either a header or a
> >> >> session
> >> >> > > >> property (with a neutral name like 'resource-path'). Flight
> SQL
> >> >> > > Producers
> >> >> > > >> could interpret this as a catalog or schema.
> >> >> > > >> eg
> >> >> > > >> grpc://<host>:<port>/catalog/schema
> >> >> > > >>
> >> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jms@juxt.pro
> >
> >> >> wrote:
> >> >> > > >>
> >> >> > > >>> Sounds good to me.
> >> >> > > >>>
> >> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
> >> >> > > >>>
> >> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might
> be
> >> a
> >> >> bit
> >> >> > > >>> intermittent.
> >> >> > > >>>
> >> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
> >> >> wrote:
> >> >> > > >>>
> >> >> > > >>> > Hey James H.,
> >> >> > > >>> >
> >> >> > > >>> > That would make sense to me. So it sounds like we'd want
> >> >> > > >>> >
> >> >> > > >>> > - Formal specification of using cookies/headers to mark a
> >> >> 'session'
> >> >> > > (I
> >> >> > > >>> > guess this will be a little inconsistent with transactions,
> >> >> though)
> >> >> > > >>> > - Adding RPCs to query session values
> >> >> > > >>> > - Adding RPCs to set session values
> >> >> > > >>> > - Listing standard values and types
> >> >> > > >>> >
> >> >> > > >>> > Some things may require more consideration, e.g.
> transaction
> >> >> > > isolation
> >> >> > > >>> > might be better off as part of the transaction RPCs than an
> >> >> ambient
> >> >> > > >>> > property. Are you interested in writing up a (sketch of a)
> >> >> proposal?
> >> >> > > >>> >
> >> >> > > >>> > -David
> >> >> > > >>> >
> >> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
> >> >> > > >>> > > Similarly, we're also currently considering how best to
> >> >> implement
> >> >> > > >>> some of
> >> >> > > >>> > > the SQL standard session variables in our Flight SQL
> server
> >> -
> >> >> > > things
> >> >> > > >>> like
> >> >> > > >>> > > current transaction isolation level, access mode, time
> zone
> >> >> etc,
> >> >> > > which
> >> >> > > >>> > seem
> >> >> > > >>> > > to have similar properties to the (traditional)
> connection's
> >> >> > > current
> >> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
> >> involving
> >> >> the
> >> >> > > >>> Flight
> >> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
> >> >> > > standardised
> >> >> > > >>> > > support within Flight SQL itself eventually?
> >> >> > > >>> > >
> >> >> > > >>> > > Cheers,
> >> >> > > >>> > >
> >> >> > > >>> > > James
> >> >> > > >>> > >
> >> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <
> lidavidm@apache.org>
> >> >> wrote:
> >> >> > > >>> > >
> >> >> > > >>> > >> I think having better support for this makes sense, but
> >> >> perhaps we
> >> >> > > >>> can
> >> >> > > >>> > >> find a way to make it not tied to the connection itself?
> >> For
> >> >> > > >>> instance,
> >> >> > > >>> > in
> >> >> > > >>> > >> the same way transactions were implemented (as a
> handle).
> >> Or
> >> >> > > rather,
> >> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
> >> I'd
> >> >> > > rather
> >> >> > > >>> try
> >> >> > > >>> > to
> >> >> > > >>> > >> work within the gRPC/RPC paradigm.
> >> >> > > >>> > >>
> >> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
> >> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
> >> >> concept of
> >> >> > > >>> > catalogs
> >> >> > > >>> > >> as
> >> >> > > >>> > >> > containers of database schemas. Users can usually
> >> specify an
> >> >> > > >>> initial
> >> >> > > >>> > >> > catalog during the connection process, list catalogs,
> and
> >> >> > > sometimes
> >> >> > > >>> > >> change
> >> >> > > >>> > >> > catalogs during the session.
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
> >> >> limited. The
> >> >> > > >>> > protocol
> >> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
> >> >> > > SqlTypeInfo
> >> >> > > >>> for
> >> >> > > >>> > >> > reporting how catalogs are supported from a syntax
> >> >> perspective.
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the
> catalog.
> >> >> > > >>> > Additionally,
> >> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
> >> "initial"
> >> >> > > >>> connection
> >> >> > > >>> > >> > properties (such as a starting catalog) since Flight
> >> itself
> >> >> is
> >> >> > > >>> > stateless
> >> >> > > >>> > >> > from a connection perspective.
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to
> make
> >> >> some
> >> >> > > >>> > changes to
> >> >> > > >>> > >> > the Flight SQL protocol:
> >> >> > > >>> > >> > - Introduce the concept of connection-time properties
> >> >> (perhaps
> >> >> > > an
> >> >> > > >>> > >> optional
> >> >> > > >>> > >> > RPC for Flight SQL applications that need this)
> >> >> > > >>> > >> > - Related to the above, expand the connection URL and
> >> Java
> >> >> > > builder
> >> >> > > >>> to
> >> >> > > >>> > >> allow
> >> >> > > >>> > >> > arbitrary application-specific properties.
> >> >> > > >>> > >> > - Add optional RPCs for changing the catalog and
> relevant
> >> >> error
> >> >> > > >>> codes
> >> >> > > >>> > if
> >> >> > > >>> > >> > this is not permitted.
> >> >> > > >>> > >> >
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > --
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > *James Duong*
> >> >> > > >>> > >> > Lead Software Developer
> >> >> > > >>> > >> > Bit Quill Technologies Inc.
> >> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> >> > > >>> > >> > https://www.bitquilltech.com
> >> >> > > >>> > >> >
> >> >> > > >>> > >> > This email message is for the sole use of the intended
> >> >> > > recipient(s)
> >> >> > > >>> > and
> >> >> > > >>> > >> may
> >> >> > > >>> > >> > contain confidential and privileged information.  Any
> >> >> > > unauthorized
> >> >> > > >>> > >> review,
> >> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If
> you
> >> are
> >> >> not
> >> >> > > the
> >> >> > > >>> > >> > intended recipient, please contact the sender by reply
> >> >> email and
> >> >> > > >>> > destroy
> >> >> > > >>> > >> > all copies of the original message.  Thank you.
> >> >> > > >>> > >>
> >> >> > > >>> > >
> >> >> > > >>> > >
> >> >> > > >>> > > --
> >> >> > > >>> > > *James Henderson*
> >> >> > > >>> > > XTDB Developer at *JUXT*
> >> >> > > >>> > > Email jms@juxt.pro
> >> >> > > >>> > > Website https://juxt.pro
> >> >> > > >>> > >
> >> >> > > >>> > > [image: photo]
> >> >> > > >>> >
> >> >> > > >>>
> >> >> > > >>>
> >> >> > > >>> --
> >> >> > > >>> *James Henderson*
> >> >> > > >>> XTDB Developer at *JUXT*
> >> >> > > >>> Email jms@juxt.pro
> >> >> > > >>> Website https://juxt.pro
> >> >> > > >>>
> >> >> > > >>> [image: photo]
> >> >> > > >>>
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> --
> >> >> > > >>
> >> >> > > >> *James Duong*
> >> >> > > >> Lead Software Developer
> >> >> > > >> Bit Quill Technologies Inc.
> >> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> >> > > >> https://www.bitquilltech.com
> >> >> > > >>
> >> >> > > >> This email message is for the sole use of the intended
> >> recipient(s)
> >> >> and
> >> >> > > >> may contain confidential and privileged information.  Any
> >> >> unauthorized
> >> >> > > >> review, use, disclosure, or distribution is prohibited.  If
> you
> >> are
> >> >> not
> >> >> > > the
> >> >> > > >> intended recipient, please contact the sender by reply email
> and
> >> >> destroy
> >> >> > > >> all copies of the original message.  Thank you.
> >> >> > > >>
> >> >> > > >
> >> >> > > >
> >> >> > > > --
> >> >> > > >
> >> >> > > > *James Duong*
> >> >> > > > Lead Software Developer
> >> >> > > > Bit Quill Technologies Inc.
> >> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> >> > > > https://www.bitquilltech.com
> >> >> > > >
> >> >> > > > This email message is for the sole use of the intended
> >> recipient(s)
> >> >> and
> >> >> > > may
> >> >> > > > contain confidential and privileged information.  Any
> unauthorized
> >> >> > > review,
> >> >> > > > use, disclosure, or distribution is prohibited.  If you are not
> >> the
> >> >> > > > intended recipient, please contact the sender by reply email
> and
> >> >> destroy
> >> >> > > > all copies of the original message.  Thank you.
> >> >> > >
> >> >> >
> >> >> >
> >> >> > --
> >> >> >
> >> >> > *James Duong*
> >> >> > Lead Software Developer
> >> >> > Bit Quill Technologies Inc.
> >> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> >> > https://www.bitquilltech.com
> >> >> >
> >> >> > This email message is for the sole use of the intended recipient(s)
> >> and
> >> >> may
> >> >> > contain confidential and privileged information.  Any unauthorized
> >> >> review,
> >> >> > use, disclosure, or distribution is prohibited.  If you are not the
> >> >> > intended recipient, please contact the sender by reply email and
> >> destroy
> >> >> > all copies of the original message.  Thank you.
> >> >>
> >>
>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by James Duong <Ja...@improving.com.INVALID>.
Hi David,

I've written up the URI parsing in C++ and started adding session management messages. I'm also planning on having the ClientCookieMiddlewareFactory be able to report if sessions are enabled on the server.

I (or another developer) will send an update once those features are ready for demo.
________________________________
From: David Li <li...@apache.org>
Sent: December 12, 2022 10:07 AM
To: dev@arrow.apache.org <de...@arrow.apache.org>
Subject: Re: DISCUSS: [FlightSQL] Catalog support

Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management?

The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.

On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>> Andrew, do we need to look into adding more metadata to indicate
> different query languages? (It's quite a shame that we named this Flight
> SQL at this point...)
>
> TDLR is I don't think trying to explicitly support languages other than SQL
> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
> which mostly assume SQL, are one of the key features of FlightSQL, and they
> are likely not as useful for non SQL. I can see the argument to support for
> substrait plans, and it will be interesting to see what use cases benefit
> from that support.
>
> What would make our life easier would be some standard way to pass
> application specific key/value pairs from the JDBC driver to a flight SQL
> backend with each request (perhaps via gRPC headers). This would allow
> passing configuration parameters that were not envisioned in the spec, from
> end user (of the JDBC driver) all the way to our backend.
>
> Thanks again for driving this forward,
> Andrew
>
> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>
>> Hey James, thanks for putting this up.
>>
>> Inline:
>>
>> > The suggestion is to make this part of Flight as an
>> > optional feature, rather than Flight SQL due to its applicability outside
>> > of just database access.
>>
>> Which uses do you see? I see statefulness as a general antipattern here,
>> so I'm wary of introducing it beyond where we need it.
>>
>> > - The Flight client supplies a New-Session header which has key-value
>> pairs
>> > for initial session options. This header can be applied to any RPC call,
>> > but logically should be the first one the client makes.
>>
>> Handshake already effectively serves as this RPC - maybe we could extend
>> it? (I also see Handshake as an antipattern because it's a stateful auth
>> mechanism.)
>>
>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>> server a way to track the persistence of a particular client connection.)
>>
>> > It's a bit asymmetric that creating a new session is done by applying a
>> > header, but closing a session is an RPC call. This was so that session
>> > creation doesn't introduce another round trip before the first real data
>> > request. If there's a way to batch RPC calls it might be better to make
>> > session creation an RPC call.
>>
>> Is this a worrisome amount of overhead?
>>
>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>> client generally share the same TCP connection (modulo load balancing
>> behavior, but presumably that is not enabled if you want persistent
>> sessions).
>>
>> On the implementation side, I'd like to avoid baking this in too deeply if
>> at all possible. Ideally it'd be implemented entirely as middleware,
>> possibly making use of an interface so applications can override the
>> session storage (hashtable, Redis, etcd, etc.)
>>
>> > Just to chime in on this, one thing I'm curious about is whether there
>> > will be support for user-defined catalog/schema hierarchy depth?
>>
>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>> handle this case - maybe we can handle this by adding a property for the
>> delimiter to SqlInfo?
>>
>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>
>> Andrew, do we need to look into adding more metadata to indicate different
>> query languages? (It's quite a shame that we named this Flight SQL at this
>> point...)
>>
>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>> > Sorry for the late reply -- thank you James and David for this
>> discussion.
>> >
>> > I agree that adding Catalog support would be a valuable addition to
>> Flight
>> > SQL, and it recently came up as we begin to implement Flight SQL in
>> > InfluxDB IOx [1].
>> >
>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>> > client APIs (JDBC, ADBC, etc.)
>> >
>> > I agree this would be very valuable, along with a standard way (ideally
>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>> > requests.
>> >
>> >> I'd suggest we define session management features explicitly in Flight
>> > (while being optional).
>> >
>> > I agree it is critical that server-side state is not required to
>> implement
>> > FlightSQL. Stateful connections would likely complicate deploying
>> FlightSQL
>> > in distributed systems. I suggest it should be possible to implement any
>> > session management features by sending the entire session state with the
>> > request, if desired.
>> >
>> > I don't have a strong opinion about the merits of including explicit
>> > session management features in FlightSQL. It seems to me that keeping the
>> > API surface of FlightSQL minimal and implementation flexibility maximal
>> > should be the default. However, if JDBC/ODBC driver compatibility would
>> be
>> > improved with explicit state management APIs, then adding them to
>> FlightSQL
>> > seems like a good idea to me.
>> >
>> > Thanks again -- it is amazing to hit some issue in design and then find
>> out
>> > the Arrow community is already hard at work on a solution.
>> >
>> > Andrew
>> >
>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>> >
>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>> >
>> >> Just to chime in on this, one thing I'm curious about is whether there
>> >> will be support for user-defined catalog/schema hierarchy depth?
>> >>
>> >> This comment that James made does seem reasonable to me
>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>> >>
>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>> >>
>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>> >> just "Array<String>"
>> >> and the identifier to some element in a data source is always
>> >> fully-qualified:
>> >>
>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>> >>
>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>> >> ["mongo", "db1",  "collection_a", "field_a"]
>> >> ["csv_adapter", "myfile.csv", "col_x"]
>> >>
>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>> >> <ja...@bitquilltech.com.invalid> wrote:
>> >> >
>> >> > Our current convention of sending connection properties as headers
>> with
>> >> > every request has the benefit of making statefulness optional, but has
>> >> the
>> >> > drawback of sending redundant, unused properties on requests after the
>> >> > first, which increases the payload size unnecessarily.
>> >> >
>> >> > I'd suggest we define session management features explicitly in Flight
>> >> > (while being optional). The suggestion is to make this part of Flight
>> as
>> >> an
>> >> > optional feature, rather than Flight SQL due to its applicability
>> outside
>> >> > of just database access.
>> >> >
>> >> > Creating a session:
>> >> > - The Flight client supplies a New-Session header which has key-value
>> >> pairs
>> >> > for initial session options. This header can be applied to any RPC
>> call,
>> >> > but logically should be the first one the client makes.
>> >> > - The server should send a Set-Cookie header back containing some
>> >> > server-side representation of the session that the client can use in
>> >> > subsequent requests.
>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>> >> >
>> >> > Modifying session options:
>> >> > - A separate RPC call that takes in a Stream<string, string>
>> representing
>> >> > each session option that is being modified and returns a stream of
>> >> statuses
>> >> > to indicate if the setting change was accepted.
>> >> > - This RPC call is only valid when the Cookie header is used.
>> >> > - It is up to the server to define if a failed session property
>> change is
>> >> > fatal or if other properties can continue to be set.
>> >> >
>> >> > Closing a session:
>> >> > - A separate RPC call that tells the server to drop the session
>> specified
>> >> > by the Cookie header.
>> >> >
>> >> > Notes:
>> >> > A Flight SQL client would check if session management RPCs are
>> supported
>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>> to
>> >> do
>> >> > this generically, but there could be an application-specific RPC or
>> >> header
>> >> > that reports this metadata.
>> >> >
>> >> > The O/JDBC and ADBC drivers would need to be updated to
>> programmatically
>> >> > check for session management RPCs. If unsupported, then use the old
>> >> > behavior of sending all properties as headers with each request. If
>> >> > supported, make use of the New-Session header and drop the session
>> when
>> >> > closing the client-side connection.
>> >> >
>> >> > It's a bit asymmetric that creating a new session is done by applying
>> a
>> >> > header, but closing a session is an RPC call. This was so that session
>> >> > creation doesn't introduce another round trip before the first real
>> data
>> >> > request. If there's a way to batch RPC calls it might be better to
>> make
>> >> > session creation an RPC call.
>> >> >
>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>> >> >
>> >> > > It sounds reasonable - then there are three points:
>> >> > >
>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>> >> client
>> >> > > APIs (JDBC, ADBC, etc.)
>> >> > > - A standard scheme for session data (likely header/cookie-based)
>> >> > > - A mapping from URI parameters and fields to session data
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>> >> > > > Just following up on this and if there are any thoughts.
>> >> > > >
>> >> > > > The purpose would be to standardize how we specify access to some
>> >> named
>> >> > > > logical grouping of data. This would make it easy to model
>> >> catalog/schema
>> >> > > > semantics in Flight SQL.
>> >> > > >
>> >> > > > Having this be part of the connection URI makes it similar to
>> >> specifying
>> >> > > a
>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>> >> for
>> >> > > end
>> >> > > > users to work with and modify.
>> >> > > >
>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>> jamesd@bitquilltech.com
>> >> >
>> >> > > wrote:
>> >> > > >
>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>> in a
>> >> > > path
>> >> > > >> and treat that as a way of specifying a multi-level resource that
>> >> which
>> >> > > the
>> >> > > >> FlightClient is connecting to:
>> >> > > >>
>> >> > > >> eg a connection URI of the form:
>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>> >> > > >>
>> >> > > >> The FlightClient could send this path as either a header or a
>> >> session
>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>> >> > > Producers
>> >> > > >> could interpret this as a catalog or schema.
>> >> > > >> eg
>> >> > > >> grpc://<host>:<port>/catalog/schema
>> >> > > >>
>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>> >> wrote:
>> >> > > >>
>> >> > > >>> Sounds good to me.
>> >> > > >>>
>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>> >> > > >>>
>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>> a
>> >> bit
>> >> > > >>> intermittent.
>> >> > > >>>
>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>> >> wrote:
>> >> > > >>>
>> >> > > >>> > Hey James H.,
>> >> > > >>> >
>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>> >> > > >>> >
>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>> >> 'session'
>> >> > > (I
>> >> > > >>> > guess this will be a little inconsistent with transactions,
>> >> though)
>> >> > > >>> > - Adding RPCs to query session values
>> >> > > >>> > - Adding RPCs to set session values
>> >> > > >>> > - Listing standard values and types
>> >> > > >>> >
>> >> > > >>> > Some things may require more consideration, e.g. transaction
>> >> > > isolation
>> >> > > >>> > might be better off as part of the transaction RPCs than an
>> >> ambient
>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>> >> proposal?
>> >> > > >>> >
>> >> > > >>> > -David
>> >> > > >>> >
>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>> >> > > >>> > > Similarly, we're also currently considering how best to
>> >> implement
>> >> > > >>> some of
>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>> -
>> >> > > things
>> >> > > >>> like
>> >> > > >>> > > current transaction isolation level, access mode, time zone
>> >> etc,
>> >> > > which
>> >> > > >>> > seem
>> >> > > >>> > > to have similar properties to the (traditional) connection's
>> >> > > current
>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>> involving
>> >> the
>> >> > > >>> Flight
>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>> >> > > standardised
>> >> > > >>> > > support within Flight SQL itself eventually?
>> >> > > >>> > >
>> >> > > >>> > > Cheers,
>> >> > > >>> > >
>> >> > > >>> > > James
>> >> > > >>> > >
>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>> >> wrote:
>> >> > > >>> > >
>> >> > > >>> > >> I think having better support for this makes sense, but
>> >> perhaps we
>> >> > > >>> can
>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>> For
>> >> > > >>> instance,
>> >> > > >>> > in
>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>> Or
>> >> > > rather,
>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>> I'd
>> >> > > rather
>> >> > > >>> try
>> >> > > >>> > to
>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>> >> > > >>> > >>
>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>> >> concept of
>> >> > > >>> > catalogs
>> >> > > >>> > >> as
>> >> > > >>> > >> > containers of database schemas. Users can usually
>> specify an
>> >> > > >>> initial
>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>> >> > > sometimes
>> >> > > >>> > >> change
>> >> > > >>> > >> > catalogs during the session.
>> >> > > >>> > >> >
>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>> >> limited. The
>> >> > > >>> > protocol
>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>> >> > > SqlTypeInfo
>> >> > > >>> for
>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>> >> perspective.
>> >> > > >>> > >> >
>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>> >> > > >>> > Additionally,
>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>> "initial"
>> >> > > >>> connection
>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>> itself
>> >> is
>> >> > > >>> > stateless
>> >> > > >>> > >> > from a connection perspective.
>> >> > > >>> > >> >
>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>> >> some
>> >> > > >>> > changes to
>> >> > > >>> > >> > the Flight SQL protocol:
>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>> >> (perhaps
>> >> > > an
>> >> > > >>> > >> optional
>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>> Java
>> >> > > builder
>> >> > > >>> to
>> >> > > >>> > >> allow
>> >> > > >>> > >> > arbitrary application-specific properties.
>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>> >> error
>> >> > > >>> codes
>> >> > > >>> > if
>> >> > > >>> > >> > this is not permitted.
>> >> > > >>> > >> >
>> >> > > >>> > >> >
>> >> > > >>> > >> > --
>> >> > > >>> > >> >
>> >> > > >>> > >> > *James Duong*
>> >> > > >>> > >> > Lead Software Developer
>> >> > > >>> > >> > Bit Quill Technologies Inc.
>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > >>> > >> > https://www.bitquilltech.com
>> >> > > >>> > >> >
>> >> > > >>> > >> > This email message is for the sole use of the intended
>> >> > > recipient(s)
>> >> > > >>> > and
>> >> > > >>> > >> may
>> >> > > >>> > >> > contain confidential and privileged information.  Any
>> >> > > unauthorized
>> >> > > >>> > >> review,
>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>> are
>> >> not
>> >> > > the
>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>> >> email and
>> >> > > >>> > destroy
>> >> > > >>> > >> > all copies of the original message.  Thank you.
>> >> > > >>> > >>
>> >> > > >>> > >
>> >> > > >>> > >
>> >> > > >>> > > --
>> >> > > >>> > > *James Henderson*
>> >> > > >>> > > XTDB Developer at *JUXT*
>> >> > > >>> > > Email jms@juxt.pro
>> >> > > >>> > > Website https://juxt.pro
>> >> > > >>> > >
>> >> > > >>> > > [image: photo]
>> >> > > >>> >
>> >> > > >>>
>> >> > > >>>
>> >> > > >>> --
>> >> > > >>> *James Henderson*
>> >> > > >>> XTDB Developer at *JUXT*
>> >> > > >>> Email jms@juxt.pro
>> >> > > >>> Website https://juxt.pro
>> >> > > >>>
>> >> > > >>> [image: photo]
>> >> > > >>>
>> >> > > >>
>> >> > > >>
>> >> > > >> --
>> >> > > >>
>> >> > > >> *James Duong*
>> >> > > >> Lead Software Developer
>> >> > > >> Bit Quill Technologies Inc.
>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > >> https://www.bitquilltech.com
>> >> > > >>
>> >> > > >> This email message is for the sole use of the intended
>> recipient(s)
>> >> and
>> >> > > >> may contain confidential and privileged information.  Any
>> >> unauthorized
>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>> are
>> >> not
>> >> > > the
>> >> > > >> intended recipient, please contact the sender by reply email and
>> >> destroy
>> >> > > >> all copies of the original message.  Thank you.
>> >> > > >>
>> >> > > >
>> >> > > >
>> >> > > > --
>> >> > > >
>> >> > > > *James Duong*
>> >> > > > Lead Software Developer
>> >> > > > Bit Quill Technologies Inc.
>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > > https://www.bitquilltech.com
>> >> > > >
>> >> > > > This email message is for the sole use of the intended
>> recipient(s)
>> >> and
>> >> > > may
>> >> > > > contain confidential and privileged information.  Any unauthorized
>> >> > > review,
>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>> the
>> >> > > > intended recipient, please contact the sender by reply email and
>> >> destroy
>> >> > > > all copies of the original message.  Thank you.
>> >> > >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > *James Duong*
>> >> > Lead Software Developer
>> >> > Bit Quill Technologies Inc.
>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > https://www.bitquilltech.com
>> >> >
>> >> > This email message is for the sole use of the intended recipient(s)
>> and
>> >> may
>> >> > contain confidential and privileged information.  Any unauthorized
>> >> review,
>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>> >> > intended recipient, please contact the sender by reply email and
>> destroy
>> >> > all copies of the original message.  Thank you.
>> >>
>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by David Li <li...@apache.org>.
Following up here, James are you interested in putting up a draft PR for the Flight SQL URI format and for session management? 

The Flight SQL URI format would then also cover Andrew's use case. And if someone wants to draw up a PR to the JDBC driver to enable arbitrary properties, I can review that too.

On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote:
>> Andrew, do we need to look into adding more metadata to indicate
> different query languages? (It's quite a shame that we named this Flight
> SQL at this point...)
>
> TDLR is I don't think trying to explicitly support languages other than SQL
> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
> which mostly assume SQL, are one of the key features of FlightSQL, and they
> are likely not as useful for non SQL. I can see the argument to support for
> substrait plans, and it will be interesting to see what use cases benefit
> from that support.
>
> What would make our life easier would be some standard way to pass
> application specific key/value pairs from the JDBC driver to a flight SQL
> backend with each request (perhaps via gRPC headers). This would allow
> passing configuration parameters that were not envisioned in the spec, from
> end user (of the JDBC driver) all the way to our backend.
>
> Thanks again for driving this forward,
> Andrew
>
> On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:
>
>> Hey James, thanks for putting this up.
>>
>> Inline:
>>
>> > The suggestion is to make this part of Flight as an
>> > optional feature, rather than Flight SQL due to its applicability outside
>> > of just database access.
>>
>> Which uses do you see? I see statefulness as a general antipattern here,
>> so I'm wary of introducing it beyond where we need it.
>>
>> > - The Flight client supplies a New-Session header which has key-value
>> pairs
>> > for initial session options. This header can be applied to any RPC call,
>> > but logically should be the first one the client makes.
>>
>> Handshake already effectively serves as this RPC - maybe we could extend
>> it? (I also see Handshake as an antipattern because it's a stateful auth
>> mechanism.)
>>
>> Should the session timeout/be on a lease? (gRPC doesn't really give the
>> server a way to track the persistence of a particular client connection.)
>>
>> > It's a bit asymmetric that creating a new session is done by applying a
>> > header, but closing a session is an RPC call. This was so that session
>> > creation doesn't introduce another round trip before the first real data
>> > request. If there's a way to batch RPC calls it might be better to make
>> > session creation an RPC call.
>>
>> Is this a worrisome amount of overhead?
>>
>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
>> client generally share the same TCP connection (modulo load balancing
>> behavior, but presumably that is not enabled if you want persistent
>> sessions).
>>
>> On the implementation side, I'd like to avoid baking this in too deeply if
>> at all possible. Ideally it'd be implemented entirely as middleware,
>> possibly making use of an interface so applications can override the
>> session storage (hashtable, Redis, etcd, etc.)
>>
>> > Just to chime in on this, one thing I'm curious about is whether there
>> > will be support for user-defined catalog/schema hierarchy depth?
>>
>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
>> handle this case - maybe we can handle this by adding a property for the
>> delimiter to SqlInfo?
>>
>> > https://github.com/influxdata/influxdb_iox/issues/6102
>>
>> Andrew, do we need to look into adding more metadata to indicate different
>> query languages? (It's quite a shame that we named this Flight SQL at this
>> point...)
>>
>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
>> > Sorry for the late reply -- thank you James and David for this
>> discussion.
>> >
>> > I agree that adding Catalog support would be a valuable addition to
>> Flight
>> > SQL, and it recently came up as we begin to implement Flight SQL in
>> > InfluxDB IOx [1].
>> >
>> >> - A standard URI scheme for Flight SQL that can be used by multiple
>> > client APIs (JDBC, ADBC, etc.)
>> >
>> > I agree this would be very valuable, along with a standard way (ideally
>> > with HTTP headers) to send this information as part of the FlightSQL gRPC
>> > requests.
>> >
>> >> I'd suggest we define session management features explicitly in Flight
>> > (while being optional).
>> >
>> > I agree it is critical that server-side state is not required to
>> implement
>> > FlightSQL. Stateful connections would likely complicate deploying
>> FlightSQL
>> > in distributed systems. I suggest it should be possible to implement any
>> > session management features by sending the entire session state with the
>> > request, if desired.
>> >
>> > I don't have a strong opinion about the merits of including explicit
>> > session management features in FlightSQL. It seems to me that keeping the
>> > API surface of FlightSQL minimal and implementation flexibility maximal
>> > should be the default. However, if JDBC/ODBC driver compatibility would
>> be
>> > improved with explicit state management APIs, then adding them to
>> FlightSQL
>> > seems like a good idea to me.
>> >
>> > Thanks again -- it is amazing to hit some issue in design and then find
>> out
>> > the Arrow community is already hard at work on a solution.
>> >
>> > Andrew
>> >
>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
>> >
>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>> >
>> >> Just to chime in on this, one thing I'm curious about is whether there
>> >> will be support for user-defined catalog/schema hierarchy depth?
>> >>
>> >> This comment that James made does seem reasonable to me
>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
>> >>
>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>> >>
>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>> >> just "Array<String>"
>> >> and the identifier to some element in a data source is always
>> >> fully-qualified:
>> >>
>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>> >>
>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>> >> ["mongo", "db1",  "collection_a", "field_a"]
>> >> ["csv_adapter", "myfile.csv", "col_x"]
>> >>
>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>> >> <ja...@bitquilltech.com.invalid> wrote:
>> >> >
>> >> > Our current convention of sending connection properties as headers
>> with
>> >> > every request has the benefit of making statefulness optional, but has
>> >> the
>> >> > drawback of sending redundant, unused properties on requests after the
>> >> > first, which increases the payload size unnecessarily.
>> >> >
>> >> > I'd suggest we define session management features explicitly in Flight
>> >> > (while being optional). The suggestion is to make this part of Flight
>> as
>> >> an
>> >> > optional feature, rather than Flight SQL due to its applicability
>> outside
>> >> > of just database access.
>> >> >
>> >> > Creating a session:
>> >> > - The Flight client supplies a New-Session header which has key-value
>> >> pairs
>> >> > for initial session options. This header can be applied to any RPC
>> call,
>> >> > but logically should be the first one the client makes.
>> >> > - The server should send a Set-Cookie header back containing some
>> >> > server-side representation of the session that the client can use in
>> >> > subsequent requests.
>> >> > - The path specified in the URI is sent as a "Catalog" session option.
>> >> >
>> >> > Modifying session options:
>> >> > - A separate RPC call that takes in a Stream<string, string>
>> representing
>> >> > each session option that is being modified and returns a stream of
>> >> statuses
>> >> > to indicate if the setting change was accepted.
>> >> > - This RPC call is only valid when the Cookie header is used.
>> >> > - It is up to the server to define if a failed session property
>> change is
>> >> > fatal or if other properties can continue to be set.
>> >> >
>> >> > Closing a session:
>> >> > - A separate RPC call that tells the server to drop the session
>> specified
>> >> > by the Cookie header.
>> >> >
>> >> > Notes:
>> >> > A Flight SQL client would check if session management RPCs are
>> supported
>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
>> to
>> >> do
>> >> > this generically, but there could be an application-specific RPC or
>> >> header
>> >> > that reports this metadata.
>> >> >
>> >> > The O/JDBC and ADBC drivers would need to be updated to
>> programmatically
>> >> > check for session management RPCs. If unsupported, then use the old
>> >> > behavior of sending all properties as headers with each request. If
>> >> > supported, make use of the New-Session header and drop the session
>> when
>> >> > closing the client-side connection.
>> >> >
>> >> > It's a bit asymmetric that creating a new session is done by applying
>> a
>> >> > header, but closing a session is an RPC call. This was so that session
>> >> > creation doesn't introduce another round trip before the first real
>> data
>> >> > request. If there's a way to batch RPC calls it might be better to
>> make
>> >> > session creation an RPC call.
>> >> >
>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>> >> >
>> >> > > It sounds reasonable - then there are three points:
>> >> > >
>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
>> >> client
>> >> > > APIs (JDBC, ADBC, etc.)
>> >> > > - A standard scheme for session data (likely header/cookie-based)
>> >> > > - A mapping from URI parameters and fields to session data
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>> >> > > > Just following up on this and if there are any thoughts.
>> >> > > >
>> >> > > > The purpose would be to standardize how we specify access to some
>> >> named
>> >> > > > logical grouping of data. This would make it easy to model
>> >> catalog/schema
>> >> > > > semantics in Flight SQL.
>> >> > > >
>> >> > > > Having this be part of the connection URI makes it similar to
>> >> specifying
>> >> > > a
>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>> >> for
>> >> > > end
>> >> > > > users to work with and modify.
>> >> > > >
>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
>> jamesd@bitquilltech.com
>> >> >
>> >> > > wrote:
>> >> > > >
>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
>> in a
>> >> > > path
>> >> > > >> and treat that as a way of specifying a multi-level resource that
>> >> which
>> >> > > the
>> >> > > >> FlightClient is connecting to:
>> >> > > >>
>> >> > > >> eg a connection URI of the form:
>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>> >> > > >>
>> >> > > >> The FlightClient could send this path as either a header or a
>> >> session
>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>> >> > > Producers
>> >> > > >> could interpret this as a catalog or schema.
>> >> > > >> eg
>> >> > > >> grpc://<host>:<port>/catalog/schema
>> >> > > >>
>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>> >> wrote:
>> >> > > >>
>> >> > > >>> Sounds good to me.
>> >> > > >>>
>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>> >> > > >>>
>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
>> a
>> >> bit
>> >> > > >>> intermittent.
>> >> > > >>>
>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>> >> wrote:
>> >> > > >>>
>> >> > > >>> > Hey James H.,
>> >> > > >>> >
>> >> > > >>> > That would make sense to me. So it sounds like we'd want
>> >> > > >>> >
>> >> > > >>> > - Formal specification of using cookies/headers to mark a
>> >> 'session'
>> >> > > (I
>> >> > > >>> > guess this will be a little inconsistent with transactions,
>> >> though)
>> >> > > >>> > - Adding RPCs to query session values
>> >> > > >>> > - Adding RPCs to set session values
>> >> > > >>> > - Listing standard values and types
>> >> > > >>> >
>> >> > > >>> > Some things may require more consideration, e.g. transaction
>> >> > > isolation
>> >> > > >>> > might be better off as part of the transaction RPCs than an
>> >> ambient
>> >> > > >>> > property. Are you interested in writing up a (sketch of a)
>> >> proposal?
>> >> > > >>> >
>> >> > > >>> > -David
>> >> > > >>> >
>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>> >> > > >>> > > Similarly, we're also currently considering how best to
>> >> implement
>> >> > > >>> some of
>> >> > > >>> > > the SQL standard session variables in our Flight SQL server
>> -
>> >> > > things
>> >> > > >>> like
>> >> > > >>> > > current transaction isolation level, access mode, time zone
>> >> etc,
>> >> > > which
>> >> > > >>> > seem
>> >> > > >>> > > to have similar properties to the (traditional) connection's
>> >> > > current
>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
>> involving
>> >> the
>> >> > > >>> Flight
>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>> >> > > standardised
>> >> > > >>> > > support within Flight SQL itself eventually?
>> >> > > >>> > >
>> >> > > >>> > > Cheers,
>> >> > > >>> > >
>> >> > > >>> > > James
>> >> > > >>> > >
>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>> >> wrote:
>> >> > > >>> > >
>> >> > > >>> > >> I think having better support for this makes sense, but
>> >> perhaps we
>> >> > > >>> can
>> >> > > >>> > >> find a way to make it not tied to the connection itself?
>> For
>> >> > > >>> instance,
>> >> > > >>> > in
>> >> > > >>> > >> the same way transactions were implemented (as a handle).
>> Or
>> >> > > rather,
>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
>> I'd
>> >> > > rather
>> >> > > >>> try
>> >> > > >>> > to
>> >> > > >>> > >> work within the gRPC/RPC paradigm.
>> >> > > >>> > >>
>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>> >> concept of
>> >> > > >>> > catalogs
>> >> > > >>> > >> as
>> >> > > >>> > >> > containers of database schemas. Users can usually
>> specify an
>> >> > > >>> initial
>> >> > > >>> > >> > catalog during the connection process, list catalogs, and
>> >> > > sometimes
>> >> > > >>> > >> change
>> >> > > >>> > >> > catalogs during the session.
>> >> > > >>> > >> >
>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>> >> limited. The
>> >> > > >>> > protocol
>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
>> >> > > SqlTypeInfo
>> >> > > >>> for
>> >> > > >>> > >> > reporting how catalogs are supported from a syntax
>> >> perspective.
>> >> > > >>> > >> >
>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>> >> > > >>> > Additionally,
>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
>> "initial"
>> >> > > >>> connection
>> >> > > >>> > >> > properties (such as a starting catalog) since Flight
>> itself
>> >> is
>> >> > > >>> > stateless
>> >> > > >>> > >> > from a connection perspective.
>> >> > > >>> > >> >
>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>> >> some
>> >> > > >>> > changes to
>> >> > > >>> > >> > the Flight SQL protocol:
>> >> > > >>> > >> > - Introduce the concept of connection-time properties
>> >> (perhaps
>> >> > > an
>> >> > > >>> > >> optional
>> >> > > >>> > >> > RPC for Flight SQL applications that need this)
>> >> > > >>> > >> > - Related to the above, expand the connection URL and
>> Java
>> >> > > builder
>> >> > > >>> to
>> >> > > >>> > >> allow
>> >> > > >>> > >> > arbitrary application-specific properties.
>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>> >> error
>> >> > > >>> codes
>> >> > > >>> > if
>> >> > > >>> > >> > this is not permitted.
>> >> > > >>> > >> >
>> >> > > >>> > >> >
>> >> > > >>> > >> > --
>> >> > > >>> > >> >
>> >> > > >>> > >> > *James Duong*
>> >> > > >>> > >> > Lead Software Developer
>> >> > > >>> > >> > Bit Quill Technologies Inc.
>> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > >>> > >> > https://www.bitquilltech.com
>> >> > > >>> > >> >
>> >> > > >>> > >> > This email message is for the sole use of the intended
>> >> > > recipient(s)
>> >> > > >>> > and
>> >> > > >>> > >> may
>> >> > > >>> > >> > contain confidential and privileged information.  Any
>> >> > > unauthorized
>> >> > > >>> > >> review,
>> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
>> are
>> >> not
>> >> > > the
>> >> > > >>> > >> > intended recipient, please contact the sender by reply
>> >> email and
>> >> > > >>> > destroy
>> >> > > >>> > >> > all copies of the original message.  Thank you.
>> >> > > >>> > >>
>> >> > > >>> > >
>> >> > > >>> > >
>> >> > > >>> > > --
>> >> > > >>> > > *James Henderson*
>> >> > > >>> > > XTDB Developer at *JUXT*
>> >> > > >>> > > Email jms@juxt.pro
>> >> > > >>> > > Website https://juxt.pro
>> >> > > >>> > >
>> >> > > >>> > > [image: photo]
>> >> > > >>> >
>> >> > > >>>
>> >> > > >>>
>> >> > > >>> --
>> >> > > >>> *James Henderson*
>> >> > > >>> XTDB Developer at *JUXT*
>> >> > > >>> Email jms@juxt.pro
>> >> > > >>> Website https://juxt.pro
>> >> > > >>>
>> >> > > >>> [image: photo]
>> >> > > >>>
>> >> > > >>
>> >> > > >>
>> >> > > >> --
>> >> > > >>
>> >> > > >> *James Duong*
>> >> > > >> Lead Software Developer
>> >> > > >> Bit Quill Technologies Inc.
>> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > >> https://www.bitquilltech.com
>> >> > > >>
>> >> > > >> This email message is for the sole use of the intended
>> recipient(s)
>> >> and
>> >> > > >> may contain confidential and privileged information.  Any
>> >> unauthorized
>> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
>> are
>> >> not
>> >> > > the
>> >> > > >> intended recipient, please contact the sender by reply email and
>> >> destroy
>> >> > > >> all copies of the original message.  Thank you.
>> >> > > >>
>> >> > > >
>> >> > > >
>> >> > > > --
>> >> > > >
>> >> > > > *James Duong*
>> >> > > > Lead Software Developer
>> >> > > > Bit Quill Technologies Inc.
>> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > > > https://www.bitquilltech.com
>> >> > > >
>> >> > > > This email message is for the sole use of the intended
>> recipient(s)
>> >> and
>> >> > > may
>> >> > > > contain confidential and privileged information.  Any unauthorized
>> >> > > review,
>> >> > > > use, disclosure, or distribution is prohibited.  If you are not
>> the
>> >> > > > intended recipient, please contact the sender by reply email and
>> >> destroy
>> >> > > > all copies of the original message.  Thank you.
>> >> > >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > *James Duong*
>> >> > Lead Software Developer
>> >> > Bit Quill Technologies Inc.
>> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> >> > https://www.bitquilltech.com
>> >> >
>> >> > This email message is for the sole use of the intended recipient(s)
>> and
>> >> may
>> >> > contain confidential and privileged information.  Any unauthorized
>> >> review,
>> >> > use, disclosure, or distribution is prohibited.  If you are not the
>> >> > intended recipient, please contact the sender by reply email and
>> destroy
>> >> > all copies of the original message.  Thank you.
>> >>
>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by Andrew Lamb <al...@influxdata.com>.
> Andrew, do we need to look into adding more metadata to indicate
different query languages? (It's quite a shame that we named this Flight
SQL at this point...)

TDLR is I don't think trying to explicitly support languages other than SQL
in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers,
which mostly assume SQL, are one of the key features of FlightSQL, and they
are likely not as useful for non SQL. I can see the argument to support for
substrait plans, and it will be interesting to see what use cases benefit
from that support.

What would make our life easier would be some standard way to pass
application specific key/value pairs from the JDBC driver to a flight SQL
backend with each request (perhaps via gRPC headers). This would allow
passing configuration parameters that were not envisioned in the spec, from
end user (of the JDBC driver) all the way to our backend.

Thanks again for driving this forward,
Andrew

On Thu, Dec 1, 2022 at 7:11 PM David Li <li...@apache.org> wrote:

> Hey James, thanks for putting this up.
>
> Inline:
>
> > The suggestion is to make this part of Flight as an
> > optional feature, rather than Flight SQL due to its applicability outside
> > of just database access.
>
> Which uses do you see? I see statefulness as a general antipattern here,
> so I'm wary of introducing it beyond where we need it.
>
> > - The Flight client supplies a New-Session header which has key-value
> pairs
> > for initial session options. This header can be applied to any RPC call,
> > but logically should be the first one the client makes.
>
> Handshake already effectively serves as this RPC - maybe we could extend
> it? (I also see Handshake as an antipattern because it's a stateful auth
> mechanism.)
>
> Should the session timeout/be on a lease? (gRPC doesn't really give the
> server a way to track the persistence of a particular client connection.)
>
> > It's a bit asymmetric that creating a new session is done by applying a
> > header, but closing a session is an RPC call. This was so that session
> > creation doesn't introduce another round trip before the first real data
> > request. If there's a way to batch RPC calls it might be better to make
> > session creation an RPC call.
>
> Is this a worrisome amount of overhead?
>
> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same
> client generally share the same TCP connection (modulo load balancing
> behavior, but presumably that is not enabled if you want persistent
> sessions).
>
> On the implementation side, I'd like to avoid baking this in too deeply if
> at all possible. Ideally it'd be implemented entirely as middleware,
> possibly making use of an interface so applications can override the
> session storage (hashtable, Redis, etcd, etc.)
>
> > Just to chime in on this, one thing I'm curious about is whether there
> > will be support for user-defined catalog/schema hierarchy depth?
>
> Gavin, for ADBC we discussed adding a delimiter to the catalog name to
> handle this case - maybe we can handle this by adding a property for the
> delimiter to SqlInfo?
>
> > https://github.com/influxdata/influxdb_iox/issues/6102
>
> Andrew, do we need to look into adding more metadata to indicate different
> query languages? (It's quite a shame that we named this Flight SQL at this
> point...)
>
> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
> > Sorry for the late reply -- thank you James and David for this
> discussion.
> >
> > I agree that adding Catalog support would be a valuable addition to
> Flight
> > SQL, and it recently came up as we begin to implement Flight SQL in
> > InfluxDB IOx [1].
> >
> >> - A standard URI scheme for Flight SQL that can be used by multiple
> > client APIs (JDBC, ADBC, etc.)
> >
> > I agree this would be very valuable, along with a standard way (ideally
> > with HTTP headers) to send this information as part of the FlightSQL gRPC
> > requests.
> >
> >> I'd suggest we define session management features explicitly in Flight
> > (while being optional).
> >
> > I agree it is critical that server-side state is not required to
> implement
> > FlightSQL. Stateful connections would likely complicate deploying
> FlightSQL
> > in distributed systems. I suggest it should be possible to implement any
> > session management features by sending the entire session state with the
> > request, if desired.
> >
> > I don't have a strong opinion about the merits of including explicit
> > session management features in FlightSQL. It seems to me that keeping the
> > API surface of FlightSQL minimal and implementation flexibility maximal
> > should be the default. However, if JDBC/ODBC driver compatibility would
> be
> > improved with explicit state management APIs, then adding them to
> FlightSQL
> > seems like a good idea to me.
> >
> > Thanks again -- it is amazing to hit some issue in design and then find
> out
> > the Arrow community is already hard at work on a solution.
> >
> > Andrew
> >
> > [1] https://github.com/influxdata/influxdb_iox/issues/6102
> >
> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
> >
> >> Just to chime in on this, one thing I'm curious about is whether there
> >> will be support for user-defined catalog/schema hierarchy depth?
> >>
> >> This comment that James made does seem reasonable to me
> >> > scheme://<host>:<port>/path-1/path-2/.../path-n
> >>
> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
> >>
> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is
> >> just "Array<String>"
> >> and the identifier to some element in a data source is always
> >> fully-qualified:
> >>
> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
> >>
> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
> >> ["mongo", "db1",  "collection_a", "field_a"]
> >> ["csv_adapter", "myfile.csv", "col_x"]
> >>
> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong
> >> <ja...@bitquilltech.com.invalid> wrote:
> >> >
> >> > Our current convention of sending connection properties as headers
> with
> >> > every request has the benefit of making statefulness optional, but has
> >> the
> >> > drawback of sending redundant, unused properties on requests after the
> >> > first, which increases the payload size unnecessarily.
> >> >
> >> > I'd suggest we define session management features explicitly in Flight
> >> > (while being optional). The suggestion is to make this part of Flight
> as
> >> an
> >> > optional feature, rather than Flight SQL due to its applicability
> outside
> >> > of just database access.
> >> >
> >> > Creating a session:
> >> > - The Flight client supplies a New-Session header which has key-value
> >> pairs
> >> > for initial session options. This header can be applied to any RPC
> call,
> >> > but logically should be the first one the client makes.
> >> > - The server should send a Set-Cookie header back containing some
> >> > server-side representation of the session that the client can use in
> >> > subsequent requests.
> >> > - The path specified in the URI is sent as a "Catalog" session option.
> >> >
> >> > Modifying session options:
> >> > - A separate RPC call that takes in a Stream<string, string>
> representing
> >> > each session option that is being modified and returns a stream of
> >> statuses
> >> > to indicate if the setting change was accepted.
> >> > - This RPC call is only valid when the Cookie header is used.
> >> > - It is up to the server to define if a failed session property
> change is
> >> > fatal or if other properties can continue to be set.
> >> >
> >> > Closing a session:
> >> > - A separate RPC call that tells the server to drop the session
> specified
> >> > by the Cookie header.
> >> >
> >> > Notes:
> >> > A Flight SQL client would check if session management RPCs are
> supported
> >> > through a new GetSqlInfo property. A Flight client doesn't have a way
> to
> >> do
> >> > this generically, but there could be an application-specific RPC or
> >> header
> >> > that reports this metadata.
> >> >
> >> > The O/JDBC and ADBC drivers would need to be updated to
> programmatically
> >> > check for session management RPCs. If unsupported, then use the old
> >> > behavior of sending all properties as headers with each request. If
> >> > supported, make use of the New-Session header and drop the session
> when
> >> > closing the client-side connection.
> >> >
> >> > It's a bit asymmetric that creating a new session is done by applying
> a
> >> > header, but closing a session is an RPC call. This was so that session
> >> > creation doesn't introduce another round trip before the first real
> data
> >> > request. If there's a way to batch RPC calls it might be better to
> make
> >> > session creation an RPC call.
> >> >
> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
> >> >
> >> > > It sounds reasonable - then there are three points:
> >> > >
> >> > > - A standard URI scheme for Flight SQL that can be used by multiple
> >> client
> >> > > APIs (JDBC, ADBC, etc.)
> >> > > - A standard scheme for session data (likely header/cookie-based)
> >> > > - A mapping from URI parameters and fields to session data
> >> > >
> >> > >
> >> > >
> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
> >> > > > Just following up on this and if there are any thoughts.
> >> > > >
> >> > > > The purpose would be to standardize how we specify access to some
> >> named
> >> > > > logical grouping of data. This would make it easy to model
> >> catalog/schema
> >> > > > semantics in Flight SQL.
> >> > > >
> >> > > > Having this be part of the connection URI makes it similar to
> >> specifying
> >> > > a
> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
> >> for
> >> > > end
> >> > > > users to work with and modify.
> >> > > >
> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <
> jamesd@bitquilltech.com
> >> >
> >> > > wrote:
> >> > > >
> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take
> in a
> >> > > path
> >> > > >> and treat that as a way of specifying a multi-level resource that
> >> which
> >> > > the
> >> > > >> FlightClient is connecting to:
> >> > > >>
> >> > > >> eg a connection URI of the form:
> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
> >> > > >>
> >> > > >> The FlightClient could send this path as either a header or a
> >> session
> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL
> >> > > Producers
> >> > > >> could interpret this as a catalog or schema.
> >> > > >> eg
> >> > > >> grpc://<host>:<port>/catalog/schema
> >> > > >>
> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
> >> wrote:
> >> > > >>
> >> > > >>> Sounds good to me.
> >> > > >>>
> >> > > >>> > Are you interested in writing up a (sketch of a) proposal?
> >> > > >>>
> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be
> a
> >> bit
> >> > > >>> intermittent.
> >> > > >>>
> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
> >> wrote:
> >> > > >>>
> >> > > >>> > Hey James H.,
> >> > > >>> >
> >> > > >>> > That would make sense to me. So it sounds like we'd want
> >> > > >>> >
> >> > > >>> > - Formal specification of using cookies/headers to mark a
> >> 'session'
> >> > > (I
> >> > > >>> > guess this will be a little inconsistent with transactions,
> >> though)
> >> > > >>> > - Adding RPCs to query session values
> >> > > >>> > - Adding RPCs to set session values
> >> > > >>> > - Listing standard values and types
> >> > > >>> >
> >> > > >>> > Some things may require more consideration, e.g. transaction
> >> > > isolation
> >> > > >>> > might be better off as part of the transaction RPCs than an
> >> ambient
> >> > > >>> > property. Are you interested in writing up a (sketch of a)
> >> proposal?
> >> > > >>> >
> >> > > >>> > -David
> >> > > >>> >
> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
> >> > > >>> > > Similarly, we're also currently considering how best to
> >> implement
> >> > > >>> some of
> >> > > >>> > > the SQL standard session variables in our Flight SQL server
> -
> >> > > things
> >> > > >>> like
> >> > > >>> > > current transaction isolation level, access mode, time zone
> >> etc,
> >> > > which
> >> > > >>> > seem
> >> > > >>> > > to have similar properties to the (traditional) connection's
> >> > > current
> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions
> involving
> >> the
> >> > > >>> Flight
> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have
> >> > > standardised
> >> > > >>> > > support within Flight SQL itself eventually?
> >> > > >>> > >
> >> > > >>> > > Cheers,
> >> > > >>> > >
> >> > > >>> > > James
> >> > > >>> > >
> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
> >> wrote:
> >> > > >>> > >
> >> > > >>> > >> I think having better support for this makes sense, but
> >> perhaps we
> >> > > >>> can
> >> > > >>> > >> find a way to make it not tied to the connection itself?
> For
> >> > > >>> instance,
> >> > > >>> > in
> >> > > >>> > >> the same way transactions were implemented (as a handle).
> Or
> >> > > rather,
> >> > > >>> > >> instead of adding connection statefulness to Flight RPC,
> I'd
> >> > > rather
> >> > > >>> try
> >> > > >>> > to
> >> > > >>> > >> work within the gRPC/RPC paradigm.
> >> > > >>> > >>
> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
> >> concept of
> >> > > >>> > catalogs
> >> > > >>> > >> as
> >> > > >>> > >> > containers of database schemas. Users can usually
> specify an
> >> > > >>> initial
> >> > > >>> > >> > catalog during the connection process, list catalogs, and
> >> > > sometimes
> >> > > >>> > >> change
> >> > > >>> > >> > catalogs during the session.
> >> > > >>> > >> >
> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
> >> limited. The
> >> > > >>> > protocol
> >> > > >>> > >> > provides a way to list catalogs as well as metadata in
> >> > > SqlTypeInfo
> >> > > >>> for
> >> > > >>> > >> > reporting how catalogs are supported from a syntax
> >> perspective.
> >> > > >>> > >> >
> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
> >> > > >>> > Additionally,
> >> > > >>> > >> > Flight SQL doesn't really provide the concept of
> "initial"
> >> > > >>> connection
> >> > > >>> > >> > properties (such as a starting catalog) since Flight
> itself
> >> is
> >> > > >>> > stateless
> >> > > >>> > >> > from a connection perspective.
> >> > > >>> > >> >
> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
> >> some
> >> > > >>> > changes to
> >> > > >>> > >> > the Flight SQL protocol:
> >> > > >>> > >> > - Introduce the concept of connection-time properties
> >> (perhaps
> >> > > an
> >> > > >>> > >> optional
> >> > > >>> > >> > RPC for Flight SQL applications that need this)
> >> > > >>> > >> > - Related to the above, expand the connection URL and
> Java
> >> > > builder
> >> > > >>> to
> >> > > >>> > >> allow
> >> > > >>> > >> > arbitrary application-specific properties.
> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
> >> error
> >> > > >>> codes
> >> > > >>> > if
> >> > > >>> > >> > this is not permitted.
> >> > > >>> > >> >
> >> > > >>> > >> >
> >> > > >>> > >> > --
> >> > > >>> > >> >
> >> > > >>> > >> > *James Duong*
> >> > > >>> > >> > Lead Software Developer
> >> > > >>> > >> > Bit Quill Technologies Inc.
> >> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> > > >>> > >> > https://www.bitquilltech.com
> >> > > >>> > >> >
> >> > > >>> > >> > This email message is for the sole use of the intended
> >> > > recipient(s)
> >> > > >>> > and
> >> > > >>> > >> may
> >> > > >>> > >> > contain confidential and privileged information.  Any
> >> > > unauthorized
> >> > > >>> > >> review,
> >> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you
> are
> >> not
> >> > > the
> >> > > >>> > >> > intended recipient, please contact the sender by reply
> >> email and
> >> > > >>> > destroy
> >> > > >>> > >> > all copies of the original message.  Thank you.
> >> > > >>> > >>
> >> > > >>> > >
> >> > > >>> > >
> >> > > >>> > > --
> >> > > >>> > > *James Henderson*
> >> > > >>> > > XTDB Developer at *JUXT*
> >> > > >>> > > Email jms@juxt.pro
> >> > > >>> > > Website https://juxt.pro
> >> > > >>> > >
> >> > > >>> > > [image: photo]
> >> > > >>> >
> >> > > >>>
> >> > > >>>
> >> > > >>> --
> >> > > >>> *James Henderson*
> >> > > >>> XTDB Developer at *JUXT*
> >> > > >>> Email jms@juxt.pro
> >> > > >>> Website https://juxt.pro
> >> > > >>>
> >> > > >>> [image: photo]
> >> > > >>>
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >>
> >> > > >> *James Duong*
> >> > > >> Lead Software Developer
> >> > > >> Bit Quill Technologies Inc.
> >> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> > > >> https://www.bitquilltech.com
> >> > > >>
> >> > > >> This email message is for the sole use of the intended
> recipient(s)
> >> and
> >> > > >> may contain confidential and privileged information.  Any
> >> unauthorized
> >> > > >> review, use, disclosure, or distribution is prohibited.  If you
> are
> >> not
> >> > > the
> >> > > >> intended recipient, please contact the sender by reply email and
> >> destroy
> >> > > >> all copies of the original message.  Thank you.
> >> > > >>
> >> > > >
> >> > > >
> >> > > > --
> >> > > >
> >> > > > *James Duong*
> >> > > > Lead Software Developer
> >> > > > Bit Quill Technologies Inc.
> >> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> > > > https://www.bitquilltech.com
> >> > > >
> >> > > > This email message is for the sole use of the intended
> recipient(s)
> >> and
> >> > > may
> >> > > > contain confidential and privileged information.  Any unauthorized
> >> > > review,
> >> > > > use, disclosure, or distribution is prohibited.  If you are not
> the
> >> > > > intended recipient, please contact the sender by reply email and
> >> destroy
> >> > > > all copies of the original message.  Thank you.
> >> > >
> >> >
> >> >
> >> > --
> >> >
> >> > *James Duong*
> >> > Lead Software Developer
> >> > Bit Quill Technologies Inc.
> >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> >> > https://www.bitquilltech.com
> >> >
> >> > This email message is for the sole use of the intended recipient(s)
> and
> >> may
> >> > contain confidential and privileged information.  Any unauthorized
> >> review,
> >> > use, disclosure, or distribution is prohibited.  If you are not the
> >> > intended recipient, please contact the sender by reply email and
> destroy
> >> > all copies of the original message.  Thank you.
> >>
>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by David Li <li...@apache.org>.
Hey James, thanks for putting this up.

Inline:

> The suggestion is to make this part of Flight as an
> optional feature, rather than Flight SQL due to its applicability outside
> of just database access.

Which uses do you see? I see statefulness as a general antipattern here, so I'm wary of introducing it beyond where we need it. 

> - The Flight client supplies a New-Session header which has key-value pairs
> for initial session options. This header can be applied to any RPC call,
> but logically should be the first one the client makes.

Handshake already effectively serves as this RPC - maybe we could extend it? (I also see Handshake as an antipattern because it's a stateful auth mechanism.)

Should the session timeout/be on a lease? (gRPC doesn't really give the server a way to track the persistence of a particular client connection.)

> It's a bit asymmetric that creating a new session is done by applying a
> header, but closing a session is an RPC call. This was so that session
> creation doesn't introduce another round trip before the first real data
> request. If there's a way to batch RPC calls it might be better to make
> session creation an RPC call.

Is this a worrisome amount of overhead? 

Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same client generally share the same TCP connection (modulo load balancing behavior, but presumably that is not enabled if you want persistent sessions).

On the implementation side, I'd like to avoid baking this in too deeply if at all possible. Ideally it'd be implemented entirely as middleware, possibly making use of an interface so applications can override the session storage (hashtable, Redis, etcd, etc.)

> Just to chime in on this, one thing I'm curious about is whether there
> will be support for user-defined catalog/schema hierarchy depth?

Gavin, for ADBC we discussed adding a delimiter to the catalog name to handle this case - maybe we can handle this by adding a property for the delimiter to SqlInfo?

> https://github.com/influxdata/influxdb_iox/issues/6102

Andrew, do we need to look into adding more metadata to indicate different query languages? (It's quite a shame that we named this Flight SQL at this point...) 

On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
> Sorry for the late reply -- thank you James and David for this discussion.
>
> I agree that adding Catalog support would be a valuable addition to Flight
> SQL, and it recently came up as we begin to implement Flight SQL in
> InfluxDB IOx [1].
>
>> - A standard URI scheme for Flight SQL that can be used by multiple
> client APIs (JDBC, ADBC, etc.)
>
> I agree this would be very valuable, along with a standard way (ideally
> with HTTP headers) to send this information as part of the FlightSQL gRPC
> requests.
>
>> I'd suggest we define session management features explicitly in Flight
> (while being optional).
>
> I agree it is critical that server-side state is not required to implement
> FlightSQL. Stateful connections would likely complicate deploying FlightSQL
> in distributed systems. I suggest it should be possible to implement any
> session management features by sending the entire session state with the
> request, if desired.
>
> I don't have a strong opinion about the merits of including explicit
> session management features in FlightSQL. It seems to me that keeping the
> API surface of FlightSQL minimal and implementation flexibility maximal
> should be the default. However, if JDBC/ODBC driver compatibility would be
> improved with explicit state management APIs, then adding them to FlightSQL
> seems like a good idea to me.
>
> Thanks again -- it is amazing to hit some issue in design and then find out
> the Arrow community is already hard at work on a solution.
>
> Andrew
>
> [1] https://github.com/influxdata/influxdb_iox/issues/6102
>
> On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:
>
>> Just to chime in on this, one thing I'm curious about is whether there
>> will be support for user-defined catalog/schema hierarchy depth?
>>
>> This comment that James made does seem reasonable to me
>> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>
>> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>
>> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>> just "Array<String>"
>> and the identifier to some element in a data source is always
>> fully-qualified:
>>
>> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>
>> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>> ["mongo", "db1",  "collection_a", "field_a"]
>> ["csv_adapter", "myfile.csv", "col_x"]
>>
>> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>> <ja...@bitquilltech.com.invalid> wrote:
>> >
>> > Our current convention of sending connection properties as headers with
>> > every request has the benefit of making statefulness optional, but has
>> the
>> > drawback of sending redundant, unused properties on requests after the
>> > first, which increases the payload size unnecessarily.
>> >
>> > I'd suggest we define session management features explicitly in Flight
>> > (while being optional). The suggestion is to make this part of Flight as
>> an
>> > optional feature, rather than Flight SQL due to its applicability outside
>> > of just database access.
>> >
>> > Creating a session:
>> > - The Flight client supplies a New-Session header which has key-value
>> pairs
>> > for initial session options. This header can be applied to any RPC call,
>> > but logically should be the first one the client makes.
>> > - The server should send a Set-Cookie header back containing some
>> > server-side representation of the session that the client can use in
>> > subsequent requests.
>> > - The path specified in the URI is sent as a "Catalog" session option.
>> >
>> > Modifying session options:
>> > - A separate RPC call that takes in a Stream<string, string> representing
>> > each session option that is being modified and returns a stream of
>> statuses
>> > to indicate if the setting change was accepted.
>> > - This RPC call is only valid when the Cookie header is used.
>> > - It is up to the server to define if a failed session property change is
>> > fatal or if other properties can continue to be set.
>> >
>> > Closing a session:
>> > - A separate RPC call that tells the server to drop the session specified
>> > by the Cookie header.
>> >
>> > Notes:
>> > A Flight SQL client would check if session management RPCs are supported
>> > through a new GetSqlInfo property. A Flight client doesn't have a way to
>> do
>> > this generically, but there could be an application-specific RPC or
>> header
>> > that reports this metadata.
>> >
>> > The O/JDBC and ADBC drivers would need to be updated to programmatically
>> > check for session management RPCs. If unsupported, then use the old
>> > behavior of sending all properties as headers with each request. If
>> > supported, make use of the New-Session header and drop the session when
>> > closing the client-side connection.
>> >
>> > It's a bit asymmetric that creating a new session is done by applying a
>> > header, but closing a session is an RPC call. This was so that session
>> > creation doesn't introduce another round trip before the first real data
>> > request. If there's a way to batch RPC calls it might be better to make
>> > session creation an RPC call.
>> >
>> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
>> >
>> > > It sounds reasonable - then there are three points:
>> > >
>> > > - A standard URI scheme for Flight SQL that can be used by multiple
>> client
>> > > APIs (JDBC, ADBC, etc.)
>> > > - A standard scheme for session data (likely header/cookie-based)
>> > > - A mapping from URI parameters and fields to session data
>> > >
>> > >
>> > >
>> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>> > > > Just following up on this and if there are any thoughts.
>> > > >
>> > > > The purpose would be to standardize how we specify access to some
>> named
>> > > > logical grouping of data. This would make it easy to model
>> catalog/schema
>> > > > semantics in Flight SQL.
>> > > >
>> > > > Having this be part of the connection URI makes it similar to
>> specifying
>> > > a
>> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>> for
>> > > end
>> > > > users to work with and modify.
>> > > >
>> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <jamesd@bitquilltech.com
>> >
>> > > wrote:
>> > > >
>> > > >> As for surfacing catalogs itself, perhaps we allow the URI take in a
>> > > path
>> > > >> and treat that as a way of specifying a multi-level resource that
>> which
>> > > the
>> > > >> FlightClient is connecting to:
>> > > >>
>> > > >> eg a connection URI of the form:
>> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>> > > >>
>> > > >> The FlightClient could send this path as either a header or a
>> session
>> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>> > > Producers
>> > > >> could interpret this as a catalog or schema.
>> > > >> eg
>> > > >> grpc://<host>:<port>/catalog/schema
>> > > >>
>> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
>> wrote:
>> > > >>
>> > > >>> Sounds good to me.
>> > > >>>
>> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>> > > >>>
>> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be a
>> bit
>> > > >>> intermittent.
>> > > >>>
>> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
>> wrote:
>> > > >>>
>> > > >>> > Hey James H.,
>> > > >>> >
>> > > >>> > That would make sense to me. So it sounds like we'd want
>> > > >>> >
>> > > >>> > - Formal specification of using cookies/headers to mark a
>> 'session'
>> > > (I
>> > > >>> > guess this will be a little inconsistent with transactions,
>> though)
>> > > >>> > - Adding RPCs to query session values
>> > > >>> > - Adding RPCs to set session values
>> > > >>> > - Listing standard values and types
>> > > >>> >
>> > > >>> > Some things may require more consideration, e.g. transaction
>> > > isolation
>> > > >>> > might be better off as part of the transaction RPCs than an
>> ambient
>> > > >>> > property. Are you interested in writing up a (sketch of a)
>> proposal?
>> > > >>> >
>> > > >>> > -David
>> > > >>> >
>> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>> > > >>> > > Similarly, we're also currently considering how best to
>> implement
>> > > >>> some of
>> > > >>> > > the SQL standard session variables in our Flight SQL server -
>> > > things
>> > > >>> like
>> > > >>> > > current transaction isolation level, access mode, time zone
>> etc,
>> > > which
>> > > >>> > seem
>> > > >>> > > to have similar properties to the (traditional) connection's
>> > > current
>> > > >>> > > catalog. We'd (perhaps naively) looked at solutions involving
>> the
>> > > >>> Flight
>> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>> > > standardised
>> > > >>> > > support within Flight SQL itself eventually?
>> > > >>> > >
>> > > >>> > > Cheers,
>> > > >>> > >
>> > > >>> > > James
>> > > >>> > >
>> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
>> wrote:
>> > > >>> > >
>> > > >>> > >> I think having better support for this makes sense, but
>> perhaps we
>> > > >>> can
>> > > >>> > >> find a way to make it not tied to the connection itself? For
>> > > >>> instance,
>> > > >>> > in
>> > > >>> > >> the same way transactions were implemented (as a handle). Or
>> > > rather,
>> > > >>> > >> instead of adding connection statefulness to Flight RPC, I'd
>> > > rather
>> > > >>> try
>> > > >>> > to
>> > > >>> > >> work within the gRPC/RPC paradigm.
>> > > >>> > >>
>> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>> concept of
>> > > >>> > catalogs
>> > > >>> > >> as
>> > > >>> > >> > containers of database schemas. Users can usually specify an
>> > > >>> initial
>> > > >>> > >> > catalog during the connection process, list catalogs, and
>> > > sometimes
>> > > >>> > >> change
>> > > >>> > >> > catalogs during the session.
>> > > >>> > >> >
>> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>> limited. The
>> > > >>> > protocol
>> > > >>> > >> > provides a way to list catalogs as well as metadata in
>> > > SqlTypeInfo
>> > > >>> for
>> > > >>> > >> > reporting how catalogs are supported from a syntax
>> perspective.
>> > > >>> > >> >
>> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>> > > >>> > Additionally,
>> > > >>> > >> > Flight SQL doesn't really provide the concept of "initial"
>> > > >>> connection
>> > > >>> > >> > properties (such as a starting catalog) since Flight itself
>> is
>> > > >>> > stateless
>> > > >>> > >> > from a connection perspective.
>> > > >>> > >> >
>> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>> some
>> > > >>> > changes to
>> > > >>> > >> > the Flight SQL protocol:
>> > > >>> > >> > - Introduce the concept of connection-time properties
>> (perhaps
>> > > an
>> > > >>> > >> optional
>> > > >>> > >> > RPC for Flight SQL applications that need this)
>> > > >>> > >> > - Related to the above, expand the connection URL and Java
>> > > builder
>> > > >>> to
>> > > >>> > >> allow
>> > > >>> > >> > arbitrary application-specific properties.
>> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>> error
>> > > >>> codes
>> > > >>> > if
>> > > >>> > >> > this is not permitted.
>> > > >>> > >> >
>> > > >>> > >> >
>> > > >>> > >> > --
>> > > >>> > >> >
>> > > >>> > >> > *James Duong*
>> > > >>> > >> > Lead Software Developer
>> > > >>> > >> > Bit Quill Technologies Inc.
>> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> > > >>> > >> > https://www.bitquilltech.com
>> > > >>> > >> >
>> > > >>> > >> > This email message is for the sole use of the intended
>> > > recipient(s)
>> > > >>> > and
>> > > >>> > >> may
>> > > >>> > >> > contain confidential and privileged information.  Any
>> > > unauthorized
>> > > >>> > >> review,
>> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you are
>> not
>> > > the
>> > > >>> > >> > intended recipient, please contact the sender by reply
>> email and
>> > > >>> > destroy
>> > > >>> > >> > all copies of the original message.  Thank you.
>> > > >>> > >>
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > --
>> > > >>> > > *James Henderson*
>> > > >>> > > XTDB Developer at *JUXT*
>> > > >>> > > Email jms@juxt.pro
>> > > >>> > > Website https://juxt.pro
>> > > >>> > >
>> > > >>> > > [image: photo]
>> > > >>> >
>> > > >>>
>> > > >>>
>> > > >>> --
>> > > >>> *James Henderson*
>> > > >>> XTDB Developer at *JUXT*
>> > > >>> Email jms@juxt.pro
>> > > >>> Website https://juxt.pro
>> > > >>>
>> > > >>> [image: photo]
>> > > >>>
>> > > >>
>> > > >>
>> > > >> --
>> > > >>
>> > > >> *James Duong*
>> > > >> Lead Software Developer
>> > > >> Bit Quill Technologies Inc.
>> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> > > >> https://www.bitquilltech.com
>> > > >>
>> > > >> This email message is for the sole use of the intended recipient(s)
>> and
>> > > >> may contain confidential and privileged information.  Any
>> unauthorized
>> > > >> review, use, disclosure, or distribution is prohibited.  If you are
>> not
>> > > the
>> > > >> intended recipient, please contact the sender by reply email and
>> destroy
>> > > >> all copies of the original message.  Thank you.
>> > > >>
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > *James Duong*
>> > > > Lead Software Developer
>> > > > Bit Quill Technologies Inc.
>> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> > > > https://www.bitquilltech.com
>> > > >
>> > > > This email message is for the sole use of the intended recipient(s)
>> and
>> > > may
>> > > > contain confidential and privileged information.  Any unauthorized
>> > > review,
>> > > > use, disclosure, or distribution is prohibited.  If you are not the
>> > > > intended recipient, please contact the sender by reply email and
>> destroy
>> > > > all copies of the original message.  Thank you.
>> > >
>> >
>> >
>> > --
>> >
>> > *James Duong*
>> > Lead Software Developer
>> > Bit Quill Technologies Inc.
>> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
>> > https://www.bitquilltech.com
>> >
>> > This email message is for the sole use of the intended recipient(s) and
>> may
>> > contain confidential and privileged information.  Any unauthorized
>> review,
>> > use, disclosure, or distribution is prohibited.  If you are not the
>> > intended recipient, please contact the sender by reply email and destroy
>> > all copies of the original message.  Thank you.
>>

Re: DISCUSS: [FlightSQL] Catalog support

Posted by Andrew Lamb <al...@influxdata.com>.
Sorry for the late reply -- thank you James and David for this discussion.

I agree that adding Catalog support would be a valuable addition to Flight
SQL, and it recently came up as we begin to implement Flight SQL in
InfluxDB IOx [1].

> - A standard URI scheme for Flight SQL that can be used by multiple
client APIs (JDBC, ADBC, etc.)

I agree this would be very valuable, along with a standard way (ideally
with HTTP headers) to send this information as part of the FlightSQL gRPC
requests.

> I'd suggest we define session management features explicitly in Flight
(while being optional).

I agree it is critical that server-side state is not required to implement
FlightSQL. Stateful connections would likely complicate deploying FlightSQL
in distributed systems. I suggest it should be possible to implement any
session management features by sending the entire session state with the
request, if desired.

I don't have a strong opinion about the merits of including explicit
session management features in FlightSQL. It seems to me that keeping the
API surface of FlightSQL minimal and implementation flexibility maximal
should be the default. However, if JDBC/ODBC driver compatibility would be
improved with explicit state management APIs, then adding them to FlightSQL
seems like a good idea to me.

Thanks again -- it is amazing to hit some issue in design and then find out
the Arrow community is already hard at work on a solution.

Andrew

[1] https://github.com/influxdata/influxdb_iox/issues/6102

On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ra...@gmail.com> wrote:

> Just to chime in on this, one thing I'm curious about is whether there
> will be support for user-defined catalog/schema hierarchy depth?
>
> This comment that James made does seem reasonable to me
> > scheme://<host>:<port>/path-1/path-2/.../path-n
>
> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>
> At Hasura, what we do is have an alias "FullyQualifiedName" which is
> just "Array<String>"
> and the identifier to some element in a data source is always
> fully-qualified:
>
> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>
> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
> ["mongo", "db1",  "collection_a", "field_a"]
> ["csv_adapter", "myfile.csv", "col_x"]
>
> On Wed, Nov 30, 2022 at 6:31 PM James Duong
> <ja...@bitquilltech.com.invalid> wrote:
> >
> > Our current convention of sending connection properties as headers with
> > every request has the benefit of making statefulness optional, but has
> the
> > drawback of sending redundant, unused properties on requests after the
> > first, which increases the payload size unnecessarily.
> >
> > I'd suggest we define session management features explicitly in Flight
> > (while being optional). The suggestion is to make this part of Flight as
> an
> > optional feature, rather than Flight SQL due to its applicability outside
> > of just database access.
> >
> > Creating a session:
> > - The Flight client supplies a New-Session header which has key-value
> pairs
> > for initial session options. This header can be applied to any RPC call,
> > but logically should be the first one the client makes.
> > - The server should send a Set-Cookie header back containing some
> > server-side representation of the session that the client can use in
> > subsequent requests.
> > - The path specified in the URI is sent as a "Catalog" session option.
> >
> > Modifying session options:
> > - A separate RPC call that takes in a Stream<string, string> representing
> > each session option that is being modified and returns a stream of
> statuses
> > to indicate if the setting change was accepted.
> > - This RPC call is only valid when the Cookie header is used.
> > - It is up to the server to define if a failed session property change is
> > fatal or if other properties can continue to be set.
> >
> > Closing a session:
> > - A separate RPC call that tells the server to drop the session specified
> > by the Cookie header.
> >
> > Notes:
> > A Flight SQL client would check if session management RPCs are supported
> > through a new GetSqlInfo property. A Flight client doesn't have a way to
> do
> > this generically, but there could be an application-specific RPC or
> header
> > that reports this metadata.
> >
> > The O/JDBC and ADBC drivers would need to be updated to programmatically
> > check for session management RPCs. If unsupported, then use the old
> > behavior of sending all properties as headers with each request. If
> > supported, make use of the New-Session header and drop the session when
> > closing the client-side connection.
> >
> > It's a bit asymmetric that creating a new session is done by applying a
> > header, but closing a session is an RPC call. This was so that session
> > creation doesn't introduce another round trip before the first real data
> > request. If there's a way to batch RPC calls it might be better to make
> > session creation an RPC call.
> >
> > On Tue, Nov 22, 2022 at 3:16 PM David Li <li...@apache.org> wrote:
> >
> > > It sounds reasonable - then there are three points:
> > >
> > > - A standard URI scheme for Flight SQL that can be used by multiple
> client
> > > APIs (JDBC, ADBC, etc.)
> > > - A standard scheme for session data (likely header/cookie-based)
> > > - A mapping from URI parameters and fields to session data
> > >
> > >
> > >
> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
> > > > Just following up on this and if there are any thoughts.
> > > >
> > > > The purpose would be to standardize how we specify access to some
> named
> > > > logical grouping of data. This would make it easy to model
> catalog/schema
> > > > semantics in Flight SQL.
> > > >
> > > > Having this be part of the connection URI makes it similar to
> specifying
> > > a
> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
> for
> > > end
> > > > users to work with and modify.
> > > >
> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <jamesd@bitquilltech.com
> >
> > > wrote:
> > > >
> > > >> As for surfacing catalogs itself, perhaps we allow the URI take in a
> > > path
> > > >> and treat that as a way of specifying a multi-level resource that
> which
> > > the
> > > >> FlightClient is connecting to:
> > > >>
> > > >> eg a connection URI of the form:
> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
> > > >>
> > > >> The FlightClient could send this path as either a header or a
> session
> > > >> property (with a neutral name like 'resource-path'). Flight SQL
> > > Producers
> > > >> could interpret this as a catalog or schema.
> > > >> eg
> > > >> grpc://<host>:<port>/catalog/schema
> > > >>
> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <jm...@juxt.pro>
> wrote:
> > > >>
> > > >>> Sounds good to me.
> > > >>>
> > > >>> > Are you interested in writing up a (sketch of a) proposal?
> > > >>>
> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be a
> bit
> > > >>> intermittent.
> > > >>>
> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <li...@apache.org>
> wrote:
> > > >>>
> > > >>> > Hey James H.,
> > > >>> >
> > > >>> > That would make sense to me. So it sounds like we'd want
> > > >>> >
> > > >>> > - Formal specification of using cookies/headers to mark a
> 'session'
> > > (I
> > > >>> > guess this will be a little inconsistent with transactions,
> though)
> > > >>> > - Adding RPCs to query session values
> > > >>> > - Adding RPCs to set session values
> > > >>> > - Listing standard values and types
> > > >>> >
> > > >>> > Some things may require more consideration, e.g. transaction
> > > isolation
> > > >>> > might be better off as part of the transaction RPCs than an
> ambient
> > > >>> > property. Are you interested in writing up a (sketch of a)
> proposal?
> > > >>> >
> > > >>> > -David
> > > >>> >
> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
> > > >>> > > Similarly, we're also currently considering how best to
> implement
> > > >>> some of
> > > >>> > > the SQL standard session variables in our Flight SQL server -
> > > things
> > > >>> like
> > > >>> > > current transaction isolation level, access mode, time zone
> etc,
> > > which
> > > >>> > seem
> > > >>> > > to have similar properties to the (traditional) connection's
> > > current
> > > >>> > > catalog. We'd (perhaps naively) looked at solutions involving
> the
> > > >>> Flight
> > > >>> > > client's `ClientCookieMiddleware`, but might these have
> > > standardised
> > > >>> > > support within Flight SQL itself eventually?
> > > >>> > >
> > > >>> > > Cheers,
> > > >>> > >
> > > >>> > > James
> > > >>> > >
> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <li...@apache.org>
> wrote:
> > > >>> > >
> > > >>> > >> I think having better support for this makes sense, but
> perhaps we
> > > >>> can
> > > >>> > >> find a way to make it not tied to the connection itself? For
> > > >>> instance,
> > > >>> > in
> > > >>> > >> the same way transactions were implemented (as a handle). Or
> > > rather,
> > > >>> > >> instead of adding connection statefulness to Flight RPC, I'd
> > > rather
> > > >>> try
> > > >>> > to
> > > >>> > >> work within the gRPC/RPC paradigm.
> > > >>> > >>
> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
> concept of
> > > >>> > catalogs
> > > >>> > >> as
> > > >>> > >> > containers of database schemas. Users can usually specify an
> > > >>> initial
> > > >>> > >> > catalog during the connection process, list catalogs, and
> > > sometimes
> > > >>> > >> change
> > > >>> > >> > catalogs during the session.
> > > >>> > >> >
> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
> limited. The
> > > >>> > protocol
> > > >>> > >> > provides a way to list catalogs as well as metadata in
> > > SqlTypeInfo
> > > >>> for
> > > >>> > >> > reporting how catalogs are supported from a syntax
> perspective.
> > > >>> > >> >
> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
> > > >>> > Additionally,
> > > >>> > >> > Flight SQL doesn't really provide the concept of "initial"
> > > >>> connection
> > > >>> > >> > properties (such as a starting catalog) since Flight itself
> is
> > > >>> > stateless
> > > >>> > >> > from a connection perspective.
> > > >>> > >> >
> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
> some
> > > >>> > changes to
> > > >>> > >> > the Flight SQL protocol:
> > > >>> > >> > - Introduce the concept of connection-time properties
> (perhaps
> > > an
> > > >>> > >> optional
> > > >>> > >> > RPC for Flight SQL applications that need this)
> > > >>> > >> > - Related to the above, expand the connection URL and Java
> > > builder
> > > >>> to
> > > >>> > >> allow
> > > >>> > >> > arbitrary application-specific properties.
> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
> error
> > > >>> codes
> > > >>> > if
> > > >>> > >> > this is not permitted.
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> > --
> > > >>> > >> >
> > > >>> > >> > *James Duong*
> > > >>> > >> > Lead Software Developer
> > > >>> > >> > Bit Quill Technologies Inc.
> > > >>> > >> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > > >>> > >> > https://www.bitquilltech.com
> > > >>> > >> >
> > > >>> > >> > This email message is for the sole use of the intended
> > > recipient(s)
> > > >>> > and
> > > >>> > >> may
> > > >>> > >> > contain confidential and privileged information.  Any
> > > unauthorized
> > > >>> > >> review,
> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you are
> not
> > > the
> > > >>> > >> > intended recipient, please contact the sender by reply
> email and
> > > >>> > destroy
> > > >>> > >> > all copies of the original message.  Thank you.
> > > >>> > >>
> > > >>> > >
> > > >>> > >
> > > >>> > > --
> > > >>> > > *James Henderson*
> > > >>> > > XTDB Developer at *JUXT*
> > > >>> > > Email jms@juxt.pro
> > > >>> > > Website https://juxt.pro
> > > >>> > >
> > > >>> > > [image: photo]
> > > >>> >
> > > >>>
> > > >>>
> > > >>> --
> > > >>> *James Henderson*
> > > >>> XTDB Developer at *JUXT*
> > > >>> Email jms@juxt.pro
> > > >>> Website https://juxt.pro
> > > >>>
> > > >>> [image: photo]
> > > >>>
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> *James Duong*
> > > >> Lead Software Developer
> > > >> Bit Quill Technologies Inc.
> > > >> Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > > >> https://www.bitquilltech.com
> > > >>
> > > >> This email message is for the sole use of the intended recipient(s)
> and
> > > >> may contain confidential and privileged information.  Any
> unauthorized
> > > >> review, use, disclosure, or distribution is prohibited.  If you are
> not
> > > the
> > > >> intended recipient, please contact the sender by reply email and
> destroy
> > > >> all copies of the original message.  Thank you.
> > > >>
> > > >
> > > >
> > > > --
> > > >
> > > > *James Duong*
> > > > Lead Software Developer
> > > > Bit Quill Technologies Inc.
> > > > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > > > https://www.bitquilltech.com
> > > >
> > > > This email message is for the sole use of the intended recipient(s)
> and
> > > may
> > > > contain confidential and privileged information.  Any unauthorized
> > > review,
> > > > use, disclosure, or distribution is prohibited.  If you are not the
> > > > intended recipient, please contact the sender by reply email and
> destroy
> > > > all copies of the original message.  Thank you.
> > >
> >
> >
> > --
> >
> > *James Duong*
> > Lead Software Developer
> > Bit Quill Technologies Inc.
> > Direct: +1.604.562.6082 | jamesd@bitquilltech.com
> > https://www.bitquilltech.com
> >
> > This email message is for the sole use of the intended recipient(s) and
> may
> > contain confidential and privileged information.  Any unauthorized
> review,
> > use, disclosure, or distribution is prohibited.  If you are not the
> > intended recipient, please contact the sender by reply email and destroy
> > all copies of the original message.  Thank you.
>