Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/15 14:41:50 UTC

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

lidavidm commented on issue #64:
URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215092467

   So digging around, I would probably vote that for ADBC, we roughly follow the JDBC/ODBC concurrency/thread safety guarantees.
   
   Here, a statement is just a handle/object for configuring query state, and not necessarily a prepared statement (that is, a statement in JDBC/ODBC terminology, or a Cursor in Python DBAPI terms).
   
   - Connections (and in general, all ADBC objects) must allow serialized access, but not necessarily concurrent access. This precludes thread-local storage. (Some drivers, like Flight SQL, may support concurrent access.)
   - Multiple statements can be created from a single connection, but the driver may impose limitations on "active" statements (I would vote to borrow ODBC's concept here: the driver must allow you to create and set up multiple queries, but may restrict you to only being able to execute/fetch one at a time.)
   - An individual statement can be used multiple times, but result sets cannot be read concurrently (that is: executing a statement invalidates prior result sets)
   - Prepared statements follow regular statements in behavior. (This makes Golang support a little weird; Golang is exceptional in allowing concurrent use of prepared statements)
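
To make "serialized but not concurrent access" concrete, here is a minimal sketch using Python's standard-library `sqlite3` module as a stand-in for an ADBC driver. This is not ADBC code: the `SerializedConnection` wrapper and its `run` method are hypothetical names introduced only for illustration. The idea is that several threads may share one connection as long as the caller guarantees that only one thread touches it at a time.

```python
import sqlite3
import threading


class SerializedConnection:
    """Hypothetical wrapper: one connection, many threads, one caller at a time."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()

    def run(self, sql, params=()):
        # The lock serializes access; the connection is never used concurrently.
        with self._lock:
            cur = self._conn.cursor()
            try:
                cur.execute(sql, params)
                return cur.fetchall()
            finally:
                cur.close()


# check_same_thread=False lets other threads use the connection; sqlite3
# then leaves serialization to the caller, which is what the lock provides.
conn = SerializedConnection(sqlite3.connect(":memory:", check_same_thread=False))
conn.run("CREATE TABLE t (x INTEGER)")


def worker(i):
    conn.run("INSERT INTO t VALUES (?)", (i,))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = conn.run("SELECT COUNT(*) FROM t")[0][0]
print(total)
```

Note that this only works because the connection keeps no thread-local state, which is why the first bullet rules out thread-local storage. A driver that does support concurrent access (as Flight SQL may) would simply not need the lock.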
   
   ## Connections
   
   Flight SQL - connections are safe for concurrent access (Send + Sync)
   Golang database/sql - it's not stated whether `Conn` is safe for concurrent use
   libpq - a single connection cannot be used by multiple threads at once, but the library is thread-safe (Send)
   JDBC - serial access is safe (Send) (e.g. see [Oracle driver docs](https://docs.oracle.com/cd/B19306_01/java.102/b14355/apxtips.htm#i1005436)) but concurrent access is driver-dependent
   ODBC - [all handles must be thread safe][odbc-multithreading], but the driver may serialize accesses internally (Send + Sync)
   
   ## Statements
   
   The question here is whether you should be able to create two AdbcStatements and use both of them, either from the same thread or different threads. (This question will sound a little funny to users of some APIs; I'm thinking about JDBC/ODBC/DBAPI here)
   
   Flight SQL - no explicit concept of a statement outside of a prepared statement. It is not defined whether a FlightInfo can be reused or not, but the individual endpoints within can be accessed in parallel
   Golang database/sql - no explicit concept of a statement outside of a prepared statement
   libpq - the protocol only allows for a single query at a time (pipelining can get around this)
   JDBC - "Each Connection object can create multiple Statement objects that may be used concurrently by the program." (JDBC 4.1 spec). And "only one ResultSet object per Statement object can be open at the same time".
   ODBC - [multiple "active" statements are allowed on a single connection][odbc-statement], but the driver may limit this number (Send + Sync)
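
As a rough illustration of the JDBC/ODBC/DBAPI model described above, the sketch below again uses Python's standard-library `sqlite3` as a stand-in, with a DBAPI cursor playing the role of a statement. It is only an analogy, not ADBC behavior: SQLite happens to allow several active cursors on one connection, whereas a driver may restrict how many can be executing or fetching at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# Two "statements" created from the same connection, both active at once.
stmt_a = conn.cursor()
stmt_b = conn.cursor()
stmt_a.execute("SELECT x FROM t ORDER BY x")
stmt_b.execute("SELECT x * 10 FROM t ORDER BY x")

first_a = stmt_a.fetchone()  # first row of stmt_a's result set
first_b = stmt_b.fetchone()  # stmt_b's result set is independent

# Re-executing stmt_a discards its unread rows: one result set per statement.
stmt_a.execute("SELECT x + 100 FROM t ORDER BY x")
rest_a = stmt_a.fetchall()

print(first_a, first_b, rest_a)
```

After the second `execute`, the remaining rows `(2,)` and `(3,)` of the first query on `stmt_a` are no longer reachable, which mirrors the "executing a statement invalidates prior result sets" rule, while `stmt_b` is unaffected.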
   
   ## Prepared statements
   
   Flight SQL - it is not defined whether prepared statements are thread safe or not. However, parameter binding is stateful so it doesn't really make sense to try to use them concurrently
   Golang database/sql - allows concurrent use from multiple goroutines (this is unusual!)
   libpq - since connections are not thread safe, neither are prepared statements
   JDBC - prepared statements are stateful (e.g. parameters) so it does not make sense to use them concurrently
   ODBC - same as statements
   
   [odbc-multithreading]: https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/multithreading?view=sql-server-ver16
   [odbc-statement]: https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/statement-handles?view=sql-server-ver16

