Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/12/08 04:27:52 UTC

[GitHub] [pulsar] zhicwu commented on pull request #18774: [feat][io] Upgarde ClickHouse driver to support loadbalance policy

zhicwu commented on PR #18774:
URL: https://github.com/apache/pulsar/pull/18774#issuecomment-1341982189

   Hi @tisonkun, `com.clickhouse/clickhouse-jdbc:0.3.2-patch11` is a complete rewrite, so not surprisingly it's not fully backward-compatible. It seems `ClickHouseJdbcAutoSchemaSink` is the only usage without any tests. As a result, we cannot tell from CI whether anything is broken. So I'm listing the key changes here to help you understand and estimate the impact of upgrading the driver:
   
   1. Apache HttpClient 4.x was replaced by `Http(s)URLConnection`, meaning the JVM-wide HTTP [properties](https://docs.oracle.com/javase/8/docs/api/java/net/doc-files/net-properties.html#MiscHTTP) and [proxies](https://docs.oracle.com/javase/8/docs/api/java/net/doc-files/net-properties.html#Proxies) are now taken into account, which may cause problems in a complex runtime
   2. The data format between client and server was changed from TabSeparated to RowBinary, meaning we can no longer pass `null` to a non-nullable column
   3. Type mapping and timezone handling were changed as well, meaning the same query may give different results, though I guess this mainly affects timestamps with a timezone
   4. Non-standard table types were introduced in metadata: `DICTIONARY`, `LOG TABLE`, `MEMORY TABLE`, `REMOTE TABLE`, `SYSTEM TABLE`, `TEMPORARY TABLE`, so you may find that some tables seem to have disappeared (for example, if code filters on the standard `TABLE` type)
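
   To illustrate item 4, here is a minimal sketch of listing tables while asking for the non-standard types explicitly. The class and method names (`ClickHouseTableLister`, `listTables`) are hypothetical and not part of the sink; only `DatabaseMetaData.getTables` is standard JDBC, and I have not run this against a real server:

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ClickHouseTableLister {
    // "TABLE" is the standard JDBC type; the others are the non-standard
    // types named in item 4 above.
    static final String[] ALL_TABLE_TYPES = {
        "TABLE", "DICTIONARY", "LOG TABLE", "MEMORY TABLE",
        "REMOTE TABLE", "SYSTEM TABLE", "TEMPORARY TABLE"
    };

    // Returns "TYPE: name" entries for every table matching the schema pattern.
    static List<String> listTables(Connection conn, String schemaPattern) throws SQLException {
        List<String> names = new ArrayList<>();
        DatabaseMetaData meta = conn.getMetaData();
        // Passing the full type list avoids silently dropping tables that the
        // new driver no longer reports as plain "TABLE".
        try (ResultSet rs = meta.getTables(null, schemaPattern, "%", ALL_TABLE_TYPES)) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_TYPE") + ": " + rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }
}
```

   Passing `null` instead of a type array asks JDBC for all types, which may be simpler if the caller does not need to filter.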
   
   Having said that, since v0.3.2 ships with both the legacy and the new JDBC drivers (`ru.yandex.clickhouse.ClickHouseDriver` vs. `com.clickhouse.jdbc.ClickHouseDriver`), you may add an option to let users choose between them during the transition period. Starting from v0.3.3, there'll be no legacy driver anymore.
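
   A rough sketch of what such an option could look like. The flag `useLegacyDriver` and the class `ClickHouseDriverChoice` are assumptions made up for illustration, not existing sink config; the two driver class names are the ones mentioned above:

```java
import java.sql.Driver;

// Hypothetical helper, for illustration only.
final class ClickHouseDriverChoice {
    static final String LEGACY_DRIVER = "ru.yandex.clickhouse.ClickHouseDriver";
    static final String NEW_DRIVER = "com.clickhouse.jdbc.ClickHouseDriver";

    // Maps the (assumed) user-facing flag to a driver class name.
    static String driverClassName(boolean useLegacyDriver) {
        return useLegacyDriver ? LEGACY_DRIVER : NEW_DRIVER;
    }

    // Instantiates the chosen driver reflectively; this throws if the
    // clickhouse-jdbc jar is not on the classpath.
    static Driver load(boolean useLegacyDriver) throws ReflectiveOperationException {
        return (Driver) Class.forName(driverClassName(useLegacyDriver))
                .getDeclaredConstructor()
                .newInstance();
    }
}
```

   Defaulting the flag to the new driver would probably make the eventual move to v0.3.3 a no-op for most users, but that is a design choice for the maintainers.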
   
   Lastly, some more comments for you to consider regarding ClickHouse sink:
   * support nested data types like `Array`, `Tuple`, `Map`, `JSON`
   * ClickHouse supports [multiple data formats](https://clickhouse.com/docs/en/interfaces/formats/) and `Avro` is one of them, so maybe we don't have to convert Avro data types to JDBC types and then to ClickHouse data types
   

