Posted to commits@seatunnel.apache.org by "MonsterChenzhuo (via GitHub)" <gi...@apache.org> on 2023/05/09 10:03:25 UTC

[GitHub] [incubator-seatunnel] MonsterChenzhuo commented on issue #2808: [Feature][Connector-V2-Clickhouse] Clickhouse Source support multi-split read.

MonsterChenzhuo commented on issue #2808:
URL: https://github.com/apache/incubator-seatunnel/issues/2808#issuecomment-1539837013

   @hailin0 
   To implement sharded (multi-split) reading in the ClickHouse source, the source has to stop submitting SQL queries over HTTP and submit them over JDBC instead, consistent with what the sink side already does. The reasons are as follows:
   
   1. HTTP has limited concurrent read capability.
   2. HTTP has limited data transfer capacity.
   3. Sharding requires JDBC's prepared statement feature. For example, the SQL query "select a, b from test" can be rewritten as the prepared statement "select a, b from test where a between ? and ?". Then, by applying a predefined sharding strategy, the original query can be split into multiple queries such as:
   "select a, b from test where a between 1 and 50",
   "select a, b from test where a between 51 and 100",
   "select a, b from test where a between 101 and 200".
   Because BETWEEN is inclusive on both ends, adjacent ranges must not share a boundary value, otherwise those rows would be read twice.
   
   A SplitEnumerator can then distribute these queries to different readers, which gives the source parallel read capability.
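
   The splitting step described above can be sketched roughly as follows. This is only an illustration, not the connector's actual code: the class RangeSplitSketch, its Split type, and the even-sized splitting strategy are all hypothetical, and it assumes the split column is an integer whose min and max are already known. The render method substitutes the placeholders purely for display; a real reader would bind the bounds with PreparedStatement.setLong.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from SeaTunnel itself.
final class RangeSplitSketch {

    /** One inclusive [lower, upper] range to bind into "between ? and ?". */
    static final class Split {
        final long lower;
        final long upper;

        Split(long lower, long upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

    /**
     * Cuts the inclusive range [min, max] into at most numSplits
     * non-overlapping, contiguous inclusive ranges of near-equal size.
     * Assumes (max - min + 1) does not overflow a long.
     */
    static List<Split> split(long min, long max, int numSplits) {
        if (numSplits <= 0 || min > max) {
            throw new IllegalArgumentException("need numSplits > 0 and min <= max");
        }
        long total = max - min + 1;
        long base = total / numSplits;
        long remainder = total % numSplits;

        List<Split> splits = new ArrayList<>();
        long lower = min;
        for (int i = 0; i < numSplits; i++) {
            // The first `remainder` splits take one extra value each.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // more splits requested than distinct values
            }
            long upper = lower + size - 1;
            splits.add(new Split(lower, upper));
            lower = upper + 1; // next range starts after this one ends
        }
        return splits;
    }

    /** Fills the two placeholders for display only; real code would bind them. */
    static String render(String template, Split s) {
        return template
                .replaceFirst("\\?", Long.toString(s.lower))
                .replaceFirst("\\?", Long.toString(s.upper));
    }

    public static void main(String[] args) {
        String template = "select a, b from test where a between ? and ?";
        for (Split s : split(1, 200, 3)) {
            System.out.println(render(template, s));
        }
    }
}
```

   With min = 1, max = 200 and three splits, this produces the ranges 1 to 67, 68 to 134 and 135 to 200. Each range starts one past the previous upper bound, so no row is matched by two queries.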


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org