Posted to notifications@shardingsphere.apache.org by "azexcy (via GitHub)" <gi...@apache.org> on 2023/02/13 13:41:45 UTC

[GitHub] [shardingsphere] azexcy opened a new pull request, #24146: Using streaming query at pipeline inventory dump and data consistency check

azexcy opened a new pull request, #24146:
URL: https://github.com/apache/shardingsphere/pull/24146

   
   Changes proposed in this pull request:
     - Use streaming queries for the pipeline inventory dump and the data consistency check
   
   ---
   
   Before committing this PR, I'm sure that I have checked the following options:
   - [ ] My code follows the [code of conduct](https://shardingsphere.apache.org/community/en/involved/conduct/code/) of this project.
   - [ ] I have self-reviewed the commit code.
   - [ ] I have (or in comment I request) added corresponding labels for the pull request.
   - [ ] I have passed maven check locally : `./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e`.
   - [ ] I have made corresponding changes to the documentation.
   - [ ] I have added corresponding unit tests for my changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@shardingsphere.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [shardingsphere] sandynz merged pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz merged PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146




[GitHub] [shardingsphere] azexcy commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "azexcy (via GitHub)" <gi...@apache.org>.
azexcy commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1429051689

   Increasing `MAX_CONNECTIONS_SIZE_PER_QUERY` may trigger the following error, so I removed that change.
   ```
   java.sql.SQLException: Can not get 10 connections one time, partition succeed connection(2) have released. Please consider increasing the `maxPoolSize` of the data sources or decreasing the `max-connections-size-per-query` in properties.
   ```




[GitHub] [shardingsphere] sandynz commented on a diff in pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz commented on code in PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#discussion_r1105214956


##########
jdbc/core/src/main/java/org/apache/shardingsphere/driver/data/pipeline/datasource/creator/ShardingSpherePipelineDataSourceCreator.java:
##########
@@ -41,6 +42,8 @@ public DataSource createPipelineDataSource(final Object dataSourceConfig) throws
         enableRangeQueryForInline(shardingRuleConfig);
         rootConfig.setDatabaseName(null);
         rootConfig.setSchemaName(null);
+        // TODO Set a large enough value to make sure the JDBC streaming query parameter takes effect
+        rootConfig.getProps().put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), 100000);

Review Comment:
   We need a test for the case where the data source's max connection count is less than `MAX_CONNECTIONS_SIZE_PER_QUERY`.



##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/util/JDBCStreamQueryUtil.java:
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.shardingsphere.data.pipeline.core.util;
+
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+
+/**
+ * JDBC stream query util.
+ */
+public final class JDBCStreamQueryUtil {
+    
+    /**
+     * Generate MySQL stream query prepared statement.
+     *
+     * @param connection connection
+     * @param sql sql
+     * @return stream query prepared statement
+     * @throws SQLException SQL exception
+     */
+    public static PreparedStatement generateMySQLStreamQueryPreparedStatement(final Connection connection, final String sql) throws SQLException {
+        PreparedStatement result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
+        result.setFetchSize(Integer.MIN_VALUE);
+        return result;
+    }
+    
+    /**
+     * Generate PostgreSQL stream query prepared statement.
+     *
+     * @param connection connection
+     * @param sql sql
+     * @param fetchSize fetch size
+     * @return stream query prepared statement
+     * @throws SQLException SQL exception
+     */
+    public static PreparedStatement generatePostgreSQLStreamQueryPreparedStatement(final Connection connection, final String sql, final int fetchSize) throws SQLException {
+        PreparedStatement result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
+        connection.setAutoCommit(false);
+        result.setFetchSize(fetchSize);

Review Comment:
   If `fetchSize` is not required for a streaming query, it's better not to set `fetchSize` here.
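
Some context for the question above, offered as a hedged note: as far as I understand the PostgreSQL JDBC driver documentation, cursor-based fetching is only used when autocommit is off, the result set is `TYPE_FORWARD_ONLY`, and the fetch size is greater than zero, so `fetchSize` does appear to be needed on the PostgreSQL path. The sketch below only encodes those conditions as a predicate. `PostgreSQLCursorConditions` and `wouldUseCursor` are hypothetical names, not part of the driver or of this PR.

```java
import java.sql.ResultSet;

// Hypothetical helper: models the documented preconditions for cursor-based
// fetching in the PostgreSQL JDBC driver. It does not talk to a database.
final class PostgreSQLCursorConditions {

    private PostgreSQLCursorConditions() {
    }

    // Returns true when the settings would allow the driver to fetch rows in
    // batches instead of caching the whole result set. The driver also expects
    // a single statement per query, which is not modeled here.
    static boolean wouldUseCursor(final boolean autoCommit, final int resultSetType, final int fetchSize) {
        return !autoCommit && ResultSet.TYPE_FORWARD_ONLY == resultSetType && fetchSize > 0;
    }
}
```

Under this model, leaving the fetch size at its default of 0 would fall back to loading all rows, even with autocommit disabled.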



##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/check/consistency/algorithm/DataMatchDataConsistencyCalculateAlgorithm.java:
##########
@@ -154,9 +158,18 @@ private CalculationContext createCalculationContext(final DataConsistencyCalcula
     
     private void fulfillCalculationContext(final CalculationContext calculationContext, final DataConsistencyCalculateParameter param) throws SQLException {
         String sql = getQuerySQL(param);
-        PreparedStatement preparedStatement = setCurrentStatement(calculationContext.getConnection().prepareStatement(sql));
+        DatabaseType databaseType = TypedSPILoader.getService(DatabaseType.class, param.getDatabaseType());
+        PreparedStatement preparedStatement;
+        if (databaseType instanceof MySQLDatabaseType) {
+            preparedStatement = setCurrentStatement(JDBCStreamQueryUtil.generateMySQLStreamQueryPreparedStatement(calculationContext.getConnection(), sql));
+        } else if (databaseType instanceof PostgreSQLDatabaseType || databaseType instanceof OpenGaussDatabaseType) {
+            preparedStatement = setCurrentStatement(JDBCStreamQueryUtil.generatePostgreSQLStreamQueryPreparedStatement(calculationContext.getConnection(), sql, chunkSize));
+        } else {
+            log.warn("not support {} streaming query now, pay attention to memory usage", databaseType.getType());
+            preparedStatement = setCurrentStatement(calculationContext.getConnection().prepareStatement(sql));
+            preparedStatement.setFetchSize(chunkSize);
+        }

Review Comment:
   Could we extract this code block into `JDBCStreamQueryUtil`, since the same block also appears in `InventoryDumper`?
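
One possible shape for the extraction suggested above, as a sketch only. It assumes the database type arrives as a plain string ("MySQL", "PostgreSQL", "openGauss") standing in for the project's `DatabaseType` SPI. `StreamQuerySketch`, `resolveFetchSize`, and `generateStreamQueryPreparedStatement` are hypothetical names. The statement settings mirror the ones shown in the diff.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch of a single entry point that both callers could share.
final class StreamQuerySketch {

    private StreamQuerySketch() {
    }

    // MySQL Connector/J treats Integer.MIN_VALUE as the row-by-row streaming
    // signal; every other type in this sketch uses the caller's chunk size.
    static int resolveFetchSize(final String databaseType, final int chunkSize) {
        return "MySQL".equals(databaseType) ? Integer.MIN_VALUE : chunkSize;
    }

    static PreparedStatement generateStreamQueryPreparedStatement(final String databaseType, final Connection connection,
                                                                  final String sql, final int chunkSize) throws SQLException {
        PreparedStatement result;
        if ("MySQL".equals(databaseType)) {
            result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        } else if ("PostgreSQL".equals(databaseType) || "openGauss".equals(databaseType)) {
            result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
            // Cursor-based fetching needs autocommit to be off.
            connection.setAutoCommit(false);
        } else {
            // No streaming support assumed: fall back to a plain statement and
            // rely on the fetch size as a hint, so memory usage needs watching.
            result = connection.prepareStatement(sql);
        }
        result.setFetchSize(resolveFetchSize(databaseType, chunkSize));
        return result;
    }
}
```

With something like this in place, both `DataMatchDataConsistencyCalculateAlgorithm` and `InventoryDumper` could call one method instead of repeating the `instanceof` chain.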





[GitHub] [shardingsphere] azexcy commented on a diff in pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "azexcy (via GitHub)" <gi...@apache.org>.
azexcy commented on code in PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#discussion_r1105238188


##########
jdbc/core/src/main/java/org/apache/shardingsphere/driver/data/pipeline/datasource/creator/ShardingSpherePipelineDataSourceCreator.java:
##########
@@ -41,6 +42,8 @@ public DataSource createPipelineDataSource(final Object dataSourceConfig) throws
         enableRangeQueryForInline(shardingRuleConfig);
         rootConfig.setDatabaseName(null);
         rootConfig.setSchemaName(null);
+        // TODO Set a large enough value to make sure the JDBC streaming query parameter takes effect
+        rootConfig.getProps().put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), 100000);

Review Comment:
   When the number of database connections is less than the number of slices, the following error occurs:
   ```
   java.sql.SQLException: Can not get 10 connections one time, partition succeed connection(2) have released. Please consider increasing the `maxPoolSize` of the data sources or decreasing the `max-connections-size-per-query` in properties.
   ```
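
To make the trade-off in this error concrete, here is a simplified model of how I understand ShardingSphere's documented connection mode rule. It is an assumption-laden sketch, not the project's code. When `max-connections-size-per-query` is at least the number of SQL units routed to a data source, each unit gets its own connection (memory-strictly, which allows streaming). Otherwise units share connections (connection-strictly). `ConnectionModeSketch` and its methods are hypothetical names.

```java
// Hypothetical model of the connection mode decision and the number of
// connections one query would request from a single data source.
final class ConnectionModeSketch {

    enum ConnectionMode { MEMORY_STRICTLY, CONNECTION_STRICTLY }

    private ConnectionModeSketch() {
    }

    static ConnectionMode decideMode(final int maxConnectionsSizePerQuery, final int sqlUnitCount) {
        return maxConnectionsSizePerQuery < sqlUnitCount ? ConnectionMode.CONNECTION_STRICTLY : ConnectionMode.MEMORY_STRICTLY;
    }

    // SQL units are split into groups; each group needs one connection.
    static int requiredConnections(final int maxConnectionsSizePerQuery, final int sqlUnitCount) {
        int unitsPerConnection = Math.max(ceilDiv(sqlUnitCount, maxConnectionsSizePerQuery), 1);
        return ceilDiv(sqlUnitCount, unitsPerConnection);
    }

    // The query can only proceed if the pool can hand out all connections at once.
    static boolean poolCanServe(final int maxPoolSize, final int maxConnectionsSizePerQuery, final int sqlUnitCount) {
        return requiredConnections(maxConnectionsSizePerQuery, sqlUnitCount) <= maxPoolSize;
    }

    private static int ceilDiv(final int dividend, final int divisor) {
        return (dividend + divisor - 1) / divisor;
    }
}
```

Under this model, 10 SQL units with the property set to 100000 would request 10 connections at once, which a smaller pool cannot provide. That is consistent with the message above, and with the suggestion to raise `maxPoolSize` or lower `max-connections-size-per-query`.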





[GitHub] [shardingsphere] codecov-commenter commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1428114885

   # [Codecov](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#24146](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ca7a316) into [master](https://codecov.io/gh/apache/shardingsphere/commit/35dd65883c486fd7de9cc8859dcd8bfa2d73dcd3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (35dd658) will **decrease** coverage by `0.04%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #24146      +/-   ##
   ============================================
   - Coverage     50.13%   50.10%   -0.04%     
     Complexity     1576     1576              
   ============================================
     Files          3258     3260       +2     
     Lines         53491    53494       +3     
     Branches       9834     9832       -2     
   ============================================
   - Hits          26816    26801      -15     
   - Misses        24312    24332      +20     
   + Partials       2363     2361       -2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...eator/ShardingSpherePipelineDataSourceCreator.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-amRiYy9jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9zaGFyZGluZ3NwaGVyZS9kcml2ZXIvZGF0YS9waXBlbGluZS9kYXRhc291cmNlL2NyZWF0b3IvU2hhcmRpbmdTcGhlcmVQaXBlbGluZURhdGFTb3VyY2VDcmVhdG9yLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...hm/DataMatchDataConsistencyCalculateAlgorithm.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL2NoZWNrL2NvbnNpc3RlbmN5L2FsZ29yaXRobS9EYXRhTWF0Y2hEYXRhQ29uc2lzdGVuY3lDYWxjdWxhdGVBbGdvcml0aG0uamF2YQ==) | `26.77% <0.00%> (-1.33%)` | :arrow_down: |
   | [...a/pipeline/core/ingest/dumper/InventoryDumper.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL2luZ2VzdC9kdW1wZXIvSW52ZW50b3J5RHVtcGVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...e/data/pipeline/core/util/JDBCStreamQueryUtil.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL3V0aWwvSkRCQ1N0cmVhbVF1ZXJ5VXRpbC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...nfra/util/expr/EspressoInlineExpressionParser.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aW5mcmEvdXRpbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvaW5mcmEvdXRpbC9leHByL0VzcHJlc3NvSW5saW5lRXhwcmVzc2lvblBhcnNlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ysql/authentication/MySQLAuthenticationEngine.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvbXlzcWwvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NoYXJkaW5nc3BoZXJlL3Byb3h5L2Zyb250ZW5kL215c3FsL2F1dGhlbnRpY2F0aW9uL015U1FMQXV0aGVudGljYXRpb25FbmdpbmUuamF2YQ==) | `93.75% <0.00%> (ø)` | |
   | [...sql/authentication/MySQLAuthenticationHandler.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvbXlzcWwvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NoYXJkaW5nc3BoZXJlL3Byb3h5L2Zyb250ZW5kL215c3FsL2F1dGhlbnRpY2F0aW9uL015U1FMQXV0aGVudGljYXRpb25IYW5kbGVyLmphdmE=) | `100.00% <0.00%> (ø)` | |
   | [.../representer/processor/NoneYamlTupleProcessor.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZmVhdHVyZXMvc2hhcmRpbmcvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvc2hhcmRpbmcveWFtbC9lbmdpbmUvcmVwcmVzZW50ZXIvcHJvY2Vzc29yL05vbmVZYW1sVHVwbGVQcm9jZXNzb3IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...authentication/OpenGaussAuthenticationHandler.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvb3BlbmdhdXNzL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9zaGFyZGluZ3NwaGVyZS9wcm94eS9mcm9udGVuZC9vcGVuZ2F1c3MvYXV0aGVudGljYXRpb24vT3BlbkdhdXNzQXV0aGVudGljYXRpb25IYW5kbGVyLmphdmE=) | `86.36% <0.00%> (ø)` | |
   | [...authentication/PostgreSQLAuthenticationEngine.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvcG9zdGdyZXNxbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvcHJveHkvZnJvbnRlbmQvcG9zdGdyZXNxbC9hdXRoZW50aWNhdGlvbi9Qb3N0Z3JlU1FMQXV0aGVudGljYXRpb25FbmdpbmUuamF2YQ==) | `89.47% <0.00%> (ø)` | |
   | ... and [14 more](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   




[GitHub] [shardingsphere] sandynz commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check

Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1429062727

   TODO:
   Enable streaming queries in the underlying ShardingSphereDataSource, statement, and result set.
   
   Refer to #24150 for more details.
   
   

