Posted to notifications@shardingsphere.apache.org by "azexcy (via GitHub)" <gi...@apache.org> on 2023/02/13 13:41:45 UTC
[GitHub] [shardingsphere] azexcy opened a new pull request, #24146: Using streaming query at pipeline inventory dump and data consistency check
azexcy opened a new pull request, #24146:
URL: https://github.com/apache/shardingsphere/pull/24146
Changes proposed in this pull request:
- Using streaming query at pipeline inventory dump and data consistency check
---
Before committing this PR, I'm sure that I have checked the following options:
- [ ] My code follows the [code of conduct](https://shardingsphere.apache.org/community/en/involved/conduct/code/) of this project.
- [ ] I have self-reviewed the commit code.
- [ ] I have (or in comment I request) added corresponding labels for the pull request.
- [ ] I have passed maven check locally : `./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e`.
- [ ] I have made corresponding changes to the documentation.
- [ ] I have added corresponding unit tests for my changes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@shardingsphere.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [shardingsphere] sandynz merged pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz merged PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146
[GitHub] [shardingsphere] azexcy commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "azexcy (via GitHub)" <gi...@apache.org>.
azexcy commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1429051689
Increasing `MAX_CONNECTIONS_SIZE_PER_QUERY` may cause the following error, so I have removed that change.
```
java.sql.SQLException: Can not get 10 connections one time, partition succeed connection(2) have released. Please consider increasing the `maxPoolSize` of the data sources or decreasing the `max-connections-size-per-query` in properties.
```
[GitHub] [shardingsphere] sandynz commented on a diff in pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz commented on code in PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#discussion_r1105214956
##########
jdbc/core/src/main/java/org/apache/shardingsphere/driver/data/pipeline/datasource/creator/ShardingSpherePipelineDataSourceCreator.java:
##########
@@ -41,6 +42,8 @@ public DataSource createPipelineDataSource(final Object dataSourceConfig) throws
enableRangeQueryForInline(shardingRuleConfig);
rootConfig.setDatabaseName(null);
rootConfig.setSchemaName(null);
+ // TODO set a large enough value, make sure when a jdbc streaming query parameter is take effect
+ rootConfig.getProps().put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), 100000);
Review Comment:
This needs a test for the case where the data source's maximum connection count is less than `MAX_CONNECTIONS_SIZE_PER_QUERY`.
##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/util/JDBCStreamQueryUtil.java:
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.shardingsphere.data.pipeline.core.util;
+
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+
+/**
+ * JDBC stream query util.
+ */
+public final class JDBCStreamQueryUtil {
+
+ /**
+ * Generate MySQL stream query prepared statement.
+ *
+ * @param connection connection
+ * @param sql sql
+ * @return stream query prepared statement
+ * @throws SQLException SQL exception
+ */
+ public static PreparedStatement generateMySQLStreamQueryPreparedStatement(final Connection connection, final String sql) throws SQLException {
+ PreparedStatement result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
+ result.setFetchSize(Integer.MIN_VALUE);
+ return result;
+ }
+
+ /**
+ * Generate PostgreSQL stream query prepared statement.
+ *
+ * @param connection connection
+ * @param sql sql
+ * @param fetchSize fetch size
+ * @return stream query prepared statement
+ * @throws SQLException SQL exception
+ */
+ public static PreparedStatement generatePostgreSQLStreamQueryPreparedStatement(final Connection connection, final String sql, final int fetchSize) throws SQLException {
+ PreparedStatement result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
+ connection.setAutoCommit(false);
+ result.setFetchSize(fetchSize);
Review Comment:
If `fetchSize` is not required for a streaming query, it's better not to set it here.
##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/check/consistency/algorithm/DataMatchDataConsistencyCalculateAlgorithm.java:
##########
@@ -154,9 +158,18 @@ private CalculationContext createCalculationContext(final DataConsistencyCalcula
private void fulfillCalculationContext(final CalculationContext calculationContext, final DataConsistencyCalculateParameter param) throws SQLException {
String sql = getQuerySQL(param);
- PreparedStatement preparedStatement = setCurrentStatement(calculationContext.getConnection().prepareStatement(sql));
+ DatabaseType databaseType = TypedSPILoader.getService(DatabaseType.class, param.getDatabaseType());
+ PreparedStatement preparedStatement;
+ if (databaseType instanceof MySQLDatabaseType) {
+ preparedStatement = setCurrentStatement(JDBCStreamQueryUtil.generateMySQLStreamQueryPreparedStatement(calculationContext.getConnection(), sql));
+ } else if (databaseType instanceof PostgreSQLDatabaseType || databaseType instanceof OpenGaussDatabaseType) {
+ preparedStatement = setCurrentStatement(JDBCStreamQueryUtil.generatePostgreSQLStreamQueryPreparedStatement(calculationContext.getConnection(), sql, chunkSize));
+ } else {
+ log.warn("not support {} streaming query now, pay attention to memory usage", databaseType.getType());
+ preparedStatement = setCurrentStatement(calculationContext.getConnection().prepareStatement(sql));
+ preparedStatement.setFetchSize(chunkSize);
+ }
Review Comment:
Could we extract this code block into `JDBCStreamQueryUtil`? The same code block appears in `InventoryDumper`.
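The extraction suggested above could look roughly like the following. This is only a sketch under stated assumptions, not the code that was merged: the class name `StreamQuerySketch` and its methods are hypothetical, the database type is passed as a plain string instead of ShardingSphere's `DatabaseType` so the example stays self-contained, and the warning log from the fallback branch is omitted. It relies on two driver behaviors: MySQL Connector/J streams row by row for a forward-only, read-only statement whose fetch size is `Integer.MIN_VALUE`, and the PostgreSQL driver fetches through a cursor only with auto-commit off and a positive fetch size.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; names and the String-based database type are assumptions of this sketch.
public final class StreamQuerySketch {

    private StreamQuerySketch() {
    }

    /**
     * Fetch size to apply for the given database type.
     * MySQL needs Integer.MIN_VALUE to stream; other types use the caller's chunk size.
     */
    public static int resolveFetchSize(final String databaseType, final int chunkSize) {
        return "MySQL".equals(databaseType) ? Integer.MIN_VALUE : chunkSize;
    }

    /**
     * Whether auto-commit must be switched off so the driver can fetch through a cursor.
     * openGauss is treated like PostgreSQL here, mirroring the diff above.
     */
    public static boolean requiresTransaction(final String databaseType) {
        return "PostgreSQL".equals(databaseType) || "openGauss".equals(databaseType);
    }

    /**
     * Single entry point that both the consistency check and the inventory dumper could call.
     */
    public static PreparedStatement generateStreamQueryPreparedStatement(final String databaseType, final Connection connection,
                                                                         final String sql, final int chunkSize) throws SQLException {
        if (requiresTransaction(databaseType)) {
            connection.setAutoCommit(false);
        }
        // Forward-only and read-only are the JDBC defaults; they are spelled out because streaming depends on them.
        PreparedStatement result = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        result.setFetchSize(resolveFetchSize(databaseType, chunkSize));
        return result;
    }
}
```

With a helper of this shape, each call site would shrink to one line, and unsupported database types would still get a regular statement with the chunk size as a fetch size hint.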
[GitHub] [shardingsphere] azexcy commented on a diff in pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "azexcy (via GitHub)" <gi...@apache.org>.
azexcy commented on code in PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#discussion_r1105238188
##########
jdbc/core/src/main/java/org/apache/shardingsphere/driver/data/pipeline/datasource/creator/ShardingSpherePipelineDataSourceCreator.java:
##########
@@ -41,6 +42,8 @@ public DataSource createPipelineDataSource(final Object dataSourceConfig) throws
enableRangeQueryForInline(shardingRuleConfig);
rootConfig.setDatabaseName(null);
rootConfig.setSchemaName(null);
+ // TODO set a large enough value, make sure when a jdbc streaming query parameter is take effect
+ rootConfig.getProps().put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), 100000);
Review Comment:
Problems occur when the number of database connections is less than the number of slices:
```
java.sql.SQLException: Can not get 10 connections one time, partition succeed connection(2) have released. Please consider increasing the `maxPoolSize` of the data sources or decreasing the `max-connections-size-per-query` in properties.
```
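The failure can be illustrated with a small model. This is a simplified sketch and an assumption about the grouping, not a copy of ShardingSphere's implementation: it supposes that the SQL units routed to one data source (the "slices" above) are split into groups of `ceil(sqlUnits / maxConnectionsSizePerQuery)` units, and that each group needs its own connection at the same time. The class and method names are hypothetical.

```java
// Hypothetical model of how many connections one query needs on a single data source.
public final class ConnectionModeSketch {

    private ConnectionModeSketch() {
    }

    /**
     * Connections needed at once, assuming SQL units are grouped so that
     * at most maxConnectionsSizePerQuery groups exist and each group uses one connection.
     */
    public static int requiredConnections(final int sqlUnits, final int maxConnectionsSizePerQuery) {
        // Group size is ceil(sqlUnits / max), but never below 1; long arithmetic avoids overflow.
        long groupSize = Math.max(1L, ((long) sqlUnits + maxConnectionsSizePerQuery - 1) / maxConnectionsSizePerQuery);
        // Number of groups is ceil(sqlUnits / groupSize).
        return (int) ((sqlUnits + groupSize - 1) / groupSize);
    }

    /**
     * Whether the pool can hand out every required connection at the same time.
     */
    public static boolean fitsInPool(final int sqlUnits, final int maxConnectionsSizePerQuery, final int maxPoolSize) {
        return requiredConnections(sqlUnits, maxConnectionsSizePerQuery) <= maxPoolSize;
    }
}
```

Under this model, 10 SQL units with the limit set to 100000 need 10 connections at once, which a hypothetical pool of 2 cannot provide, while a limit of 1 makes the same query share a single connection.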
[GitHub] [shardingsphere] codecov-commenter commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1428114885
# [Codecov](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#24146](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ca7a316) into [master](https://codecov.io/gh/apache/shardingsphere/commit/35dd65883c486fd7de9cc8859dcd8bfa2d73dcd3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (35dd658) will **decrease** coverage by `0.04%`.
> The diff coverage is `0.00%`.
```diff
@@ Coverage Diff @@
## master #24146 +/- ##
============================================
- Coverage 50.13% 50.10% -0.04%
Complexity 1576 1576
============================================
Files 3258 3260 +2
Lines 53491 53494 +3
Branches 9834 9832 -2
============================================
- Hits 26816 26801 -15
- Misses 24312 24332 +20
+ Partials 2363 2361 -2
```
| [Impacted Files](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...eator/ShardingSpherePipelineDataSourceCreator.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-amRiYy9jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9zaGFyZGluZ3NwaGVyZS9kcml2ZXIvZGF0YS9waXBlbGluZS9kYXRhc291cmNlL2NyZWF0b3IvU2hhcmRpbmdTcGhlcmVQaXBlbGluZURhdGFTb3VyY2VDcmVhdG9yLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...hm/DataMatchDataConsistencyCalculateAlgorithm.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL2NoZWNrL2NvbnNpc3RlbmN5L2FsZ29yaXRobS9EYXRhTWF0Y2hEYXRhQ29uc2lzdGVuY3lDYWxjdWxhdGVBbGdvcml0aG0uamF2YQ==) | `26.77% <0.00%> (-1.33%)` | :arrow_down: |
| [...a/pipeline/core/ingest/dumper/InventoryDumper.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL2luZ2VzdC9kdW1wZXIvSW52ZW50b3J5RHVtcGVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...e/data/pipeline/core/util/JDBCStreamQueryUtil.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a2VybmVsL2RhdGEtcGlwZWxpbmUvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvZGF0YS9waXBlbGluZS9jb3JlL3V0aWwvSkRCQ1N0cmVhbVF1ZXJ5VXRpbC5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...nfra/util/expr/EspressoInlineExpressionParser.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aW5mcmEvdXRpbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvaW5mcmEvdXRpbC9leHByL0VzcHJlc3NvSW5saW5lRXhwcmVzc2lvblBhcnNlci5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...ysql/authentication/MySQLAuthenticationEngine.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvbXlzcWwvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NoYXJkaW5nc3BoZXJlL3Byb3h5L2Zyb250ZW5kL215c3FsL2F1dGhlbnRpY2F0aW9uL015U1FMQXV0aGVudGljYXRpb25FbmdpbmUuamF2YQ==) | `93.75% <0.00%> (ø)` | |
| [...sql/authentication/MySQLAuthenticationHandler.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvbXlzcWwvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NoYXJkaW5nc3BoZXJlL3Byb3h5L2Zyb250ZW5kL215c3FsL2F1dGhlbnRpY2F0aW9uL015U1FMQXV0aGVudGljYXRpb25IYW5kbGVyLmphdmE=) | `100.00% <0.00%> (ø)` | |
| [.../representer/processor/NoneYamlTupleProcessor.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZmVhdHVyZXMvc2hhcmRpbmcvY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvc2hhcmRpbmcveWFtbC9lbmdpbmUvcmVwcmVzZW50ZXIvcHJvY2Vzc29yL05vbmVZYW1sVHVwbGVQcm9jZXNzb3IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...authentication/OpenGaussAuthenticationHandler.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvb3BlbmdhdXNzL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9zaGFyZGluZ3NwaGVyZS9wcm94eS9mcm9udGVuZC9vcGVuZ2F1c3MvYXV0aGVudGljYXRpb24vT3BlbkdhdXNzQXV0aGVudGljYXRpb25IYW5kbGVyLmphdmE=) | `86.36% <0.00%> (ø)` | |
| [...authentication/PostgreSQLAuthenticationEngine.java](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHJveHkvZnJvbnRlbmQvcG9zdGdyZXNxbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2hhcmRpbmdzcGhlcmUvcHJveHkvZnJvbnRlbmQvcG9zdGdyZXNxbC9hdXRoZW50aWNhdGlvbi9Qb3N0Z3JlU1FMQXV0aGVudGljYXRpb25FbmdpbmUuamF2YQ==) | `89.47% <0.00%> (ø)` | |
| ... and [14 more](https://codecov.io/gh/apache/shardingsphere/pull/24146?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
[GitHub] [shardingsphere] sandynz commented on pull request #24146: Using streaming query at pipeline inventory dump and data consistency check
Posted by "sandynz (via GitHub)" <gi...@apache.org>.
sandynz commented on PR #24146:
URL: https://github.com/apache/shardingsphere/pull/24146#issuecomment-1429062727
TODO:
Enable streaming query in the underlying ShardingSphereDataSource, statement and result set.
Refer to #24150 for more details.