Posted to commits@inlong.apache.org by "liaorui (via GitHub)" <gi...@apache.org> on 2023/03/07 08:36:56 UTC

[GitHub] [inlong] liaorui opened a new pull request, #7541: [INLONG-7539][Sort] StarRocks connector parses operation type from RowData

liaorui opened a new pull request, #7541:
URL: https://github.com/apache/inlong/pull/7541

   ### Prepare a Pull Request
   *(Change the title to match the following example)*
   
   - Title Example: [INLONG-XYZ][Component] Title of the pull request
   
   *(The following *XYZ* should be replaced by the actual [GitHub Issue](https://github.com/apache/inlong/issues) number)*
   
   - Fixes #7539
   
   ### Motivation
   
   *Explain here the context, and why you're making that change. What is the problem you're trying to solve?*
   
   1. Source CDC connectors such as MySQL and PostgreSQL pack the operation type (INSERT, DELETE, and UPDATE) into GenericRowData, but the StarRocks connector reads the RowKind of GenericRowData, which is always INSERT.
   
   The StarRocks connector should instead parse the operation type from the RowData payload in canal-json or debezium-json format.
   
   2. When `schema-update.policy` is `LOG_WITH_IGNORE` but `dirty.ignore` is false, we should not invoke the dirty helper to archive dirty data to S3.
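   As a rough illustration of the row-kind parsing described in point 1 (the class and method names below are hypothetical, not the actual connector code), a Debezium-style change event carries an `op` code that maps onto Flink's RowKind names:

   ```java
   // Hypothetical sketch: mapping a Debezium-style "op" code to a RowKind-style
   // name. The real connector parses this from RowData in canal-json or
   // debezium-json format; this standalone version only shows the mapping.
   public class OpCodeMapper {

       /** Maps Debezium op codes ("c", "r", "u", "d") to RowKind-style names. */
       public static String toRowKind(String op) {
           switch (op) {
               case "c": // create
               case "r": // snapshot read
                   return "INSERT";
               case "u": // update (after image)
                   return "UPDATE_AFTER";
               case "d": // delete
                   return "DELETE";
               default:
                   throw new IllegalArgumentException("Unknown op code: " + op);
           }
       }

       public static void main(String[] args) {
           System.out.println(toRowKind("c")); // INSERT
           System.out.println(toRowKind("d")); // DELETE
       }
   }
   ```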
   
   ### Modifications
   
   *Describe the modifications you've done.*
   
   1. The `StarRocksDynamicSinkFunction` class now parses the row kind from the row data, following the approach of `doris-connector`.
   2. The `StarRocksSinkManager` class adds one new condition: `if (dirtySinkHelper.getDirtyOptions().ignoreDirty())`. The dirty helper method is invoked only when `dirty.ignore` is true.
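   A minimal sketch of the guard described in modification 2 (the class names here are stand-ins, not the actual `StarRocksSinkManager` code): dirty-data archiving is skipped entirely unless `dirty.ignore` is enabled.

   ```java
   // Illustrative-only sketch of the dirty.ignore guard; DirtyOptions and
   // handleDirtyData are hypothetical stand-ins for the real classes in the PR.
   public class DirtyGuardSketch {

       static class DirtyOptions {
           private final boolean ignoreDirty;
           DirtyOptions(boolean ignoreDirty) { this.ignoreDirty = ignoreDirty; }
           boolean ignoreDirty() { return ignoreDirty; }
       }

       /** Returns true if dirty data would be archived, false if skipped. */
       static boolean handleDirtyData(DirtyOptions options) {
           // Guard clause: when dirty.ignore is false, do not invoke the
           // dirty helper (no archiving to external storage such as S3).
           if (!options.ignoreDirty()) {
               return false;
           }
           // ... archive dirty data here ...
           return true;
       }

       public static void main(String[] args) {
           System.out.println(handleDirtyData(new DirtyOptions(true)));  // true
           System.out.println(handleDirtyData(new DirtyOptions(false))); // false
       }
   }
   ```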
   
   
   ### Verifying this change
   
   *(Please pick either of the following options)*
   
   - [ ] This change is a trivial rework/code cleanup without any test coverage.
   
   - [ ] This change is already covered by existing tests, such as:
     *(please describe tests)*
   
   - [ ] This change added tests and can be verified as follows:
   
     *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
     - If a feature is not applicable for documentation, explain why?
     - If a feature is not documented yet in this PR, please create a follow-up issue for adding the documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@inlong.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [inlong] healchow commented on a diff in pull request #7541: [INLONG-7539][Sort] StarRocks connector parses row kind from GenericRowData

Posted by "healchow (via GitHub)" <gi...@apache.org>.
healchow commented on code in PR #7541:
URL: https://github.com/apache/inlong/pull/7541#discussion_r1128798630


##########
inlong-sort/sort-connectors/starrocks/src/main/java/org/apache/inlong/sort/starrocks/manager/StarRocksSinkManager.java:
##########
@@ -523,35 +523,37 @@ private boolean asyncFlush() throws Exception {
 
     private void handleDirtyData(SinkBufferEntity flushData, Exception e) throws JsonProcessingException {
         // archive dirty data
-        if (StarRocksSinkOptions.StreamLoadFormat.CSV.equals(sinkOptions.getStreamLoadFormat())) {
-            String columnSeparator = StarRocksDelimiterParser.parse(
-                    sinkOptions.getSinkStreamLoadProperties().get("column_separator"), "\t");
-            String[] col = flushData.getColumns().split(",");
-            int len = col.length;
-            for (byte[] row : flushData.getBuffer()) {
-                Map<String, String> jsonData = new LinkedHashMap<>(16);
-                // convert csv to json
-                String[] values = new String(row, StandardCharsets.UTF_8).split(columnSeparator);
-                for (int i = 0; i < len && i < values.length; i++) {
-                    jsonData.put(col[i], values[i]);
+        if (dirtySinkHelper.getDirtyOptions().ignoreDirty()) {

Review Comment:
   1. Does the condition here need to be reversed?
   2. Can you use guard statements to simplify the code, for example:
   ```java
   if (true) {
       return;
   }
   
   // other operations; no need for an else statement
   ```





[GitHub] [inlong] dockerzhang merged pull request #7541: [INLONG-7539][Sort] StarRocks connector parses row kind from GenericRowData

Posted by "dockerzhang (via GitHub)" <gi...@apache.org>.
dockerzhang merged PR #7541:
URL: https://github.com/apache/inlong/pull/7541

