Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/05/08 15:17:46 UTC

[GitHub] [incubator-seatunnel] kalencaya opened a new issue, #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

kalencaya opened a new issue, #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829

   ### Search before asking
   
   - [X] I have searched the [feature](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) issues and found no similar feature request.
   
   
   ### Description
   
   A typical `JdbcSink` config looks like this:
   ```
   JdbcSink {
       source_table_name = fake
       driver = com.mysql.jdbc.Driver
       url = "jdbc:mysql://localhost/test"
       username = root
       query = "insert into test(name,age) values(?,?)"
       batch_size = 2
   }
   ```
   The query `insert into test(name,age) values(?,?)` requires the `Row` fields to be in exactly the order `name, age`, because `JdbcUtils#setRecordToStatement` binds them by position, as the sink implementation shows:
   ```
       public DataStreamSink<Row> outputStream(FlinkEnvironment env, DataStream<Row> dataStream) {
           Table table = env.getStreamTableEnvironment().fromDataStream(dataStream);
           TypeInformation<?>[] fieldTypes = table.getSchema().getFieldTypes();
           int[] types = Arrays.stream(fieldTypes).mapToInt(JdbcTypeUtil::typeInformationToSqlType).toArray();
           SinkFunction<Row> sink = org.apache.flink.connector.jdbc.JdbcSink.sink(
               query,
               (st, row) -> JdbcUtils.setRecordToStatement(st, types, row), // the key reason: fields are bound by position
               JdbcExecutionOptions.builder()
                   .withBatchSize(batchSize)
                   .withBatchIntervalMs(batchIntervalMs)
                   .withMaxRetries(maxRetries)
                   .build(),
               new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                   .withUrl(dbUrl)
                   .withDriverName(driverName)
                   .withUsername(username)
                   .withPassword(password)
                   .build());
   
           if (config.hasPath(PARALLELISM)) {
               return dataStream.addSink(sink).setParallelism(config.getInt(PARALLELISM));
           }
           return dataStream.addSink(sink);
       }
   ```
   The query should support named parameter placeholders, such as `insert into test(name,age) values(#{name},#{age})`. This would be better for two reasons:
   * more concise semantics
   * JDBC upsert support: the `insert into xxx on duplicate xxx` and `replace into` syntax will work
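
As a rough illustration of the proposal above, the sketch below rewrites `#{field}` placeholders into the `?` markers that `PreparedStatement` expects and records which row field each marker should read. `NamedQuery`, `parse`, and the regular expression are hypothetical names chosen for this example, not existing SeaTunnel or Flink APIs, and the sketch assumes placeholders never appear inside SQL string literals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of SeaTunnel or Flink. */
final class NamedQuery {
    // Matches #{fieldName}, allowing optional whitespace inside the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("#\\{\\s*(\\w+)\\s*\\}");

    /** The query with every #{field} replaced by a positional ? marker. */
    final String sql;
    /** For the i-th ? marker, the index of the row field to bind. */
    final int[] fieldIndexes;

    private NamedQuery(String sql, int[] fieldIndexes) {
        this.sql = sql;
        this.fieldIndexes = fieldIndexes;
    }

    /**
     * @param query      SQL containing #{field} placeholders
     * @param fieldNames row field names, in row order
     */
    static NamedQuery parse(String query, String[] fieldNames) {
        List<String> names = Arrays.asList(fieldNames);
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(query);
        StringBuffer rewritten = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            int index = names.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown field in placeholder: " + name);
            }
            indexes.add(index);
            matcher.appendReplacement(rewritten, "?");
        }
        matcher.appendTail(rewritten);
        int[] fieldIndexes = indexes.stream().mapToInt(Integer::intValue).toArray();
        return new NamedQuery(rewritten.toString(), fieldIndexes);
    }
}
```

Because a field may appear more than once, this also covers the upsert case: for row fields `{"age", "name"}` and the query `insert into test(name,age) values(#{name},#{age}) on duplicate key update age=#{age}`, `sql` ends with `values(?,?) on duplicate key update age=?` and `fieldIndexes` is `[1, 0, 0]`.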
   
   
   ### Usage Scenario
   
   flink-connector-jdbc supports upsert
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] ruanwenjun commented on issue #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on issue #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829#issuecomment-1123110254

   > 1: `insert into test(name,age) values(#{name},#{age})` has more concise semantics.

   I don't think so. I think `insert into test(name,age) values(?,?)` is easier, and it is standard SQL.

   > 2: JDBC upsert support

   This is interesting, but I would like to know how it works. I guess it will only be able to upsert within the current JDBC batch?
   




[GitHub] [incubator-seatunnel] kalencaya commented on issue #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

Posted by GitBox <gi...@apache.org>.
kalencaya commented on issue #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829#issuecomment-1123211811

   I'm confused about the meaning of `the current jdbc batch`.
   Actually, I have added a transform operator on the DataStream or DataSet that maps Row(field1, field2) to Row(field1, field2, field1, field2), depending on the `#{}` placeholders. I think it will always work.
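
The mapping described above could look roughly like the sketch below. It uses plain arrays so that it runs without Flink on the classpath; in a real operator the values would presumably be read with `Row#getField(int)` and repacked with `Row.of(...)`. `RowProjection` and its methods are hypothetical names, and `fieldIndexes` is assumed to hold, for each `?` in the rewritten query, the index of the source field to bind.

```java
/** Hypothetical helper for illustration only; not part of SeaTunnel or Flink. */
final class RowProjection {
    private RowProjection() {
    }

    /** Builds the widened row: one value per ? marker, in marker order. */
    static Object[] projectFields(Object[] fields, int[] fieldIndexes) {
        Object[] projected = new Object[fieldIndexes.length];
        for (int i = 0; i < fieldIndexes.length; i++) {
            projected[i] = fields[fieldIndexes[i]];
        }
        return projected;
    }

    /** Applies the same reordering to the java.sql.Types codes of the fields. */
    static int[] projectTypes(int[] sqlTypes, int[] fieldIndexes) {
        int[] projected = new int[fieldIndexes.length];
        for (int i = 0; i < fieldIndexes.length; i++) {
            projected[i] = sqlTypes[fieldIndexes[i]];
        }
        return projected;
    }
}
```

The SQL type array passed to `JdbcUtils.setRecordToStatement` needs the same reordering as the row, because that call pairs types and fields by position.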




[GitHub] [incubator-seatunnel] BenJFan commented on issue #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

Posted by GitBox <gi...@apache.org>.
BenJFan commented on issue #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829#issuecomment-1127170032

   I agree with @kalencaya. The `(?, ?, ?)` style is not clear in a SeaTunnel config file when creating a data flow. A `#{}` placeholder would be a good choice. But the `(?, ?, ?)` style should work too.




[GitHub] [incubator-seatunnel] kalencaya commented on issue #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

Posted by GitBox <gi...@apache.org>.
kalencaya commented on issue #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829#issuecomment-1123210109

   `insert into test(field1, field2, field3....) values (?,?,?...)` is written for `PreparedStatement`, not for users. `insert into test(field1, field2, field3....) values (#{field1}, #{field2}, #{field3}, ...)` is a more direct way for users to define how fields are updated.




[GitHub] [incubator-seatunnel] kalencaya commented on issue #1829: [Feature][flink-connector-jdbc] JdbcSink query sql supports param placeholder replacing row index.

Posted by GitBox <gi...@apache.org>.
kalencaya commented on issue #1829:
URL: https://github.com/apache/incubator-seatunnel/issues/1829#issuecomment-1127199372

   > I agree with @kalencaya. The `(?, ?, ?)` style is not clear in a SeaTunnel config file when creating a data flow. A `#{}` placeholder would be a good choice. But the `(?, ?, ?)` style should work too.
   
   Hmm, so we should try to stay compatible with the existing `(?, ?, ?)` style? Is that necessary?

