Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/07/07 14:12:10 UTC

[GitHub] [pulsar] nicoloboschi opened a new pull request, #16448: [feature][connector] JDBC sinks: support upsert and row deletion

nicoloboschi opened a new pull request, #16448:
URL: https://github.com/apache/pulsar/pull/16448

   ### Motivation
   
   Currently, the JDBC sink performs an INSERT query in most cases. The only way to execute an UPDATE or DELETE is to pass a specific property in the message properties. This is inconvenient and requires every source to set that property.
   
   The goal is to provide a way to upsert and delete records based on the record content while using key-value records.
   1. Upsert: Insert or update the record based on the key fields.
   2. Delete: if the value is NULL, delete the row with the given key.
   
   ### Modifications
   * New config option `insertMode` with values:
     * INSERT: always perform a plain insert (default, so that upgrades do not break)
     * UPDATE: always perform a plain update
     * UPSERT: use an upsert query. The main issue is that there is no SQL standard for upsert, so every sink has its own implementation (by default it throws a not-supported exception)
   * New config option `nullValueAction` with values:
     * FAIL: update the row with null values (default, so that upgrades do not break)
     * DELETE: delete the row with the given key/pk
   
   Note that, in order not to break compatibility, the message property `ACTION` takes precedence over the other logic.
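
The precedence described above can be sketched as a small decision function. This is an illustrative model only, not the connector's code: the function name `resolve_operation`, the Python enums, and the assumption that the `ACTION` property carries the operation name as a string are all hypothetical. For `FAIL`, the sketch follows the wording of the description and leaves the record to the configured insert mode; the real connector may handle that case differently.

```python
from enum import Enum
from typing import Mapping, Optional


class InsertMode(Enum):
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    UPSERT = "UPSERT"


class NullValueAction(Enum):
    FAIL = "FAIL"
    DELETE = "DELETE"


def resolve_operation(
    properties: Mapping[str, str],
    value: Optional[dict],
    insert_mode: InsertMode = InsertMode.INSERT,
    null_value_action: NullValueAction = NullValueAction.FAIL,
) -> str:
    """Return the name of the operation to run for one record (hypothetical helper)."""
    # 1. An explicit ACTION message property wins, for backward compatibility.
    #    Assumption: the property value is the operation name itself.
    action = properties.get("ACTION")
    if action is not None:
        return action.upper()
    # 2. A null value combined with nullValueAction=DELETE deletes the row.
    if value is None and null_value_action is NullValueAction.DELETE:
        return "DELETE"
    # 3. Otherwise the configured insertMode decides.
    return insert_mode.value
```

With the defaults (`INSERT` and `FAIL`) and no `ACTION` property, every record resolves to a plain insert, which matches the stated goal of not breaking upgrades.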
   
   For the upsert mode, only the following sinks support it:
   - SQLite
   - Postgres
   - MariaDB
   
   Other sinks throw an exception and the message is rejected.
   
   - Upgraded SQLite to 3.36.0.3 to support upsert
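
To make the upsert and null-value delete semantics concrete, here is a minimal sketch using Python's built-in `sqlite3` module against an in-memory database. The table `users`, its columns, and the helper `apply_record` are hypothetical, and the SQL is not necessarily what the connector generates. It assumes the bundled SQLite is 3.24.0 or newer, which is when the `ON CONFLICT ... DO UPDATE` clause was introduced.

```python
import sqlite3

# SQLite upsert: insert the row, or update it if the primary key already exists.
UPSERT_SQL = """
INSERT INTO users (id, name) VALUES (?, ?)
ON CONFLICT (id) DO UPDATE SET name = excluded.name
"""


def apply_record(conn: sqlite3.Connection, key: int, value) -> None:
    """Hypothetical helper: upsert on a non-null value, delete on a null value."""
    if value is None:
        conn.execute("DELETE FROM users WHERE id = ?", (key,))
    else:
        conn.execute(UPSERT_SQL, (key, value))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

apply_record(conn, 1, "alice")  # no row with id 1 yet: inserted
apply_record(conn, 1, "bob")    # same key again: updated in place
rows_after_upsert = conn.execute("SELECT id, name FROM users").fetchall()

apply_record(conn, 1, None)     # null value: row deleted
rows_after_delete = conn.execute("SELECT id, name FROM users").fetchall()
```

Postgres accepts the same `ON CONFLICT (id) DO UPDATE` form, while MariaDB expresses the same idea with `INSERT ... ON DUPLICATE KEY UPDATE`, which illustrates why each sink needs its own implementation.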
   
   - [x] `doc` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] github-actions[bot] commented on pull request #16448: [feature][connector] JDBC sinks: support upsert and row deletion

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #16448:
URL: https://github.com/apache/pulsar/pull/16448#issuecomment-1177689854

   @nicoloboschi Please provide a correct documentation label for your PR.
   Instructions see [Pulsar Documentation Label Guide](https://docs.google.com/document/d/1Qw7LHQdXWBW9t2-r-A7QdFDBwmZh6ytB4guwMoXHqc0).




[GitHub] [pulsar] nicoloboschi closed pull request #16448: [feature][connector] JDBC sinks: support upsert and row deletion

Posted by GitBox <gi...@apache.org>.
nicoloboschi closed pull request #16448: [feature][connector] JDBC sinks: support upsert and row deletion
URL: https://github.com/apache/pulsar/pull/16448




[GitHub] [pulsar] Anonymitaet commented on a diff in pull request #16448: [feature][connector] JDBC sinks: support upsert and row deletion

Posted by GitBox <gi...@apache.org>.
Anonymitaet commented on code in PR #16448:
URL: https://github.com/apache/pulsar/pull/16448#discussion_r916408560


##########
site2/docs/io-jdbc-sink.md:
##########
@@ -25,6 +26,8 @@ The configuration of all JDBC sink connectors has the following properties.
 | `key`       | String | false    | " " (empty string) | A comma-separated list containing the fields used in `where` condition of updating and deleting events.                  |
 | `timeoutMs` | int    | false    | 500                | The JDBC operation timeout in milliseconds.                                                                              |
 | `batchSize` | int    | false    | 200                | The batch size of updates made to the database.                                                                          |
+| `insertMode` | enum( INSERT,UPSERT,UPDATE) | false    | INSERT | If it is configured as UPSERT, the sink will use upsert semantics rather than plain INSERT/UPDATE statements. Upsert semantics refer to atomically adding a new row or updating the existing row if there is a primary key constraint violation, which provides idempotence. |
+| `nullValueAction` | enum(FAIL, DELETE) | false    | FAIL | How to handle records with null values, possible options are DELETE or FAIL. |

Review Comment:
   ```suggestion
   | `nullValueAction` | enum(FAIL, DELETE) | false    | FAIL | How to handle records with NULL values. Possible options are `DELETE` or `FAIL`. |
   ```



##########
site2/docs/io-jdbc-sink.md:
##########
@@ -25,6 +26,8 @@ The configuration of all JDBC sink connectors has the following properties.
 | `key`       | String | false    | " " (empty string) | A comma-separated list containing the fields used in `where` condition of updating and deleting events.                  |
 | `timeoutMs` | int    | false    | 500                | The JDBC operation timeout in milliseconds.                                                                              |
 | `batchSize` | int    | false    | 200                | The batch size of updates made to the database.                                                                          |
+| `insertMode` | enum( INSERT,UPSERT,UPDATE) | false    | INSERT | If it is configured as UPSERT, the sink will use upsert semantics rather than plain INSERT/UPDATE statements. Upsert semantics refer to atomically adding a new row or updating the existing row if there is a primary key constraint violation, which provides idempotence. |

Review Comment:
   ```suggestion
   | `insertMode` | enum( INSERT,UPSERT,UPDATE) | false    | INSERT | If it is configured as UPSERT, the sink uses upsert semantics rather than plain INSERT/UPDATE statements. Upsert semantics refer to atomically adding a new row or updating the existing row if there is a primary key constraint violation, which provides idempotence. |
   ```
   Write in the simple present tense as much as possible if you are covering facts that were, are, and forever shall be true. https://docs.google.com/document/d/1lc5j4RtuLIzlEYCBo97AC8-U_3Erzs_lxpkDuseU0n4/edit#bookmark=id.e8uqh1awkcnp





[GitHub] [pulsar] nicoloboschi merged pull request #16448: [feature][connector] JDBC sinks: support upsert and row deletion

Posted by GitBox <gi...@apache.org>.
nicoloboschi merged PR #16448:
URL: https://github.com/apache/pulsar/pull/16448

