You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Philipp Suter (JIRA)" <ji...@apache.org> on 2016/02/19 15:58:19 UTC
[jira] [Updated] (OAK-4028) Use UPSERT (i.e. MERGE) for RDB to
limit roundtrips on writing
[ https://issues.apache.org/jira/browse/OAK-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philipp Suter updated OAK-4028:
-------------------------------
Description:
All databases offer some kind of UPSERT functionality. Either as MERGE [] or INSERT .. ON CONFLICT UPDATE. Such statements could be used to prevent an additional roundtrip in case an INSERT was not successful and needed an actual UPDATE.
Problem: Distinguish if the DB actually did an INSERT or UPDATE for each statement. This is done differently for each supported database but possible, except most likely for Oracle:
MySQL: http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
- 1 for insert, 2 for update, 0 for update but unchanged
MS SQL: https://msdn.microsoft.com/en-us/library/bb510625.aspx
- See example 'C'
- Use the OUTPUT variable $action
- Needs some testing with JDBC
Oracle: https://community.oracle.com/thread/3803485
- Don't use merge
DB2: http://www-01.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.apsg/src/tpc/db2z_selectvaluesmerge.dita?lang=en
- SELECT FROM MERGE
- use a column to identify if insert or update was done, e.g. createdDate or similar (Modified?)
- Needs some testing with JDBC
Postgres: http://www.postgresql.org/docs/current/static/sql-insert.html
- INSERT ... ON CONFLICT UPDATE ... RETURNING ...
- Similar to DB2: Use a DB field to separate freshly inserted values from updated ones
Ideally this is first tested with one database, e.g. MySQL due to the very convenient syntax in MySQL. If a difference in performance can be measured it could be implemented across the board.
was:
All databases offer some kind of UPSERT functionality. Either as MERGE [] or INSERT .. ON CONFLICT UPDATE. Such statements could be used to prevent an additional roundtrip in case an INSERT was not successful and needed an actual UPDATE.
Problem: Distinguish if the DB actually did an INSERT or UPDATE for each statement. This is done differently for each supported database but possible, except for Oracle:
MySQL:
http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
- 1 for insert, 2 for update, 0 for update but unchanged
MS SQL:
https://msdn.microsoft.com/en-us/library/bb510625.aspx
- See example 'C'
- Use the OUTPUT variable $action
- Needs some testing with JDBC
Oracle:
https://community.oracle.com/thread/3803485
- Don't use merge
DB2:
http://www-01.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.apsg/src/tpc/db2z_selectvaluesmerge.dita?lang=en
- SELECT FROM MERGE
- use a column to identify if insert or update was done, e.g. createdDate or similar (Modified?)
- Needs some testing with JDBC
Postgres
http://www.postgresql.org/docs/current/static/sql-insert.html
- INSERT ... ON CONFLICT UPDATE ... RETURNING ...
- Similar to DB2: Use a DB field to separate freshly inserte values from updated ones
Ideally this is first tested with one database, e.g. MySQL due to the very convenient syntax in MySQL. If a difference in performance can be measured it could be implemented across the board.
> Use UPSERT (i.e. MERGE) for RDB to limit roundtrips on writing
> --------------------------------------------------------------
>
> Key: OAK-4028
> URL: https://issues.apache.org/jira/browse/OAK-4028
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk, rdbmk
> Reporter: Philipp Suter
>
> All databases offer some kind of UPSERT functionality. Either as MERGE [] or INSERT .. ON CONFLICT UPDATE. Such statements could be used to prevent an additional roundtrip in case an INSERT was not successful and needed an actual UPDATE.
> Problem: Distinguish if the DB actually did an INSERT or UPDATE for each statement. This is done differently for each supported database but possible, except most likely for Oracle:
> MySQL: http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> - 1 for insert, 2 for update, 0 for update but unchanged
>
> MS SQL: https://msdn.microsoft.com/en-us/library/bb510625.aspx
> - See example 'C'
> - Use the OUTPUT variable $action
> - Needs some testing with JDBC
>
> Oracle: https://community.oracle.com/thread/3803485
> - Don't use merge
> DB2: http://www-01.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.apsg/src/tpc/db2z_selectvaluesmerge.dita?lang=en
> - SELECT FROM MERGE
> - use a column to identify if insert or update was done, e.g. createdDate or similar (Modified?)
> - Needs some testing with JDBC
> Postgres: http://www.postgresql.org/docs/current/static/sql-insert.html
> - INSERT ... ON CONFLICT UPDATE ... RETURNING ...
> - Similar to DB2: Use a DB field to separate freshly inserted values from updated ones
> Ideally this is first tested with one database, e.g. MySQL due to the very convenient syntax in MySQL. If a difference in performance can be measured it could be implemented across the board.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)