Posted to oak-issues@jackrabbit.apache.org by "Julian Reschke (JIRA)" <ji...@apache.org> on 2014/09/18 15:20:34 UTC

[jira] [Comment Edited] (OAK-1941) RDB: decide on table layout

    [ https://issues.apache.org/jira/browse/OAK-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066382#comment-14066382 ] 

Julian Reschke edited comment on OAK-1941 at 9/18/14 1:20 PM:
--------------------------------------------------------------

Minutes from a brainstorming with Marcel and Chetan:

The DocumentStore API essentially reflects the abilities of MongoDB. In particular:

1) A JSON document can be updated without reading it first (through native support for the UpdateOp)

2) Updates can be made conditional (again through checkConditions in the UpdateOp)

3) When an update has occurred, MongoDB can return the previous state of the document

A relational database normally doesn't have these abilities, and this will likely cause problems with the RDB persistence:

- concurrent updates may require retries; this is expensive and may keep failing when the nodes do not have exactly the same DB connectivity (CPU, network, ...)

- having to fetch a document before every update makes even simple update operations more expensive

In order to address these problems, we have to try to get closer to MongoDB's capabilities.


1) Update without having the document in memory

This can be achieved by changing the data model to one document plus a sequence of updates. In a string column, we can change the data format from "one JSON object" to "array of JSON objects", where everything but the first element is a serialized UpdateOp. Upon reading, the RDBDocumentStore would reconstruct the current document by applying the updates in order, at a small performance penalty on read. The background document splitting might take care of this (rewriting documents), or we may have to think about a cleanup task.
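
For illustration, a minimal Java sketch of that read path, assuming Jackson for JSON parsing; applyUpdate is a deliberately simplified stand-in, not the real UpdateOp semantics:

// Minimal sketch (not the Oak implementation): reconstruct the current
// document from a string column holding a JSON array whose first element
// is the base document and whose remaining elements are serialized updates.
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
import java.util.Map;

public class ReconstructSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static Map<String, Object> reconstruct(String data) throws Exception {
        List<Map<String, Object>> entries =
                MAPPER.readValue(data, new TypeReference<List<Map<String, Object>>>() {});
        Map<String, Object> doc = entries.get(0);              // base document
        for (Map<String, Object> op : entries.subList(1, entries.size())) {
            applyUpdate(doc, op);                              // replay in order
        }
        return doc;
    }

    // Simplified: treats an update as "set these keys"; a real UpdateOp also
    // covers increments, conditional sets and map sub-keys.
    private static void applyUpdate(Map<String, Object> doc, Map<String, Object> op) {
        doc.putAll(op);
    }
}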

In the case where the overflow BLOB is used, we can change the data model to "gzipped JSON doc in BLOB" plus "sequence of updates in string column".

The update operation itself might become DB-specific; for instance, the following appears to work on Postgres:

UPDATE nodes
SET size = size + 10,
    modcount = modcount + 1,
    data = data || ', true'
WHERE id = '0:/' AND size + 10 < ...;

2) Conditional Updates

The only conditions checked in practice are on the "_collisions" map, where the DocumentMK verifies that a given revision is not in the map.

(The size of the map is currently unbounded because there's no cleanup yet)

It would be tricky to store the set of revisions separately, but we *can* introduce a counter that increases every time the collisions map is changed. With that, we can make updates conditional on the collisions map being the same as a known state.

(Maintaining that information is implemented in attachment OAK-1941-cmodcount.diff)
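
For illustration, a plain-JDBC sketch of such a conditional update, assuming a "cmodcount" column as introduced by the attached patch; everything else is illustrative, not RDBDocumentStore code:

// Sketch: make the update conditional on the collisions map being unchanged
// by comparing against a previously read cmodcount. The cmodcount column is
// the one from the patch; the rest of the names are illustrative.
import java.sql.Connection;
import java.sql.PreparedStatement;

public class ConditionalUpdateSketch {
    // Returns true if the row was updated; false means the collisions map
    // changed in the meantime and the caller has to re-read and retry.
    public static boolean tryUpdate(Connection con, String id, String appendedOp,
                                    long expectedCmodcount) throws Exception {
        String sql = "UPDATE nodes"
                + " SET data = data || ?, modcount = modcount + 1"
                + " WHERE id = ? AND cmodcount = ?";
        try (PreparedStatement st = con.prepareStatement(sql)) {
            st.setString(1, appendedOp);
            st.setString(2, id);
            st.setLong(3, expectedCmodcount);
            return st.executeUpdate() == 1;    // 0 rows: condition failed
        }
    }
}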


3) Returning the previous state

Some DBs have extensions for this ("RETURNING"); others will require writing stored procedures.
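
For illustration, a JDBC sketch of the Postgres variant. Note that RETURNING yields the post-update row; with the append-only layout from 1) that is still enough, since the previous state can be derived by dropping the op just appended. Names are illustrative:

// Sketch: UPDATE ... RETURNING on Postgres saves the separate read.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ReturningSketch {
    public static String appendAndFetch(Connection con, String id, String appendedOp)
            throws Exception {
        String sql = "UPDATE nodes SET data = data || ?, modcount = modcount + 1"
                + " WHERE id = ? RETURNING data";
        try (PreparedStatement st = con.prepareStatement(sql)) {
            st.setString(1, appendedOp);
            st.setString(2, id);
            try (ResultSet rs = st.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;   // new "data" column
            }
        }
    }
}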

[Update from Oakathon discussion in Sep 2014]:

a) We might not need the "previous" state due to potential changes in the DocumentMK. To be discussed.
b) An alternative way to obtain the previous state would be to read the new state of the document, locate the UpdateOp that was added by "us", and then apply all preceding UpdateOps to the base document; this will fail if the document gets fully rewritten in the meantime (see the sketch below).
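
A rough Java sketch of alternative b), reusing the append-only layout from 1); the "_opId" marker and the simplified replay are illustrative assumptions:

// Sketch of alternative b): read the new state, locate the op "we" appended
// (identified here by an illustrative "_opId" marker embedded in the op),
// and rebuild the previous state from the base document plus all preceding ops.
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
import java.util.Map;

public class PreviousStateSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static Map<String, Object> previousState(String data, String myOpId)
            throws Exception {
        List<Map<String, Object>> entries =
                MAPPER.readValue(data, new TypeReference<List<Map<String, Object>>>() {});
        Map<String, Object> doc = entries.get(0);            // base document
        for (Map<String, Object> op : entries.subList(1, entries.size())) {
            if (myOpId.equals(op.get("_opId"))) {
                return doc;                                  // state before our op
            }
            doc.putAll(op);                                  // simplified replay
        }
        // our op is gone: the document was rewritten/compacted in the meantime
        throw new IllegalStateException("op " + myOpId + " not found");
    }
}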


Next steps:

- extract MongoDB-specific test cases that simulate concurrent updates

- try to get 1) and 3) implemented on DB2, and see how this behaves for these tests

Other considerations:

- for databases where we don't want to add DB-specific code, we may want to see whether a simple improvement to the retry logic helps; right now, when retrying, we use two transactions to read and to write back, and doing both in a single transaction with the right isolation level might already be good enough (sketched below).
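
For illustration, a plain-JDBC sketch of that single-transaction variant; the isolation level, locking and SQL are assumptions, not current RDBDocumentStore code:

// Sketch: do the read-modify-write in one transaction instead of two.
// With SERIALIZABLE (plus the row lock from SELECT ... FOR UPDATE, where
// supported) a concurrent writer either blocks or causes a retryable
// serialization failure instead of silently losing the update.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SingleTxSketch {
    public static void readModifyWrite(Connection con, String id) throws Exception {
        con.setAutoCommit(false);
        con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        try {
            String data;
            try (PreparedStatement sel = con.prepareStatement(
                    "SELECT data FROM nodes WHERE id = ? FOR UPDATE")) {
                sel.setString(1, id);
                try (ResultSet rs = sel.executeQuery()) {
                    rs.next();
                    data = rs.getString(1);
                }
            }
            String newData = applyUpdateOp(data);   // modify in memory
            try (PreparedStatement upd = con.prepareStatement(
                    "UPDATE nodes SET data = ?, modcount = modcount + 1 WHERE id = ?")) {
                upd.setString(1, newData);
                upd.setString(2, id);
                upd.executeUpdate();
            }
            con.commit();                           // read and write in one tx
        } catch (Exception e) {
            con.rollback();                         // caller retries
            throw e;
        }
    }

    private static String applyUpdateOp(String data) { return data; } // placeholder
}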



> RDB: decide on table layout
> ---------------------------
>
>                 Key: OAK-1941
>                 URL: https://issues.apache.org/jira/browse/OAK-1941
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: rdbmk
>            Reporter: Julian Reschke
>             Fix For: 1.1
>
>         Attachments: OAK-1941-cmodcount.diff
>
>
> The current approach is to serialize the Document using JSON, and then to store either (a) the full JSON in a VARCHAR column or, if that column isn't wide enough, (b) the JSON in a BLOB (optionally gzipped).
> For debugging purposes, the inline VARCHAR always gets populated with the start of the JSON serialization.
> However, with Oracle we are limited to 4000 bytes (which may be far fewer characters due to non-ASCII overhead), so many document instances will use what was initially thought to be the exception case.
> Questions:
> 1) Do we stick with JSON or do we attempt a different serialization? It might make sense with respect to both length and performance. There might also be some code to borrow from the off-heap serialization code.
> 2) Do we get rid of the "dual" strategy, and just always use the BLOB? The indirection might make things more expensive, but then the total column width would drop considerably. -- How can we do good benchmarks on this?
> (This all assumes that we stick with a model where all code is the same between database types, except for the DDL statements; of course it's also conceivable to add more vendor-specific special cases into the Java code)


