Posted to oak-issues@jackrabbit.apache.org by "Marcel Reutegger (JIRA)" <ji...@apache.org> on 2018/05/03 10:19:00 UTC

[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

    [ https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462246#comment-16462246 ] 

Marcel Reutegger commented on OAK-6087:
---------------------------------------

bq. if I understand this correctly, this means that we might pick a lagging nearest secondary which can block our query until it catches up to our client session?

Yes, this is indeed possible, and we should probably have a way to detect it. We could configure the MongoDocumentStore with a write concern that includes the nearest secondary. This would ensure that each write performed by a MongoDocumentStore is immediately available on that secondary. However, this only shifts the performance penalty to the write operation.
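
To illustrate the trade-off, here is a minimal sketch in plain Java. It does not use the MongoDB driver or MongoDocumentStore; ToyMember, ToyStore and the waitForSecondary flag are hypothetical names invented for this illustration, and replication is reduced to copying pending writes on demand. How this maps onto an actual MongoDB write concern is not shown.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy replica set member: an in-memory map from document id to value. */
class ToyMember {
    private final Map<String, String> docs = new HashMap<>();

    void apply(String id, String value) {
        docs.put(id, value);
    }

    String read(String id) {
        return docs.get(id);
    }
}

/** Hypothetical store: writes go to the primary, reads to the nearest secondary. */
class ToyStore {
    private final ToyMember primary = new ToyMember();
    private final ToyMember nearestSecondary = new ToyMember();
    // Writes applied on the primary that the secondary has not seen yet.
    private final Map<String, String> unreplicated = new LinkedHashMap<>();
    private final boolean waitForSecondary;

    ToyStore(boolean waitForSecondary) {
        this.waitForSecondary = waitForSecondary;
    }

    void write(String id, String value) {
        primary.apply(id, value);
        unreplicated.put(id, value);
        if (waitForSecondary) {
            // The write only returns once the secondary has it,
            // so the replication delay is paid by the writer.
            replicate();
        }
    }

    /** Simulates the secondary catching up with the primary. */
    void replicate() {
        unreplicated.forEach(nearestSecondary::apply);
        unreplicated.clear();
    }

    String readFromSecondary(String id) {
        return nearestSecondary.read(id);
    }
}

class WriteConcernSketch {
    public static void main(String[] args) {
        ToyStore lagging = new ToyStore(false);
        lagging.write("1:/foo", "v1");
        // Secondary has not caught up yet, so the read misses the write.
        System.out.println(lagging.readFromSecondary("1:/foo")); // prints "null"

        ToyStore acked = new ToyStore(true);
        acked.write("1:/foo", "v1");
        // The write waited for the secondary, so the read sees it.
        System.out.println(acked.readFromSecondary("1:/foo")); // prints "v1"
    }
}
```

In the second case the reader never blocks, but every write() call carries the replication cost, which is the penalty mentioned above.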

bq. Imo, we should see that we still are ok perf wise!?

Agreed. The tests I've done so far show a slight impact on performance when everything is on the same machine. However, depending on the deployment and how close a secondary is to Oak, performance may even improve, e.g. because latency to the secondary is lower than to the primary.

I'll perform some more tests and report the results. 

> Avoid reads from MongoDB primary
> --------------------------------
>
>                 Key: OAK-6087
>                 URL: https://issues.apache.org/jira/browse/OAK-6087
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: mongomk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>            Priority: Major
>              Labels: scalability
>         Attachments: OAK-6087.patch
>
>
> With OAK-2106 Oak now attempts to read from a MongoDB secondary when it detects the requested data is available on the secondary.
> When multiple Oak cluster nodes are deployed on a MongoDB replica set, many reads are still directed to the primary. One reason seen in practice is that observers and JCR event listeners are triggered rather soon after a change happens and therefore read recently modified documents. This makes it difficult for Oak to direct calls to a nearby secondary, because changes may not yet be available there.
> A rather simple solution for the observers may be to delay processing of changes until they are available on the near secondary.
> A more sophisticated solution discussed offline could hide the replica set entirely and always read from the nearest secondary. Writes would obviously still go to the primary, but only return once the write is also available on the nearest secondary. This guarantees that any subsequent read is able to see the preceding write.
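
The simpler option from the description, delaying observer processing until changes are available on the secondary, could be sketched roughly as follows. This is not Oak code: ToyChange, DelayedObserverQueue and the numeric revision are hypothetical, and the sketch assumes changes are enqueued in revision order and that the caller somehow knows up to which revision the secondary has replicated.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical change record: a revision number plus the affected path. */
class ToyChange {
    final long revision;
    final String path;

    ToyChange(long revision, String path) {
        this.revision = revision;
        this.path = path;
    }
}

/** Holds back changes until the secondary has replicated their revision. */
class DelayedObserverQueue {
    // Assumption: changes arrive in ascending revision order.
    private final Queue<ToyChange> pending = new ArrayDeque<>();

    void enqueue(ToyChange change) {
        pending.add(change);
    }

    /** Returns the changes whose revision is already visible on the secondary. */
    List<ToyChange> drainVisible(long secondaryRevision) {
        List<ToyChange> ready = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().revision <= secondaryRevision) {
            ready.add(pending.poll());
        }
        return ready;
    }

    int pendingCount() {
        return pending.size();
    }
}

class DelayedObserverSketch {
    public static void main(String[] args) {
        DelayedObserverQueue queue = new DelayedObserverQueue();
        queue.enqueue(new ToyChange(1, "/content/a"));
        queue.enqueue(new ToyChange(3, "/content/b"));

        // Secondary has replicated up to revision 2: only the first change is delivered.
        for (ToyChange c : queue.drainVisible(2)) {
            System.out.println("deliver r" + c.revision + " " + c.path);
        }
        System.out.println("still pending: " + queue.pendingCount());
    }
}
```

Observers that read only after drainVisible() hands them a change can be served from the secondary, at the cost of events arriving later than they would today.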



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)