Posted to oak-issues@jackrabbit.apache.org by "Tomek Rękawek (JIRA)" <ji...@apache.org> on 2015/11/04 10:19:27 UTC
[jira] [Issue Comment Deleted] (OAK-2106) Optimize reads from secondaries

     [ https://issues.apache.org/jira/browse/OAK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomek Rękawek updated OAK-2106:
-------------------------------
    Comment: was deleted

(was: {quote}Let's say the estimator measures a lag of 2 seconds at time T. That is, secondaries have synced up to T-2s. At T+5s the secondaries still lag behind at T-2s.{quote}

Let S be the secondary optime, P the primary optime, and T the current time. The lag is measured as S-P, not S-T. This should avoid the case in which the lag is large but we happen to measure it right after some operation has been applied.

If we want to make it more reliable, we can keep e.g. the last 10 values and return the largest one.
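
A minimal sketch of that idea, in case it helps the discussion. The class and method names are hypothetical and not part of Oak; the window size is a constructor argument, and the sample is stored as the non-negative magnitude of the difference between the two optimes (an assumption, so that clock skew cannot produce a negative lag).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not an Oak class): keeps the last N lag samples and reports the largest. */
final class ReplicationLagEstimator {
    private final int windowSize;
    private final Deque<Long> samples = new ArrayDeque<>();

    ReplicationLagEstimator(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Records one sample as the gap between primary and secondary optime, clamped at zero. */
    synchronized void addSample(long primaryOptimeMillis, long secondaryOptimeMillis) {
        long lag = Math.max(0L, primaryOptimeMillis - secondaryOptimeMillis);
        if (samples.size() == windowSize) {
            samples.removeFirst(); // drop the oldest sample
        }
        samples.addLast(lag);
    }

    /** Returns the largest lag in the window, or 0 if nothing has been recorded yet. */
    synchronized long getEstimatedLagMillis() {
        long max = 0L;
        for (long sample : samples) {
            max = Math.max(max, sample);
        }
        return max;
    }
}
```

Returning the maximum rather than the average is deliberately pessimistic: a single lucky measurement cannot make the secondary look fresher than it recently was.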

{quote}I'm also a bit concerned about introducing a dependency from MongoDocumentStore to classes like UnmergedBranches and UnsavedModifications.
I would rather like to see a solution where the client of the DocumentStore can express how fresh the document needs to be when it reads from the store.{quote}

It concerns me as well (as this is a kind of circular dependency), but I wasn't able to find anything better. Access to the unmerged branches is necessary so that we won't ask the secondary about a path belonging to a branch. This doesn't depend on time, as a user may modify many nodes (which will result in creating a branch) and keep the changes unmerged for a very long time.

The situation looks a bit different with the UnsavedModifications, as they are saved on a regular basis ({{asyncDelay}}): we can add this value to the estimated lag to be sure that the background update thread has run and the changes have been replicated.
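
Roughly, the check could look like the sketch below. It is only an illustration under assumptions: the class and method names are made up, all values are in milliseconds, and the document's last modification time is assumed to be comparable with the local clock.

```java
/** Hypothetical check (not an Oak class): is a document old enough to be read from a secondary? */
final class SecondaryReadCheck {
    private SecondaryReadCheck() {
    }

    /**
     * Returns true if the document was last modified longer ago than the
     * estimated replication lag plus the background update interval (asyncDelay).
     */
    static boolean isSafeToReadFromSecondary(long lastModifiedMillis,
                                             long nowMillis,
                                             long estimatedLagMillis,
                                             long asyncDelayMillis) {
        long safetyMargin = estimatedLagMillis + asyncDelayMillis;
        return nowMillis - lastModifiedMillis > safetyMargin;
    }
}
```

This covers only the UnsavedModifications case; unmerged branches still need a separate check because they are not bounded by time.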

{quote}I would rather like to see a solution where the client of the DocumentStore can express how fresh the document needs to be when it reads from the store. I think this also means the decision whether a read can be directed to a secondary must not depend on the lag as a duration, but should rather calculate a time when it is safe to read from a secondary.{quote}

We can take the {{find(maxCacheAge)}} parameter into consideration in {{getMongoReadPreference}}; however, it doesn't solve the issue with the unmerged branches.

{quote}The tricky part here is how to handle time differences on the machines where the Oak cluster nodes are running and the MongoDB replica set. Each change on a document is associated with a revision, where the timestamp of the revision is tied to the local clock where the revision was created. The oplog timestamp on the other hand is derived from the primary replica set member clock, I assume.{quote}

The replica set status is taken from the primary. For each secondary member we have 3 times available:

* optime - secondary time of the last operation applied,
* lastHeartbeat - secondary time of the last heartbeat sent,
* lastHeartbeatRecv - primary time of the last heartbeat received.

The primary member provides:

* optime,
* current timestamp.

As stated above, I estimate the lag by subtracting the primary optime from the secondary optime. These two times come from different machines, and therefore clock differences will make the estimate less accurate.
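
To make the optime-based estimate concrete, here is a small sketch that works on an already parsed replica set status. The {{MemberStatus}} type and the method name are hypothetical, and taking the worst secondary is an assumption on my side, since a read may be routed to any of them.

```java
import java.util.List;

/** Hypothetical holder for the per-member values mentioned above (not a MongoDB driver type). */
final class MemberStatus {
    final String name;
    final long optimeMillis;

    MemberStatus(String name, long optimeMillis) {
        this.name = name;
        this.optimeMillis = optimeMillis;
    }
}

/** Hypothetical helper: optime-based lag of the slowest secondary. */
final class ReplicaSetLag {
    private ReplicaSetLag() {
    }

    /** Returns the largest gap between the primary optime and any secondary optime, never negative. */
    static long maxOptimeLagMillis(long primaryOptimeMillis, List<MemberStatus> secondaries) {
        long max = 0L;
        for (MemberStatus secondary : secondaries) {
            long lag = primaryOptimeMillis - secondary.optimeMillis;
            max = Math.max(max, lag); // a negative lag (clock skew) is treated as zero
        }
        return max;
    }
}
```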

The other way of measuring the lag would be to compare lastHeartbeatRecv with the current timestamp. These two times come from the same machine (the primary). This tells us how often the secondary asks for changes, but not how long it takes to apply them. Maybe the first thing is more important - if so, I can change the estimation method.)

> Optimize reads from secondaries
> -------------------------------
>
>                 Key: OAK-2106
>                 URL: https://issues.apache.org/jira/browse/OAK-2106
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>              Labels: performance, scalability
>
> OAK-1645 introduced support for reads from secondaries under certain
> conditions. The current implementation checks the _lastRev on a potentially
> cached parent document and reads from a secondary if it has not been
> modified in the last 6 hours. This timespan is somewhat arbitrary but
> reflects the assumption that the replication lag of a secondary shouldn't
> be more than 6 hours.
> This logic should be optimized to take the actual replication lag into
> account. MongoDB provides information about the replication lag with
> the command rs.status().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)