Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2015/03/02 10:43:06 UTC

[jira] [Updated] (LUCENE-5438) add near-real-time replication

     [ https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5438:
---------------------------------------
    Attachment: LUCENE-5438.patch

Here's a new patch based on trunk: recent progress with IDs and
checksums made things much easier here because now I can reliably
identify files across nodes.

This is still just a proof-of-concept test case and still has many
nocommits, but I believe the core idea is finally working.

The test randomly starts N nodes, each with its own filesystem
directory.  One node is the primary, and the rest are replicas.  An
indexing thread adds docs to the primary, and periodically a new NRT
point is flushed and replicated out.
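
Roughly, the primary's refresh loop looks like this (a sketch; names
like Primary/Replica and their methods are hypothetical, not the
patch's actual classes):

    // Primary: periodically flush a new NRT point (no commit!) and push
    // the new files to each replica.  All names here are hypothetical.
    while (indexingActive) {
      Thread.sleep(refreshIntervalMs);
      SegmentInfos infos = primary.flushAndIncRef();    // new NRT point
      for (Replica replica : replicas) {
        replica.copyMissingFiles(infos);                // only files it lacks
        replica.openSearcher(infos.getVersion());       // same version cluster-wide
      }
      primary.decRef(infos);                            // old point now deletable
    }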

The test is quite evil: most of MockDirectoryWrapper's (MDW's) checks
are left enabled, e.g. the virus checker sometimes prevents deletion
of files, unref'd files at close are caught, and double-writes to the
same filename are caught (except after a crash).  It uses
RandomIndexWriter (RIW).  The replicas have random rate limiters, so
some nodes are fast to copy and others are slow.  Replicas are
randomly crashed or gracefully closed, and restarted.  The primary is
randomly crashed/closed, and a replica is promoted as the new primary.
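
For the slow-replica simulation, Lucene's existing RateLimiter can
throttle the byte copy; a minimal sketch (the copyFile method itself
is illustrative, not code from the patch):

    import java.io.IOException;
    import org.apache.lucene.store.*;

    // Copy one file from src to dest, throttled to mbPerSec.
    void copyFile(Directory src, Directory dest, String name, double mbPerSec)
        throws IOException {
      RateLimiter limiter = new RateLimiter.SimpleRateLimiter(mbPerSec);
      byte[] buffer = new byte[16384];
      try (IndexInput in = src.openInput(name, IOContext.DEFAULT);
           IndexOutput out = dest.createOutput(name, IOContext.DEFAULT)) {
        long left = in.length();
        while (left > 0) {
          int chunk = (int) Math.min(buffer.length, left);
          in.readBytes(buffer, 0, chunk);
          out.writeBytes(buffer, chunk);
          limiter.pause(chunk);   // sleeps as needed to honor mbPerSec
          left -= chunk;
        }
      }
    }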

A file is considered the "same" across primary and replica if its file
name is the same, its full byte[] index header and footer are
identical, and it's in a "known to be fsync'd" state (e.g. not an
unref'd file on init of replica/primary).
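
Concretely, the byte comparison might look like this (a simplified
sketch: headerLen is passed in here, while the patch works from the
new per-segment IDs and the exact header lengths):

    import java.io.IOException;
    import java.util.Arrays;
    import org.apache.lucene.codecs.CodecUtil;
    import org.apache.lucene.store.*;

    // Two same-named files match if their raw index header and footer
    // bytes are identical.  Sketch only.
    boolean sameFile(Directory d1, Directory d2, String name, int headerLen)
        throws IOException {
      return Arrays.equals(readEdges(d1, name, headerLen),
                           readEdges(d2, name, headerLen));
    }

    byte[] readEdges(Directory dir, String name, int headerLen) throws IOException {
      try (IndexInput in = dir.openInput(name, IOContext.READONCE)) {
        int footerLen = CodecUtil.footerLength();   // fixed-size codec footer
        byte[] edges = new byte[headerLen + footerLen];
        in.readBytes(edges, 0, headerLen);          // header, includes unique ID
        in.seek(in.length() - footerLen);
        in.readBytes(edges, headerLen, footerLen);  // footer, includes checksum
        return edges;
      }
    }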

A searching thread asserts that each searcher version always has the
same hit count across all nodes, and that there is no data loss.
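
Per version, the assertion amounts to something like this (Node and
searcherByVersion are hypothetical harness names):

    // Every node that has this searcher version must see the same hits.
    void assertSameHitCount(long version, Query query, List<Node> nodes)
        throws IOException {
      int expected = -1;
      for (Node node : nodes) {
        IndexSearcher s = node.searcherByVersion(version);  // null if not copied yet
        if (s == null) continue;
        int hits = s.search(query, 1).totalHits;
        if (expected == -1) expected = hits;
        assert hits == expected : "version=" + version + " node=" + node;
      }
    }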

I also added a simplistic/restricted transaction log (it sits
logically outside of/above the cluster, not per-node) to show how NRT
points can be correlated back to locations in the xlog and used to
replay indexing events after a primary crash so no indexed documents
are lost.
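
Recovery then looks conceptually like this (the xlog API shown is a
hypothetical sketch of the test's simple log):

    // Promote a replica, then replay all indexing events recorded after
    // the last NRT point that replica had successfully copied.
    long version = newPrimary.getCurrentSearcherVersion();   // hypothetical
    long location = xlog.locationForVersion(version);        // recorded per NRT point
    for (IndexEvent event : xlog.replayFrom(location)) {
      event.applyTo(newPrimary.getWriter());                 // re-index lost docs
    }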

Versions are consistent across replicas, so if a follow-on search
needs a specific searcher version, you can use any replica that has
that version and be guaranteed it is searching the same point-in-time.
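
E.g. a front end could route follow-on requests (paging, drill-down)
by version alone; a sketch with hypothetical names:

    // Any replica that still has the version exposes the identical
    // point-in-time view, so the first match is as good as any other.
    Node pickNode(long version, List<Node> nodes) {
      for (Node node : nodes) {
        if (node.hasSearcherVersion(version)) {
          return node;
        }
      }
      return null;  // version pruned everywhere; caller must re-search
    }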

I would love to get this into Jenkins soon, but one big problem here
is that I had to open up all sorts of ridiculous APIs in
IndexWriter/SegmentInfos ... I have to think about how to minimize
this.


> add near-real-time replication
> ------------------------------
>
>                 Key: LUCENE-5438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5438
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/replicator
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-5438.patch, LUCENE-5438.patch, LUCENE-5438.patch
>
>
> Lucene's replication module makes it easy to incrementally sync index
> changes from a master index to any number of replicas, and it
> handles/abstracts all the underlying complexity of holding a
> time-expiring snapshot, finding which files need copying, syncing more
> than one index (e.g., taxo + index), etc.
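>
> For reference, typical use of that module today looks roughly like
> this (a sketch; handler/callback details abbreviated):
>
>     import java.io.IOException;
>     import java.nio.file.Path;
>     import org.apache.lucene.index.IndexWriter;
>     import org.apache.lucene.replicator.*;
>     import org.apache.lucene.store.Directory;
>
>     // Publisher: after each commit, publish a revision.  The writer
>     // must use SnapshotDeletionPolicy for IndexRevision to work.
>     void publishCommit(LocalReplicator replicator, IndexWriter writer)
>         throws IOException {
>       writer.commit();                                // required today
>       replicator.publish(new IndexRevision(writer));  // snapshot that commit
>     }
>
>     // Consumer: copy the new files, then fsync + commit them locally.
>     void syncOnce(Replicator replicator, Directory indexDir, Path workDir)
>         throws IOException {
>       ReplicationClient client = new ReplicationClient(
>           replicator,
>           new IndexReplicationHandler(indexDir, null),
>           new PerSessionDirectoryFactory(workDir));
>       client.updateNow();
>       client.close();
>     }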
>
> But today you must first commit on the master, and then the replica's
> copied files are fsync'd as well, because the code operates on commit
> points.  This isn't "technically" necessary, and it mixes up
> durability with fast turnaround time.
>
> Long ago we added near-real-time readers to Lucene, for the same
> reason: you shouldn't have to commit just to see the new index
> changes.
>
> I think we should do the same for replication: allow the new segments
> to be copied out to replica(s), and new NRT readers to be opened, to
> fully decouple committing from visibility.  This way apps can then
> separately choose when to replicate (for freshness), and when to
> commit (for durability).
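>
> The two knobs might then look something like this (a sketch;
> DirectoryReader.open(writer, true) already opens an NRT reader with
> no commit involved):
>
>     // Freshness: flush + open an NRT reader; no commit, no fsync.
>     DirectoryReader reader = DirectoryReader.open(writer, true);
>     // ...replicate the new (uncommitted) segment files to replicas...
>
>     // Durability: commit on whatever slower schedule the app chooses.
>     writer.commit();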
>
> I think for some apps this could be a compelling alternative to the
> "re-index all documents on each shard" approach that Solr Cloud /
> ElasticSearch implement today, and it may also mean that the
> transaction log can remain external to / above the cluster.


