Posted to oak-dev@jackrabbit.apache.org by Ian Boston <ie...@tfd.co.uk> on 2016/11/18 10:05:57 UTC

Hybrid indexing and soft commits.

Hi,
IIUC the Hybrid indexing on the master operates in parallel with the master
index writer, performing the same task but repeatedly throwing its work
away when the master provides an update. IIUC, it effectively performs many
soft commits to achieve NRT behaviour.

I wonder if there is an opportunity to use the Hybrid indexer on the master
instance and every n seconds (or even minutes) perform a hard commit, with
that hard commit being the output of the master index writer, committed by
Oak to the DS. This would avoid doing the work twice and follows the pattern
used by Solr and ES, where an indexing update is written to a WAL, soft
committed and periodically hard committed. The WAL comes free as part of Oak,
so if the soft commits are lost, the index and WAL start from the last hard
commit.
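
To make the pattern concrete, here is a rough toy model in Python. It is only
a sketch of the WAL / soft commit / hard commit idea; it is not Oak, Lucene,
Solr or ES code, and every name in it (ToyIndex, soft_commit, hard_commit,
crash_and_recover) is made up for illustration.

```python
class ToyIndex:
    """Toy model of WAL + soft commit + hard commit (hypothetical names)."""

    def __init__(self):
        # Durable, ordered log of updates. In the proposal Oak itself
        # would play this role; here it is just a list.
        self.wal = []
        # Durable index state as of the last hard commit.
        self.hard_state = {}
        self.hard_offset = 0  # WAL entries covered by the last hard commit
        # In-memory searchable view, refreshed by soft commits (NRT).
        self.soft_state = {}
        self.soft_offset = 0  # WAL entries covered by the last soft commit

    def update(self, key, value):
        # An update is first recorded in the WAL; it is not yet searchable.
        self.wal.append((key, value))

    def soft_commit(self):
        # Cheap and frequent: make pending WAL entries searchable in memory.
        for key, value in self.wal[self.soft_offset:]:
            self.soft_state[key] = value
        self.soft_offset = len(self.wal)

    def hard_commit(self):
        # Infrequent: persist the searchable view as the new durable index.
        self.soft_commit()
        self.hard_state = dict(self.soft_state)
        self.hard_offset = self.soft_offset

    def crash_and_recover(self):
        # Soft commits are lost; restart from the last hard commit and
        # replay the WAL from that point.
        self.soft_state = dict(self.hard_state)
        self.soft_offset = self.hard_offset
        self.soft_commit()

    def search(self, key):
        return self.soft_state.get(key)


index = ToyIndex()
index.update("/content/a", "v1")
index.hard_commit()               # durable from here on
index.update("/content/b", "v2")
index.soft_commit()               # searchable, but only in memory
index.crash_and_recover()         # rebuilt from hard commit + WAL replay
```

In this model the work is done once: the hard commit is just the soft
committed state made durable, rather than the output of a second writer.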

To be clear, I am only talking about de-duplicating the effort performed on
the master node by the hybrid indexer and the master index writer. I am not
talking about anything performed on slave index reader instances, which also
have a hybrid indexer. Those hybrid indexers will still work as they do now.

wdyt?
Best Regards
Ian

Re: Hybrid indexing and soft commits.

Posted by Tommaso Teofili <to...@gmail.com>.
+1

Tommaso
