Posted to oak-dev@jackrabbit.apache.org by "Jim.Tully" <Ji...@target.com> on 2015/12/05 02:48:17 UTC

Lucene index speed

We are using Oak embedded in a web application, and are now experiencing significant delays in async indexing.  New nodes added are sometimes not available by query for up to an hour.  I’m hoping you can identify areas I might explore to improve this performance.

We have multiple instances of the web application running against the same MongoDB cluster, connected via SSL.  Our Repository constructor is:

ns = new DocumentMK.Builder().setMongoDB(createMongoDB()).getNodeStore();
Oak oak = new Oak(ns);
LuceneIndexProvider provider = new LuceneIndexProvider();
Jcr jcr = new Jcr(oak).with((QueryIndexProvider) provider).with((Observer) provider)
        .with(new LuceneIndexEditorProvider()).withAsyncIndexing();
repository = jcr.createRepository();


The web application creates the repository at startup and disposes of it at shutdown.  We have no observers registered at all, but do have 6 Lucene indexes defined.  The index that is currently giving me heartburn is shown below.  Where should I start looking to find what is dragging performance down so drastically?


<?xml version="1.0" encoding="UTF-8"?>
<sv:node xmlns:sv="http://www.jcp.org/jcr/sv/1.0" sv:name="PageIndex">
  <sv:property sv:name="jcr:primaryType" sv:type="Name">
    <sv:value>oak:QueryIndexDefinition</sv:value>
  </sv:property>
  <sv:property sv:name="compatVersion" sv:type="Long">
    <sv:value>2</sv:value>
  </sv:property>
  <sv:property sv:name="compatMode" sv:type="Long">
    <sv:value>2</sv:value>
  </sv:property>
  <sv:property sv:name="type" sv:type="String">
    <sv:value>lucene</sv:value>
  </sv:property>
  <sv:property sv:name="async" sv:type="String">
    <sv:value>async</sv:value>
  </sv:property>
  <sv:property sv:name="name" sv:type="String">
    <sv:value>PageIndex</sv:value>
  </sv:property>
  <sv:property sv:name="indexPath" sv:type="String">
    <sv:value>/pages/oak:index/PageIndex</sv:value>
  </sv:property>

  <sv:node sv:name="indexRules" sv:type="Name">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>nt:unstructured</sv:value>
    </sv:property>
    <sv:node sv:name="tgt:page">
      <sv:property sv:name="jcr:primaryType" sv:type="Name">
        <sv:value>nt:unstructured</sv:value>
      </sv:property>
      <sv:node sv:name="properties" sv:type="Name">
        <sv:property sv:name="jcr:primaryType" sv:type="Name">
          <sv:value>nt:unstructured</sv:value>
        </sv:property>
        <sv:node sv:name="jcr:activationDate">
          <sv:property sv:name="jcr:primaryType" sv:type="Name">
            <sv:value>nt:unstructured</sv:value>
          </sv:property>
          <sv:property sv:name="type" sv:type="String">
            <sv:value>Date</sv:value>
          </sv:property>
          <sv:property sv:name="ordered" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
          <sv:property sv:name="propertyIndex" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
        </sv:node>
        <sv:node sv:name="jcr:deactivationDate">
          <sv:property sv:name="jcr:primaryType" sv:type="Name">
            <sv:value>nt:unstructured</sv:value>
          </sv:property>
          <sv:property sv:name="type" sv:type="String">
            <sv:value>Date</sv:value>
          </sv:property>
          <sv:property sv:name="ordered" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
          <sv:property sv:name="propertyIndex" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
        </sv:node>
        <sv:node sv:name="jcr:status">
          <sv:property sv:name="jcr:primaryType" sv:type="Name">
            <sv:value>nt:unstructured</sv:value>
          </sv:property>
          <sv:property sv:name="propertyIndex" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
        </sv:node>
        <sv:node sv:name="presentation">
          <sv:property sv:name="jcr:primaryType" sv:type="Name">
            <sv:value>nt:unstructured</sv:value>
          </sv:property>
          <sv:property sv:name="propertyIndex" sv:type="Boolean">
            <sv:value>true</sv:value>
          </sv:property>
        </sv:node>
      </sv:node>
    </sv:node>
  </sv:node>
  <sv:node sv:name="analyzers">
    <sv:node sv:name="default">
      <sv:property sv:name="class" sv:type="String">
        <sv:value>org.apache.lucene.analysis.standard.StandardAnalyzer</sv:value>
      </sv:property>
      <sv:property sv:name="luceneMatchVersion" sv:type="String">
        <sv:value>LUCENE_47</sv:value>
      </sv:property>
      <sv:node sv:name="tokenizer">
        <sv:property sv:name="name" sv:type="String">
          <sv:value>Standard</sv:value>
        </sv:property>
      </sv:node>
    </sv:node>
  </sv:node>
</sv:node>

Thanks,

Jim


Re: Lucene index speed

Posted by Chetan Mehrotra <ch...@gmail.com>.
Hi Jim,

How does indexing perform if you run just a single webapp node?
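
To put a number on that comparison, one option is to time how long a newly saved node takes to show up in a query, first with one webapp node and then with several. What follows is only a rough sketch using plain JDK classes, not anything taken from Oak: the class name IndexLagProbe and the method waitUntilVisible are made up for illustration, and the BooleanSupplier is a placeholder for whatever check you use (for example, running your page query and testing whether the new node is in the result).

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Oak): measures how long it takes for a
// condition, such as "the new node is returned by the query", to become true.
final class IndexLagProbe {

    // Polls `visible` every `pollInterval` until it returns true or `timeout`
    // has elapsed. Returns the elapsed time on success, or null on timeout
    // or if the calling thread is interrupted.
    static Duration waitUntilVisible(BooleanSupplier visible,
                                     Duration pollInterval,
                                     Duration timeout) {
        long start = System.nanoTime();
        while (true) {
            if (visible.getAsBoolean()) {
                return Duration.ofNanos(System.nanoTime() - start);
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                return null; // gave up: never became visible within the timeout
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return null;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in condition: pretend the node becomes queryable after ~200 ms.
        // In a real test this lambda would execute the JCR query instead.
        long visibleAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        Duration lag = waitUntilVisible(
                () -> System.nanoTime() - visibleAt >= 0,
                Duration.ofMillis(20),
                Duration.ofSeconds(5));
        System.out.println(lag == null ? "timed out" : "lag ms: " + lag.toMillis());
    }
}
```

Calling this right after session.save() on each setup should show whether the delay is already there with a single node or only appears once several instances share the cluster.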

Chetan Mehrotra

