Posted to users@jackrabbit.apache.org by mitziuro <mi...@gmail.com> on 2009/09/08 11:01:16 UTC

Refresh problem

Every time I change the path of a node, the node is blocked for listing
until Jackrabbit finishes some calculations.
For example, if I have 20 nodes, it takes a while after I've changed the path
before I can list all of them.
Is there a way to avoid that?

Re: Refresh problem

Posted by Marcel Reutegger <ma...@gmx.net>.
Hi,

2009/9/12 arcassis@gmail.com <ar...@gmail.com>:
> Now, after commenting out the textFilterClasses parameter from the
> SearchIndex element, JR works fine (I no longer see that latency). The
> good news is that our application doesn't need text extraction for the
> moment. But when we do need it, the problem will reappear.

then this must be a bug. can you please share some more information?
what version of jackrabbit are you using? do you have code to
reproduce the issue? thanks

regards
 marcel

Re: Refresh problem

Posted by "arcassis@gmail.com" <ar...@gmail.com>.
Hello Marcel!
I'm mitziuro's colleague. I want to thank you for what you suggested
regarding JR's SearchIndex; it was very helpful!

With this configuration for a workspace's SearchIndex :
-------------------------
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses"
           value="org.apache.jackrabbit.extractor.PlainTextExtractor,org.apache.jackrabbit.extractor.MsWordTextExtractor,org.apache.jackrabbit.extractor.MsExcelTextExtractor,org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,org.apache.jackrabbit.extractor.PdfTextExtractor,org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,org.apache.jackrabbit.extractor.RTFTextExtractor,org.apache.jackrabbit.extractor.HTMLTextExtractor,org.apache.jackrabbit.extractor.XMLTextExtractor"/>
    <param name="extractorPoolSize" value="2"/>
    <param name="supportHighlighting" value="true"/>
    <param name="respectDocumentOrder" value="true"/>
</SearchIndex>
-------------------------
our application's behaviour was something like this: 1. add a document; 2.
after saving the document, go to the document listing; result: the newly
added document wasn't there; 3. another click on the document listing page
and voila, the document appeared (sometimes it takes 2-3 clicks on the
document listing page).
So my conclusion is that, even if text extraction takes longer than the
default extractorTimeout value (100 ms) and a new background thread is
started to handle it, newly added nodes still don't appear. Why is this
happening? What else is keeping a lock on the node?

Now, after commenting out the textFilterClasses parameter from the
SearchIndex element, JR works fine (I no longer see that latency). The
good news is that our application doesn't need text extraction for the
moment. But when we do need it, the problem will reappear.


All the best,

Dan







-- 
Arcassis

Re: Refresh problem

Posted by Guo Du <mr...@gmail.com>.
On Fri, Sep 11, 2009 at 3:05 PM, Marcel Reutegger
<ma...@gmx.net> wrote:
> that's not quite correct. everything except text extraction is
> guaranteed to be indexed as soon as the save (or transaction commit)
> returns.

Thanks for the knowledge!

--Guo

Re: Refresh problem

Posted by Marcel Reutegger <ma...@gmx.net>.
Hi,

On Tue, Sep 8, 2009 at 11:58, Guo Du <mr...@gmail.com> wrote:
> It is designed to work like this. The search index is updated in the
> background, and it's not guaranteed to reflect your changes immediately
> after you modify the nodes.

that's not quite correct. everything except text extraction is
guaranteed to be indexed as soon as the save (or transaction commit)
returns.

do you move resource nodes with binary data that you have configured
for text extraction and do fulltext queries? if that's the case then
you might see some delay until text is extracted and available for
fulltext queries. However that behaviour can be configured. See
extractorTimeout on http://wiki.apache.org/jackrabbit/Search.
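As a rough sketch of what that could look like in the workspace
configuration: the 5000 ms value below is only an illustrative assumption,
not a recommendation, and the rest of your SearchIndex params stay as they
are. A higher extractorTimeout lets save() wait longer for text extraction
before fulltext indexing is deferred to a background thread, so saves of
large binaries may block for up to that long.
-------------------------
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <!-- keep your existing textFilterClasses and other params here -->
    <!-- assumed example value: wait up to 5 seconds for text extraction
         before deferring it to a background thread (default is 100 ms) -->
    <param name="extractorTimeout" value="5000"/>
</SearchIndex>
-------------------------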

regards
 marcel

Re: Refresh problem

Posted by Guo Du <mr...@gmail.com>.
On Tue, Sep 8, 2009 at 10:25 AM, mitziuro <mi...@gmail.com> wrote:
> After changing the path with session.move, I do a search with XPath.
> If I do the search within a second of the move, I don't see that node.
> If I refresh, I see it.
> For 20+ nodes the situation is worse because I get partial results in the
> first seconds after moving the nodes.
> Is there a way to change this behaviour (not block the nodes for listing,
> or something else)?
> Thanks
>
It is designed to work like this. The search index is updated in the
background, and it's not guaranteed to reflect your changes immediately
after you modify the nodes.

Instead of using the search API, you may traverse the nodes via their
children to avoid unexpected results.

Regards,

-Guo

Re: Refresh problem

Posted by mitziuro <mi...@gmail.com>.
On Tue, Sep 8, 2009 at 12:05 PM, Jukka Zitting <ju...@gmail.com> wrote:

> Hi,
>
> On Tue, Sep 8, 2009 at 11:01 AM, mitziuro <mi...@gmail.com> wrote:
> > Every time I change the path of a node, the node is blocked for listing
> > until Jackrabbit finishes some calculations.
> > For example, if I have 20 nodes, it takes a while after I've changed the
> > path before I can list all of them.
> > Is there a way to avoid that?
>
> I'm not sure what you're after here. Obviously Jackrabbit needs some
> time to persist any changes and load modified information, but in
> normal use this is pretty fast.
>
> Can you be a bit more specific? What are you doing (sample code) and what
> kind of performance are you expecting/seeing?
>
> BR,
>
> Jukka Zitting
>

After changing the path with session.move, I do a search with XPath.
If I do the search within a second of the move, I don't see that node.
If I refresh, I see it.
For 20+ nodes the situation is worse because I get partial results in the
first seconds after moving the nodes.
Is there a way to change this behaviour (not block the nodes for listing,
or something else)?
Thanks

Re: Refresh problem

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Sep 8, 2009 at 11:01 AM, mitziuro <mi...@gmail.com> wrote:
> Every time I change the path of a node, the node is blocked for listing
> until Jackrabbit finishes some calculations.
> For example, if I have 20 nodes, it takes a while after I've changed the
> path before I can list all of them.
> Is there a way to avoid that?

I'm not sure what you're after here. Obviously Jackrabbit needs some
time to persist any changes and load modified information, but in
normal use this is pretty fast.

Can you be a bit more specific? What are you doing (sample code) and what
kind of performance are you expecting/seeing?

BR,

Jukka Zitting