Posted to user@lenya.apache.org by "Mann, David" <da...@matrixone.com> on 2005/01/11 09:35:14 UTC

Hot update lucene search db

Hi all,
I'm new to lenya and have gotten the lucene engine running, but have noticed
that if I try to re-index while the site is running, the search engine files
are not updated properly. Is it required to shut down the app server (just
running "lenya servlet" for now) before updating the search, or am I doing
something wrong? I also tried "incremental" mode instead of creating a new
index, but it didn't seem to help. If I run it offline (after crawling the
site) it works fine and picks up all the changes.
 
Current env:
apache-lenya-1.2.1-src
jdk1.4.2_06
WindowsXP
 
Update command:
tools\bin\ant -f build/lenya/webapp/lenya/bin/crawl_and_index.xml -Dlucene.xconf=build/lenya/webapp/lenya/pubs/default/config/search/lucene-live.xconf index
 
lucene-live.xconf:
<lucene>
  <update-index type="new"/>
<!--
  <update-index type="incremental"/>
--> 
 
  <index-dir src="../../work/search/lucene/index/live/index"/>
  <htdocs-dump-dir src="../../work/search/lucene/htdocs_dump/live"/>
 
  <indexer class="org.apache.lenya.lucene.index.DefaultIndexer"/>
</lucene>
 
Thanks,
David
 

Re: Hot update lucene search db

Posted by Andreas Kuckartz <A....@ping.de>.
Bug 29312 links to bug 32263 (which is a bug in Apache Cocoon, not in
Lenya). Please vote for that bug so that it gets the attention of the Cocoon
developers:
http://issues.apache.org/bugzilla/show_bug.cgi?id=32263

Cheers,
Andreas

----- Original Message -----
From: "Gregor J. Rothfuss" <gr...@apache.org>
To: "Lenya Users List" <us...@lenya.apache.org>
Sent: Tuesday, January 11, 2005 3:54 PM
Subject: Re: Hot update lucene search db


> Mann, David wrote:
> > Hi all,
> > I'm new to lenya and have gotten the lucene engine running, but have noticed
> > that if I try to re-index while the site is running, the search engine files
> > are not updated properly.
>
> this is a known issue:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=29312
>
> --
> Gregor J. Rothfuss
> COO, Wyona       Content Management Solutions    http://wyona.com
> Apache Lenya                              http://lenya.apache.org
> gregor.rothfuss@wyona.com                       gregor@apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@lenya.apache.org
> For additional commands, e-mail: user-help@lenya.apache.org




Re: Hot update lucene search db

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Mann, David wrote:
> Hi all,
> I'm new to lenya and have gotten the lucene engine running, but have noticed
> that if I try to re-index while the site is running, the search engine files
> are not updated properly. 

this is a known issue:

http://issues.apache.org/bugzilla/show_bug.cgi?id=29312

-- 
Gregor J. Rothfuss
COO, Wyona       Content Management Solutions    http://wyona.com
Apache Lenya                              http://lenya.apache.org
gregor.rothfuss@wyona.com                       gregor@apache.org



Re: Hot update lucene search db

Posted by Michael Wechner <mi...@wyona.com>.
Mann, David wrote:

> Hi all,
> I'm new to lenya and have gotten the lucene engine running, but have 
> noticed that if I try to re-index while the site is running, the 
> search engine files are not updated properly.


actually this should be possible

> Is it required to shut down the app server (just running "lenya 
> servlet" for now) before updating the search, or am I doing something 
> wrong? I also tried "incremental" mode instead of creating a new 
> index, but it didn't seem to help


Incremental doesn't really work properly, and it's also kind of useless in
a batch process, because checking the lastmodified dates also consumes quite a
lot of resources and doesn't seem to be much faster than building the
index from scratch.
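
To illustrate the point about lastmodified checks, here is a minimal Python sketch. It is not Lenya's indexer and does not use the Lucene API; `plan_incremental`, `dump_dir`, and `last_indexed` are hypothetical names chosen for the example. It only shows that an incremental batch pass still has to visit and stat every dumped file before it knows which ones changed, which is why the saving over a full rebuild can be small.

```python
import os
import tempfile


def plan_incremental(dump_dir, last_indexed):
    """Decide which files an incremental batch pass would re-index.

    dump_dir     -- directory tree holding the crawled documents (assumed layout)
    last_indexed -- dict mapping file path -> mtime recorded at the previous run
    Returns (files_checked, stale_paths).
    """
    checked = 0
    stale = []
    for root, _dirs, files in os.walk(dump_dir):
        for name in files:
            path = os.path.join(root, name)
            checked += 1  # every file is stat'ed, whether it changed or not
            if os.path.getmtime(path) > last_indexed.get(path, 0.0):
                stale.append(path)
    return checked, sorted(stale)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dump_dir:
        paths = []
        for name in ("a.html", "b.html", "c.html"):
            path = os.path.join(dump_dir, name)
            with open(path, "w") as handle:
                handle.write("<html/>")
            paths.append(path)

        # Pretend a previous run indexed everything at its current mtime.
        last_indexed = {p: os.path.getmtime(p) for p in paths}

        # Simulate one document being edited after that run.
        newer = last_indexed[paths[1]] + 10
        os.utime(paths[1], (newer, newer))

        checked, stale = plan_incremental(dump_dir, last_indexed)
        print("checked:", checked, "to re-index:", len(stale))
```

Only one file ends up in the re-index list, but all of them were checked, so the scan cost grows with the size of the whole dump rather than with the number of changes.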

Incremental indexing of individual documents does make a lot of sense,
but for performance reasons it should be decoupled from the system
itself. Currently this is not the case, and one won't notice until one has
more than 100000 documents.

> . If I run it offline (after crawling the site) it works fine and 
> picks up all the changes.


as said above this is quite strange. Do the log files say anything 
particular?

>  
> Current env:
> apache-lenya-1.2.1-src
> jdk1.4.2_06
> WindowsXP



Maybe there is a difference between Windows and Linux/UNIX. I just know
how it behaves on Linux/Debian

HTH

Michi

>  
> Update command:
> tools\bin\ant -f build/lenya/webapp/lenya/bin/crawl_and_index.xml 
> -Dlucene.xconf=build/lenya/webapp/lenya/pubs/default/config/search/lucene-live.xconf 
> index
>  
> lucene-live.xconf:
> <lucene>
>   <update-index type="new"/>
> <!--
>   <update-index type="incremental"/>
> -->
>  
>   <index-dir src="../../work/search/lucene/index/live/index"/>
>   <htdocs-dump-dir src="../../work/search/lucene/htdocs_dump/live"/>
>  
>   <indexer class="org.apache.lenya.lucene.index.DefaultIndexer"/>
> </lucene>
>  
> Thanks,
> David
>  



-- 
Michael Wechner
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org

