Posted to user@nutch.apache.org by John Thompson <jo...@gmail.com> on 2008/06/19 11:32:10 UTC

Can I update my search engine without restarting tomcat?

I've noticed that when I do new crawls and update my database and index
files and then search on my site, the newly crawled pages don't show up
until I restart tomcat.  Is there a way around this?  I don't really want to
be regularly restarting a production server.  I'm using nutch-0.9.

Best,
John

Re: Can I update my search engine without restarting tomcat?

Posted by "Eric J. Christeson" <Er...@ndsu.edu>.
On Jun 19, 2008, at 1:20 PM, John Thompson wrote:

>> You can set up an account in Tomcat Manager if you don't have one already.
>> The Manager lets you go in and independently start/stop/reload any of the
>> different webapps you have running. This is exactly how I get new Nutch
>> crawls/indexes to be active on our production server.
>
>
> Won't restarting the webapp cause Tomcat to serve up error pages to users
> who are trying to connect to the webapp at that moment?

We set up our own servlet which uses a NutchBean in a Singleton pattern.
It runs an update thread which periodically checks a known file which
contains a directory name.  When loading a new search db, we copy the
directory where we want it (we use a directory with a date in the name)
and edit the file to point to the new dir.  A client never has problems
related to unavailability because they either get the old NutchBean,
referencing the old dir, or the new NutchBean, referencing the new dir.
We keep the old ones around for a period of time (default 5 minutes) in
case anyone has a search page open and wants previous/next page of results.

We haven't run into any problems with this setup.  If anyone wants more
information, let me know.
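
[Editor's sketch, not the poster's code: one way the pointer-file approach
described above could look.  The class name ReloadingHolder, the loader
function, and the pointer-file layout (a single line holding a directory
name) are all assumptions made for illustration.  The loader stands in for
whatever builds a NutchBean from an index directory; no Nutch API is used
here.  Retiring old instances after a grace period is left out.]

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical holder: polls a pointer file and swaps in a new searcher when it changes. */
class ReloadingHolder<T> {
    private final Path pointerFile;              // file whose content is a directory name
    private final Function<String, T> loader;    // stand-in for "build a searcher from a dir"
    private final AtomicReference<T> current = new AtomicReference<>();
    private String currentDir;                   // guarded by "this"
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "index-reload");
            t.setDaemon(true);                   // do not block webapp shutdown
            return t;
        });

    ReloadingHolder(Path pointerFile, Function<String, T> loader) {
        this.pointerFile = pointerFile;
        this.loader = loader;
    }

    /** Returns the instance currently in service (null until the first successful refresh). */
    T get() {
        return current.get();
    }

    /** Re-reads the pointer file; returns true if a new instance was swapped in. */
    synchronized boolean refresh() throws IOException {
        String dir = new String(Files.readAllBytes(pointerFile), StandardCharsets.UTF_8).trim();
        if (dir.isEmpty() || dir.equals(currentDir)) {
            return false;                        // nothing new to load
        }
        T fresh = loader.apply(dir);             // build the new instance fully first
        current.set(fresh);                      // then publish it in one atomic step
        currentDir = dir;
        return true;
    }

    /** Starts the background polling thread. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                refresh();
            } catch (IOException | RuntimeException e) {
                // Keep serving the previous instance if the reload fails.
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

[A request handler would call get() once per request and use that reference
throughout, so a swap in the middle of a request does not affect it.]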

--
Eric J. Christeson                                  
<Er...@ndsu.edu>
Information Technology Services         (701) 231-8693 (Voice)
Room 242C, IACC Building
North Dakota State University, Fargo, ND 58105-5164

Organizations which design systems are constrained to produce designs which
are copies of the communication structures of these organizations.  (For
example, if you have four groups working on a compiler, you'll get a
4-pass compiler) - Conway's Law

Re: Can I update my search engine without restarting tomcat?

Posted by John Thompson <jo...@gmail.com>.
On 6/19/08, Howie Wang <ho...@hotmail.com> wrote:
>
> Not sure about Nutch 0.9, but I'm on an earlier version and I just
> have a JSP page that removes the nutch bean from the application,
> then gets it again to re-instantiate it. It looks something like this:
>
>     application.removeAttribute("nutchBean");
>     NutchBean bean = NutchBean.get(application);
>
> I just call this page when a new index is loaded.

This works like a charm!  Thanks Howie!

-John

RE: Can I update my search engine without restarting tomcat?

Posted by Howie Wang <ho...@hotmail.com>.
Not sure about Nutch 0.9, but I'm on an earlier version and I just
have a JSP page that removes the nutch bean from the application,
then gets it again to re-instantiate it. It looks something like this:

    application.removeAttribute("nutchBean");
    NutchBean bean = NutchBean.get(application);

I just call this page when a new index is loaded.
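
[Editor's sketch: the snippet above appears to rely on the bean being cached
as an application attribute, so that removing the attribute forces the next
lookup to build a fresh one.  That caching behaviour is an assumption drawn
from the snippet, not something confirmed here for any particular Nutch
version.  The class below is a hypothetical stand-in that uses a plain map
in place of the servlet context and a supplier in place of the bean
constructor.]

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for an application scope that lazily caches one object per key. */
class AttributeCache {
    private final Map<String, Object> attributes = new ConcurrentHashMap<>();

    /** Returns the cached object for key, building it with factory only if absent. */
    Object get(String key, Supplier<Object> factory) {
        return attributes.computeIfAbsent(key, k -> factory.get());
    }

    /** Drops the cached object so the next get() rebuilds it. */
    void removeAttribute(String key) {
        attributes.remove(key);
    }
}
```

[Requests that already hold the old object keep using it; only lookups made
after the removal see the rebuilt one.]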

Howie


> Date: Thu, 19 Jun 2008 11:20:15 -0700
> From: john.thompson78@gmail.com
> To: nutch-user@lucene.apache.org
> Subject: Re: Can I update my search engine without restarting tomcat?
> 
> > You can set up an account in Tomcat Manager if you don't have one already.
> > The Manager lets you go in and independently start/stop/reload any of the
> > different webapps you have running. This is exactly how I get new Nutch
> > crawls/indexes to be active on our production server.
> 
> 
> Won't restarting the webapp cause Tomcat to serve up error pages to users
> who are trying to connect to the webapp at that moment?
> 
> -John
> 
> 
> >
> >
> > -Wynz
> >
> >
> > On Jun 19, 2008, at 5:32 AM, John Thompson wrote:
> >
> >> I've noticed that when I do new crawls and update my database and index
> >> files and then search on my site, the newly crawled pages don't show up
> >> until I restart tomcat.  Is there a way around this?  I don't really want to
> >> be regularly restarting a production server.  I'm using nutch-0.9.
> >>
> >> Best,
> >> John
> >>
> >
> >

Re: Can I update my search engine without restarting tomcat?

Posted by John Thompson <jo...@gmail.com>.
> You can set up an account in Tomcat Manager if you don't have one already.
> The Manager lets you go in and independently start/stop/reload any of the
> different webapps you have running. This is exactly how I get new Nutch
> crawls/indexes to be active on our production server.


Won't restarting the webapp cause Tomcat to serve up error pages to users
who are trying to connect to the webapp at that moment?

-John


>
>
> -Wynz
>
>
> On Jun 19, 2008, at 5:32 AM, John Thompson wrote:
>
>> I've noticed that when I do new crawls and update my database and index
>> files and then search on my site, the newly crawled pages don't show up
>> until I restart tomcat.  Is there a way around this?  I don't really want to
>> be regularly restarting a production server.  I'm using nutch-0.9.
>>
>> Best,
>> John
>>
>
>

Re: Can I update my search engine without restarting tomcat?

Posted by Wynz Lo <wy...@gmail.com>.
John,

You can set up an account in Tomcat Manager if you don't have one already.
The Manager lets you go in and independently start/stop/reload any of the
different webapps you have running. This is exactly how I get new Nutch
crawls/indexes to be active on our production server.

-Wynz

On Jun 19, 2008, at 5:32 AM, John Thompson wrote:

> I've noticed that when I do new crawls and update my database and index
> files and then search on my site, the newly crawled pages don't show up
> until I restart tomcat.  Is there a way around this?  I don't really want to
> be regularly restarting a production server.  I'm using nutch-0.9.
>
> Best,
> John