Posted to user@nutch.apache.org by Howie Wang <ho...@hotmail.com> on 2006/02/01 06:02:32 UTC

Re: Updating the search index

I haven't tested this thoroughly, but I've been using it on my
development box. In case people are wondering how to get a
new NutchBean that will re-read the index, what I did was write
a little JSP page called reset_nutch_bean.jsp. All it does is remove
the "nutchBean" attribute from the application context, so the next
search that comes along will create a new bean.

Any time I upload a new index, I just run this page. It's not
automated, but it's easy enough to do after uploading
an index. You could automate it by calling the page from a script
after building or uploading an index, or just run it
periodically with cron.
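
As a rough sketch of the automation idea above, a small Java program could request the reset page over HTTP once an upload finishes. The class name ResetNutchBean and the default URL are assumptions for illustration only; the real address depends on where the webapp is deployed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: requests the reset page so the cached bean is dropped.
final class ResetNutchBean {

    /** Sends a GET to the given page and returns the HTTP status code. */
    static int trigger(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real URL as the first argument.
        String url = args.length > 0
                ? args[0]
                : "http://localhost:8080/reset_nutch_bean.jsp";
        int status = trigger(url);
        System.out.println("reset returned HTTP " + status);
        if (status != 200) {
            System.exit(1); // let the calling script or cron job notice the failure
        }
    }
}
```

Run from the end of an indexing script or from cron, a non-zero exit code signals that the reset did not go through.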

The code is after my sig. Let me know if what I've done is bad in
some way.

Howie

<%@ page
  contentType="text/html; charset=UTF-8"
  pageEncoding="UTF-8"
  import="javax.servlet.*, javax.servlet.http.*"
%>
<%
	// Drop the cached bean; the next search creates a new one that re-reads the index.
	application.removeAttribute("nutchBean");
%>
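
For readers who want to see why removing the attribute is enough, here is a minimal sketch of the create-on-miss pattern the page relies on. It uses a plain map as a stand-in for the application context; BeanRegistry and its factory are hypothetical names, not Nutch classes, and the real lookup code in Nutch may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Stand-in for the servlet application context: a map of named attributes.
final class BeanRegistry {
    static final String KEY = "nutchBean";

    private final Map<String, Object> attributes = new ConcurrentHashMap<>();
    private final Supplier<Object> beanFactory;

    BeanRegistry(Supplier<Object> beanFactory) {
        this.beanFactory = beanFactory;
    }

    /** Returns the cached bean, creating one on first use or after reset(). */
    Object get() {
        return attributes.computeIfAbsent(KEY, k -> beanFactory.get());
    }

    /** Mirrors application.removeAttribute("nutchBean") in the JSP page. */
    void reset() {
        attributes.remove(KEY);
    }
}
```

Note that reset() in this sketch only forgets the old bean; it does not close it, so whatever the old bean holds open stays open until it is garbage collected.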



Re: Updating the search index

Posted by Byron Miller <by...@yahoo.com>.
With all of the discussion of
killing/restarting/pooling the NutchBean, has anyone
noticed that you push your luck in doing so?

I often get GC-failed-to-collect errors, out-of-memory errors
and such when trying to do anything but a clean
shutdown.

I'm moving to a 64-bit JVM and Java 1.5, so I'll let you
know if those memory errors continue.

-byron



Re: Updating the search index

Posted by Raghavendra Prabhu <rr...@gmail.com>.
With respect to updating, I had also suggested another method,
where we control NutchBean instantiation.

I introduced it in the form of object pooling.

The pool takes care of reinstantiating the NutchBean and returning a
reference to it.
The pool can take as input a text file that changes on reindexing. From
this it detects the change, recreates the bean over the new search
index, and returns the reference to the bean.
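
One way the marker-file idea above could look, as a sketch only: a holder that compares the marker file's last-modified time on each lookup and rebuilds the bean when it changes. ReloadingHolder, the marker path, and the factory are hypothetical; this is not the pool described in the message, just an illustration of the same trigger.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.function.Supplier;

// Hypothetical holder: rebuilds its object whenever the marker file changes.
final class ReloadingHolder<T> {
    private final Path marker;          // file touched by the reindexing job
    private final Supplier<T> factory;  // builds a bean over the current index
    private FileTime seen;              // marker timestamp at the last rebuild
    private T current;

    ReloadingHolder(Path marker, Supplier<T> factory) {
        this.marker = marker;
        this.factory = factory;
    }

    /** Returns the current object, recreating it if the marker was modified. */
    synchronized T get() throws IOException {
        FileTime modified = Files.getLastModifiedTime(marker);
        if (current == null || !modified.equals(seen)) {
            current = factory.get();
            seen = modified;
        }
        return current;
    }
}
```

With this shape, the reindexing script only has to touch the marker file; the next call to get() builds a fresh bean. The old object is simply dropped here, so closing it is left to the caller.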







Re: Updating the search index

Posted by Raghavendra Prabhu <rr...@gmail.com>.
Maybe we should also have a close method in IndexSearcher,

which would give greater flexibility when deleting files.

But even after implementing it, the data is still locked by some search mechanism.

Maybe some ArrayFileReader is holding it.

Aren't we supposed to release these handles?

Rgds

Prabhu
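
To illustrate the point about releasing every handle, here is a generic sketch that tracks each opened resource and closes them all in reverse order. IndexResources is a hypothetical helper, not part of Nutch or Lucene, and which readers Nutch actually keeps open is not established in this thread.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: remembers every opened resource so none is left behind.
final class IndexResources implements Closeable {
    private final Deque<Closeable> opened = new ArrayDeque<>();

    /** Records a resource and returns it, so it can be registered where it is opened. */
    <T extends Closeable> T register(T resource) {
        opened.push(resource);
        return resource;
    }

    /** Closes everything, most recently opened first, even if one close fails. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        while (!opened.isEmpty()) {
            try {
                opened.pop().close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

On Windows a file generally cannot be deleted while any handle to it is open, so a single reader missing from the list is enough to keep the index files locked.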



Re: Updating the search index

Posted by Raghavendra Prabhu <rr...@gmail.com>.
Hi guys

I face the same problem too.

I am doing something similar to what Howie is doing,

but I want to delete the existing files in the index and replace them.


I even implemented a new method that calls close. This closes the

IndexSearcher and the reader.

Even then I am not able to delete the files on Windows.

Do I have to close anything other than the IndexSearcher and the reader to
delete the files?

