Posted to java-user@lucene.apache.org by Markus Atteneder <MA...@gmx.de> on 2005/06/21 16:42:34 UTC

Updating Documents:

I am looking for a search engine for our intranet, so I am evaluating Lucene.
I have read the FAQ and some postings, gained some first experience with it,
and now have a few questions.
1. Is Lucene a suitable search engine for an intranet search? I have
experimented with POI and PDFBox for indexing Word/Excel/PDF files.
2. Files change frequently, so indexing should run at least daily.
Is there an out-of-the-box way to delete changed files from the index
and re-add them? I have read that documents can only be deleted if
you know the document's ID in the index, and that this ID can change after
the index is optimized. Is there a "best practice" for that? I think a
full reindex every day is not a good solution because of the data volume.
3. Does anyone know of a Lucene-based project that offers a complete
solution for intranet search?
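
On question 2, a common way around unstable document IDs is to give every
document a stable key of your own (for example the file path, stored as an
untokenized field) and to delete by that key rather than by Lucene's internal
document number. The sketch below shows only the bookkeeping side of such an
incremental run: deciding which paths to delete and which to (re)add by
comparing last-modified times. It is a minimal illustration in plain Java with
no Lucene calls; the class name IncrementalIndexPlan, the Plan holder, and the
path-to-timestamp maps are all hypothetical, and it assumes the timestamps from
the previous run are recorded somewhere.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: plans an incremental index update keyed by file path. */
public class IncrementalIndexPlan {

    /** Paths whose index entries must be removed, and paths that must be (re)indexed. */
    public static final class Plan {
        public final Set<String> toDelete = new TreeSet<>();
        public final Set<String> toAdd = new TreeSet<>();
    }

    /**
     * @param indexed path -> last-modified time recorded at the previous indexing run
     * @param current path -> last-modified time of the file as it exists now
     */
    public static Plan plan(Map<String, Long> indexed, Map<String, Long> current) {
        Plan plan = new Plan();
        for (Map.Entry<String, Long> entry : indexed.entrySet()) {
            Long now = current.get(entry.getKey());
            if (now == null) {
                // File no longer exists: drop it from the index.
                plan.toDelete.add(entry.getKey());
            } else if (!now.equals(entry.getValue())) {
                // File changed: delete the stale entry, then index it again.
                plan.toDelete.add(entry.getKey());
                plan.toAdd.add(entry.getKey());
            }
        }
        for (String path : current.keySet()) {
            if (!indexed.containsKey(path)) {
                // File is new since the previous run.
                plan.toAdd.add(path);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Long> indexed = Map.of("a.doc", 100L, "b.pdf", 200L, "c.xls", 300L);
        Map<String, Long> current = Map.of("a.doc", 100L, "b.pdf", 250L, "d.doc", 400L);
        Plan p = plan(indexed, current);
        System.out.println("delete: " + p.toDelete);
        System.out.println("add:    " + p.toAdd);
    }
}
```

Each path in toDelete would then be removed from the index by a term on the
key field, and each path in toAdd parsed (with POI or PDFBox, say) and added
as a fresh document. Unchanged files are never touched, so the daily run costs
roughly in proportion to what actually changed.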


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Updating Documents:

Posted by jian chen <ch...@gmail.com>.
Hi,

You may look at this website
http://www.zilverline.org

Cheers,

Jian




Re: Updating Documents:

Posted by Chris Hostetter <ho...@fucit.org>.
: 3. Does anyone know of a Lucene-based project that offers a complete
: solution for intranet search?

nutch...
http://lucene.apache.org/nutch/



-Hoss

