Posted to users@cocoon.apache.org by "Sreenivasan N." <sr...@ap.sony.com> on 2002/12/10 06:39:44 UTC

How to build custom search engine in cocoon?

Hi all,

I am trying to build a search engine that takes a keyword and returns the
URLs of matching documents. The documents can be XML, PDF, or Word. How can
I build such a search engine?

Design 1: Use the Xindice native database to keep the keywords and the
location of each document. Search it, get the URL, and present it as a
hyperlink.
Design 2: Save the XML documents, with their keywords, directly into Xindice
and retrieve them from there.

Is there any way to do a search where the keywords and the documents are
kept separate? Kindly help me out.

Thanks in advance
Sreenivasan.

"Attitudes are much more important than aptitudes."
"Nothing is impossible for a willing heart"

Sreenivasan N.
Sony SARD
Ext 5816

Email. sreenivasan.n@ap.sony.com
Per: nsreenivasan@yahoo.com


---------------------------------------------------------------------
Please check that your question  has not already been answered in the
FAQ before posting.     <http://xml.apache.org/cocoon/faq/index.html>

To unsubscribe, e-mail:     <co...@xml.apache.org>
For additional commands, e-mail:   <co...@xml.apache.org>


RE: How to build custom search engine in cocoon?

Posted by Hugo Burm <hu...@xs4all.nl>.
I don't know your reasons for using Xindice, but using Lucene as your search
engine may be a better approach:
- The Lucene 1.2 jar is integrated into the Cocoon war.
- With Lucene you divide your XML document into fields. For each field you
can decide whether to store it into the Lucene index or refer to it as an
external link. So you can store titles of documents internally in the Lucene
index, and refer to the body as an external link.
- Lucene can index XML documents that are stored in Xindice. So if your
content is in Xindice, you can leave it there.
- Indexing PDF and Word documents is not trivial. See the Lucene mailing
list for solutions.
- Building a search engine is complicated. You may have to tackle literals,
stemming, fuzzy search, and stop words, and apply boolean operators to the
terms of your users' queries, to mention just a few.
- See jakarta.apache.org for more info on Lucene. Check the XML examples.
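
To make the idea of keeping keywords separate from the documents concrete,
here is a minimal sketch in plain Java. It is a toy inverted index, not the
Lucene API: the class ToyKeywordIndex, its methods, the stop-word list, and
the example URLs are all hypothetical names chosen for illustration. The
title and URL of each document are stored, the body text is only indexed,
and a query is treated as a boolean AND of its terms after stop words are
dropped. Stemming, fuzzy search, and PDF or Word text extraction are left
out.

```java
import java.util.*;

// Toy inverted index, NOT the Lucene API. Keywords are indexed; only the
// title and URL of each document are stored for display.
class ToyKeywordIndex {
    // Hypothetical stop-word list; a real engine would use a fuller one.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "in", "to"));

    // Stored fields returned for a search hit. The body is never stored.
    static final class Hit {
        final String title;
        final String url;
        Hit(String title, String url) { this.title = title; this.url = url; }
    }

    private final List<Hit> stored = new ArrayList<>();
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Lower-case, split on anything that is not a-z or 0-9, drop stop words.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) terms.add(t);
        }
        return terms;
    }

    // Index the title and body text, but keep only the title and URL.
    void add(String title, String url, String bodyText) {
        int id = stored.size();
        stored.add(new Hit(title, url));
        for (String term : tokenize(title + " " + bodyText)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    // Boolean AND: return the documents that contain every query term.
    List<Hit> search(String query) {
        List<String> terms = tokenize(query);
        if (terms.isEmpty()) return Collections.emptyList();
        Set<Integer> ids = null;
        for (String term : terms) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (ids == null) ids = new TreeSet<>(docs);
            else ids.retainAll(docs);
        }
        List<Hit> hits = new ArrayList<>();
        for (int id : ids) hits.add(stored.get(id));
        return hits;
    }
}
```

A caller would pass in the text already extracted from each document, for
example idx.add("Search Notes", "http://example.org/notes.pdf", text), and
render the url of every Hit returned by idx.search("keyword sitemap") as a
hyperlink. The documents themselves can stay wherever they already live.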

Hugo

