Posted to users@tomcat.apache.org by Marco Tedone <mt...@jemos.org> on 2003/09/05 00:32:14 UTC
[OT] Realizing a search functionality
Hi, I must admit that I don't know anything about how to implement search
functionality. The only thing I know is that most sites have a search
function which, given a search string, returns a list of links more or less
related to it.
The only things I know are:
1) An index of the web site contents should be created somehow
2) The search 'action' (I'm talking in Struts terms, but I think it could be
anything) should interact with this index to match the required string
3) A list (what form does it take?) containing all the links related to
the query string should be created, eventually read and displayed to the
client
Has any of you successfully implemented search functionality on your site?
Could you please point me towards some good software (possibly
open-source, possibly Jakarta, possibly Java-oriented) and patterns to use
to implement search functionality?
Many thanks,
Marco
Re: [OT] Realizing a search functionality
Posted by Marco Tedone <mt...@jemos.org>.
Sorry....I found Jakarta Lucene....I'll work on it :)
Marco
Re: [OT] Realizing a search functionality
Posted by John Turner <to...@johnturner.com>.
Thanks for the clarification.
John
Re: [OT] Realizing a search functionality
Posted by Tim Funk <fu...@joedog.org>.
Lucene indexes "documents". A document is composed of fields and does not
need to be (and actually is not) a physical file.
Take the simple example of a site consisting of a single dynamic web page
backed by a database: you would create "documents" from the database data,
with the db data going into named fields. Then, when you construct your query,
it will return a list of documents. When you iterate through each document,
you need to pull the appropriate field out of the document to reconstruct the
appropriate URL.
In a nutshell, it can do what you want, but there is a lot of setup work to
construct documents and a lot of work to display results from documents from
queries.
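A minimal sketch of the document/field idea described above, in plain Java. It is only an illustration and deliberately does not use Lucene's API: the class name FieldIndexSketch, the field names "id" and "body", and the "/content.jsp?id=" URL pattern are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java stand-in for the idea described above; this is NOT Lucene's API.
class FieldIndexSketch {
    // A "document" is just a bag of named fields, e.g. one per database row.
    private final List<Map<String, String>> documents = new ArrayList<>();
    // Inverted index: lower-cased term -> ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stores the document and indexes the words of one chosen field.
    void add(Map<String, String> fields, String fieldToIndex) {
        int id = documents.size();
        documents.add(fields);
        String text = fields.getOrDefault(fieldToIndex, "");
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(id);
            }
        }
    }

    // Returns the documents whose indexed field contains the given single term.
    List<Map<String, String>> search(String term) {
        List<Map<String, String>> hits = new ArrayList<>();
        Set<Integer> ids = postings.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        for (int id : ids) {
            hits.add(documents.get(id));
        }
        return hits;
    }

    // Rebuilds a URL from a stored field; the URL pattern is hypothetical.
    static String toUrl(Map<String, String> doc) {
        return "/content.jsp?id=" + doc.get("id");
    }
}
```

Each database row would be added once with its fields, and the search action would call search() and then toUrl() on every hit to build the list of links. A real Lucene setup adds analyzers, scoring and persistent storage on top of this basic shape.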
-Tim
John Turner wrote:
>
> AFAIK, Lucene indexes files. How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet? Can Lucene do this?
>
> The Lucene site is pretty sparse in information. Not having worked with
> it, and not knowing every option available when using it, I think there
> might be some other alternatives.
>
> John
>
Re: [OT] Realizing a search functionality
Posted by Marco Tedone <mt...@jemos.org>.
Thank you. I think I'll go for Lucene.
Marco
Re: [OT] Realizing a search functionality
Posted by John Turner <to...@johnturner.com>.
Ulrich Mayring wrote:
> John Turner wrote:
>
>> Ulrich Mayring wrote:
>>
>>> I can only recommend Lucene, it is vastly superior to any
>>> pre-packaged search engine, because you do not depend on specific
>>> features or behavior, but can customize everything to your needs.
>>
>>
>> Assuming you have time, money, skills, etc. to do so, which is not
>> always the case.
>
>
> Skills is the key issue. It took me all of one week to write our own
> custom search engine and I doubt that anyone would be able to install
> and configure a third-party product any faster than that. I had no prior
> exposure to Lucene, but of course knew my way around Java.
Hmmm...I had Atomz working for several clients by lunch one day. ;) I'm
not arguing, just emphasizing that some of us are not Java developers.
Granted, the question was somewhat in a context of "using Java" and not
"using Tomcat", but not every Tomcat user is a developer.
John
Re: [OT] Realizing a search functionality
Posted by Ulrich Mayring <ul...@denic.de>.
John Turner wrote:
> Ulrich Mayring wrote:
>
>> I can only recommend Lucene, it is vastly superior to any pre-packaged
>> search engine, because you do not depend on specific features or
>> behavior, but can customize everything to your needs.
>
> Assuming you have time, money, skills, etc. to do so, which is not
> always the case.
Skills is the key issue. It took me all of one week to write our own
custom search engine and I doubt that anyone would be able to install
and configure a third-party product any faster than that. I had no prior
exposure to Lucene, but of course knew my way around Java.
So, I don't think time and money are factors here at all. BTW, the guy
who originally wrote Lucene is now developing an open-source version of
Google with major financial backing. So you can see that there is some
serious technology behind Lucene, and IMHO it's worth learning.
Ulrich
Re: [OT] Realizing a search functionality
Posted by John Turner <to...@johnturner.com>.
Ulrich Mayring wrote:
>
> Lucene is not a search engine, but an API for writing a search engine,
> so it can do everything that you can write in Java. By itself it does
> nothing, like the JDK.
Thanks for the clarification.
>
> I can only recommend Lucene, it is vastly superior to any pre-packaged
> search engine, because you do not depend on specific features or
> behavior, but can customize everything to your needs.
Assuming you have time, money, skills, etc. to do so, which is not
always the case.
John
Re: [OT] Realizing a search functionality
Posted by Ulrich Mayring <ul...@denic.de>.
John Turner wrote:
>
> AFAIK, Lucene indexes files. How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet? Can Lucene do this?
Lucene is not a search engine, but an API for writing a search engine,
so it can do everything that you can write in Java. By itself it does
nothing, like the JDK.
In my case I've implemented a search engine that gets local files and
hands them to the Lucene Indexer, but that could also be implemented so
that it retrieves files via HTTP.
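As a rough sketch of the HTTP variant, the page can be fetched as a client would see it and reduced to plain text before being handed to whatever indexer is in use. This assumes a modern JDK (the java.net.http client, Java 11 or later); the class and method names are made up for the example, and the regex-based tag stripping is a crude placeholder for a proper HTML parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: fetch a rendered page and reduce it to text before indexing.
class PageFetchSketch {
    // Downloads the response body of a URL as a string (Java 11+ HttpClient).
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Crude tag stripper; a real indexer would use a proper HTML parser.
    static String extractText(String html) {
        return html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ") // drop script/style blocks
                .replaceAll("<[^>]+>", " ")                              // drop remaining tags
                .replaceAll("\\s+", " ")                                 // collapse whitespace
                .trim();
    }
}
```

Because the text comes from the HTTP response rather than from source files, this works for servlet and JSP output just as well as for static pages.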
I can only recommend Lucene, it is vastly superior to any pre-packaged
search engine, because you do not depend on specific features or
behavior, but can customize everything to your needs.
Ulrich
Re: [OT] Realizing a search functionality
Posted by Louise Pryor <li...@louisepryor.com>.
On Friday, September 5, 2003 at 1:20:00 PM, John Turner wrote:
<snip>
JT> The other tool I've used in the past to
JT> great success is Atomz (http://www.atomz.com). The "trial" is
JT> never-ending, so an index of up to 500 "pages" is free. Pages also =
JT> URL. The nice thing about Atomz is that it will spider your site and
JT> index the content returned, thus it works quite well for dynamic sites.
JT> In other words, it will take a URL like
JT> "http://your.domain.com/content.jsp?id=512&view=full" and index the
JT> content returned from that, not the actual text string of the URL.
<snip>
I use atomz, because it's free. There are a couple of issues with it:
- the template for the search results is pretty hard to get right.
- because of the spidering, session tracking through the URL is not a
good idea. It gets up to the limit of 500 *very* quickly, as the
session id part of the URL makes it think that it's a whole new page.
Luckily my web site isn't really dependent on sessions, so I was able
to get round that (but it does mean that I can't use the struts
rewriting tags...).
Otherwise I'm very happy with atomz.
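For anyone writing their own spider instead, the session-id problem described above can be avoided by normalising each URL before using it as the key for a page. The sketch below assumes the ";jsessionid=..." path-parameter form that servlet containers such as Tomcat append when URL rewriting is used; the class name is made up for the example.

```java
// Sketch only: normalise URLs so that session ids do not create "new" pages.
class SessionIdStripper {
    // Removes a ";jsessionid=..." path parameter, keeping any query string or fragment.
    static String strip(String url) {
        return url.replaceFirst("(?i);jsessionid=[^?#;]*", "");
    }
}
```

With this in place, two visits to the same page under different sessions map to the same URL and count as one indexed page.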
--
Louise Pryor
http://www.louisepryor.com
Re: [OT] Realizing a search functionality
Posted by John Turner <to...@johnturner.com>.
AFAIK, Lucene indexes files. How then, do you index a dynamic site?
The only files that exist on a dynamic site are source code files.
Servlets would never be indexed...how then do you index the content
returned from the servlet? Can Lucene do this?
The Lucene site is pretty sparse in information. Not having worked with
it, and not knowing every option available when using it, I think there
might be some other alternatives. I've used Verity in the past, but
that is a commercial product. The other tool I've used in the past to
great success is Atomz (http://www.atomz.com). The "trial" is
never-ending, so an index of up to 500 "pages" is free. Pages also =
URL. The nice thing about Atomz is that it will spider your site and
index the content returned, thus it works quite well for dynamic sites.
In other words, it will take a URL like
"http://your.domain.com/content.jsp?id=512&view=full" and index the
content returned from that, not the actual text string of the URL.
The only requirement is that you display the Atomz logo on the search
results page. You can pay a small annual fee to have that removed. All
indexes and collections are kept on the Atomz site, not yours, and you
can define the stylesheet and template that is used to display the
search results, as well as define the frequency of indexing.
John
RE: [OT] Realizing a search functionality
Posted by Schalk <sc...@volume4.co.za>.
Marco
You may want to have a look at Lucene (open-source Jakarta project) at:
http://jakarta.apache.org/lucene/docs/index.html
Kind Regards
Schalk Neethling
Volume4.Development.Multimedia.Branding
emotionalize.conceptualize.visualize.realize
Tel: +27125468436
Fax: +27125468436
email:schalk@volume4.co.za
web: www.volume4.co.za