Posted to java-user@lucene.apache.org by Christian Ubbesen <cu...@octacom.net> on 2002/05/31 23:00:20 UTC

Standalone Lucene server

I'm thinking of using Lucene as a general-purpose tool in my toolbox,
and therefore want to use it in environments that are not Java-only.

For instance, I would like to use its search capabilities in one of my
clients' NT4/IIS/ASP environments.

Since Lucene is essentially a Java library today, I'm wondering if
anyone has wrapped it up as a standalone search engine with some neat
interface (keywords: TCP, HTTP, XML-RPC, SOAP, whatever really...)?

Otherwise I suppose it could be an idea to create this sort of
container.


Christian



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Standalone Lucene server

Posted by Erik Hatcher <li...@ehatchersolutions.com>.
Well, I would certainly encourage you to get the book if you're doing
anything at all with Ant.  But the book itself won't really go into Lucene
details.  I've attached two screen shots from the book to this message (will
they be stripped by the mail server?) - the "canoo" extension is because we
did automated functional testing of our web app with Canoo WebTest. You can
see the Lucene logo and a sophisticated Lucene expression being used in one
screenshot, and the results of that search in the other.

The code will be available for download from Manning's site when the book is
published, but it's really nothing elaborate.  I've contributed the Ant task
to do indexing to the lucene-dev crew, and when I finally get some breathing
room I'll commit it to the Lucene sandbox CVS repository (but it's in the
mail archives at least).

As for the code wrapping it into a web service and EJB, there's not much to
it really (laughably simple, actually): we wrote a very simple wrapper that
did the querying and returned results, so that Lucene's API was never even
seen by the tools that searched.  That same wrapper was easily reused no
matter what front-end we wanted to put on it.  I can probably contribute that
code to the sandbox as well.
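
A wrapper of the kind described above might be shaped roughly as follows.
This is only a sketch, not the code from the book: the names SearchHit,
SearchService and InMemorySearchService are hypothetical, and the stand-in
implementation matches title substrings in memory where a real one would
query a Lucene index. The point is that front-ends depend only on the
interface and never see a Lucene type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** One search hit, described without any Lucene types (hypothetical shape). */
final class SearchHit {
    final String title;
    final String path;
    final float score;

    SearchHit(String title, String path, float score) {
        this.title = title;
        this.path = path;
        this.score = score;
    }
}

/** The only contract a front-end sees; name and signature are assumptions. */
interface SearchService {
    List<SearchHit> search(String queryExpression, int maxHits);
}

/**
 * Stand-in implementation that matches on title substrings in memory.
 * A real implementation would run the query against a Lucene index here.
 */
final class InMemorySearchService implements SearchService {
    private final List<SearchHit> documents;

    InMemorySearchService(List<SearchHit> documents) {
        this.documents = new ArrayList<>(documents);
    }

    @Override
    public List<SearchHit> search(String queryExpression, int maxHits) {
        String needle = queryExpression.toLowerCase(Locale.ROOT);
        List<SearchHit> hits = new ArrayList<>();
        for (SearchHit doc : documents) {
            if (hits.size() >= maxHits) {
                break; // stop once the caller's limit is reached
            }
            if (doc.title.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}

/** Command-line front-end; a webapp, EJB or web service would reuse SearchService. */
final class SearchFacadeDemo {
    public static void main(String[] args) {
        List<SearchHit> docs = new ArrayList<>();
        docs.add(new SearchHit("Ant task reference", "docs/ant.html", 1.0f));
        docs.add(new SearchHit("Struts notes", "docs/struts.html", 0.5f));
        SearchService service = new InMemorySearchService(docs);
        for (SearchHit hit : service.search("ant", 10)) {
            System.out.println(hit.score + "\t" + hit.path + "\t" + hit.title);
        }
    }
}
```

Swapping the in-memory class for a Lucene-backed one would leave every
front-end untouched, which is presumably why the same wrapper could sit
behind several of them.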

    Erik


----- Original Message -----
From: "Clemens Marschner" <cm...@lanlab.de>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Saturday, June 01, 2002 3:04 PM
Subject: Re: Standalone Lucene server


> Hm, so I suppose we need to buy the book to see it running... (and get the
> code)? ;-)
>
> Regards,
>
> Clemens

Re: Standalone Lucene server

Posted by Clemens Marschner <cm...@lanlab.de>.
Hm, so I suppose we need to buy the book to see it running... (and get the
code)? ;-)

Regards,

Clemens

----- Original Message -----
From: "Erik Hatcher" <li...@ehatchersolutions.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Saturday, June 01, 2002 1:22 AM
Subject: Re: Standalone Lucene server


> The application we built for our book (Java Development with Ant -
> http://www.manning.com/antbook/) uses Lucene to build an index from an Ant
> build (think static documentation here) and then was incorporated in a few
> different environments:
>
>     - command-line query tool
>     - Ant query task
>     - webapp - with simple text box query and results display (using
>       Struts)
>     - Stateless Session Bean - deployed into JBoss, and EJB client could
>       query index
>     - Web Service - using Tomcat and Axis - can query the index from a web
>       service client
>





Re: Standalone Lucene server

Posted by Erik Hatcher <li...@ehatchersolutions.com>.
The application we built for our book (Java Development with Ant -
http://www.manning.com/antbook/) uses Lucene to build an index from an Ant
build (think static documentation here) and then was incorporated in a few
different environments:

    - command-line query tool
    - Ant query task
    - webapp - with simple text box query and results display (using Struts)
    - Stateless Session Bean - deployed into JBoss, and EJB client could
      query index
    - Web Service - using Tomcat and Axis - can query the index from a web
      service client

It was all built as proof-of-concept, so certainly has not been exposed to
any heavy loads but it all worked nicely.

But on a related note, if you've got NT4/IIS/ASP, why not use MS Index
Server?  Yeah, I know that's probably a curse word in this forum, but use the
tools you have handy.  As a matter of fact, at my day job we ran into just
this situation: we needed Word documents indexed, and POI isn't really up to
the job just yet.  (Our requirements are to run WebSphere behind IIS, so
Windows 2000 was already in the picture.)  So I built a very simple ASP page
that queried Index Server and returned XML.  A few lines of Java code using
Commons Digester and I was in business.  (If there is a Java way to query
Index Server, I'd love to hear about it!)
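
To illustrate the Java side of that arrangement, here is a minimal sketch
that turns an XML result document into plain objects. It uses the JDK's
built-in DOM parser rather than Commons Digester, so that it runs without
extra libraries. The XML layout (a <hits> element containing <hit path="...">
elements) and the class names are assumptions for illustration; the actual
format returned by the ASP page is not given in this thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** One hit from the (hypothetical) XML result format. */
final class IndexServerHit {
    final String path;
    final String title;

    IndexServerHit(String path, String title) {
        this.path = path;
        this.title = title;
    }
}

final class IndexServerResultParser {
    /** Parses <hits><hit path="...">title</hit>...</hits> into a list of hits. */
    static List<IndexServerHit> parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("hit");
            List<IndexServerHit> hits = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element hit = (Element) nodes.item(i);
                hits.add(new IndexServerHit(
                        hit.getAttribute("path"),
                        hit.getTextContent().trim()));
            }
            return hits;
        } catch (Exception e) {
            // Malformed XML or parser setup failure: report it as bad input.
            throw new IllegalArgumentException("could not parse result XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<hits><hit path=\"/docs/a.doc\">Report A</hit></hits>";
        for (IndexServerHit hit : parse(xml)) {
            System.out.println(hit.path + "\t" + hit.title);
        }
    }
}
```

In practice the XML string would come from an HTTP request to the ASP page;
that step is left out here to keep the sketch self-contained.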

    Erik




Re: Standalone Lucene server

Posted by Richard Taylor <ri...@newscientist.com>.
Hi,

We're using Lucene in exactly this way (even though we have Java
throughout).

I have wrapped up Lucene as a web service and send search requests as XML
strings that are unmarshalled on the server (using JAXB).  These are then
used to build a query and execute it, returning the number of results (the
search is stored for the web service client under an ID).  Then the client
requests a subset of the results (10 at a time in our case), and these are
returned as an XML string.
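
The store-by-ID and paging part of that flow could look something like the
sketch below. It is not the code we run: PagedSearchStore and its method
names are made up for illustration, results are plain strings rather than
Lucene hits, and a production version would also need to expire old
searches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Keeps finished result lists on the server, keyed by a search ID. */
final class PagedSearchStore {
    private final Map<Long, List<String>> resultsById = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** Stores a result list and returns the ID handed back to the client. */
    long store(List<String> results) {
        long id = nextId.getAndIncrement();
        resultsById.put(id, Collections.unmodifiableList(new ArrayList<>(results)));
        return id;
    }

    /** Number of results for a stored search (the first thing the client gets). */
    int count(long id) {
        return lookup(id).size();
    }

    /** Returns up to pageSize results starting at offset; empty past the end. */
    List<String> page(long id, int offset, int pageSize) {
        if (offset < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize > 0");
        }
        List<String> all = lookup(id);
        if (offset >= all.size()) {
            return Collections.emptyList();
        }
        // Compute the end index in long arithmetic to avoid int overflow.
        int end = (int) Math.min((long) all.size(), (long) offset + pageSize);
        return new ArrayList<>(all.subList(offset, end));
    }

    /** Drops a stored search once the client is done with it. */
    void release(long id) {
        resultsById.remove(id);
    }

    private List<String> lookup(long id) {
        List<String> results = resultsById.get(id);
        if (results == null) {
            throw new IllegalArgumentException("unknown search id " + id);
        }
        return results;
    }
}
```

A client would call store() once per query, read count(), and then request
page(id, 0, 10), page(id, 10, 10) and so on.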

We initially had it set up with Axis (trying to keep to the JAX-RPC
standards), but there were too many bugs and performance issues for our
production environment.  We finally launched with GLUE
(www.themindelectric.com), and I was amazed by its stability and ease of
use.  To be fair, though, it was very easy to switch between the two, and I'll
keep a watch on Axis development.

This allows your web service clients to be any SOAP client.

Richard Taylor
New Scientist Developer



RE: Standalone Lucene server

Posted by Ian Forsyth <ia...@plusfour.org>.
I am working on this for a PHP environment.  Essentially, I am working to make
the Lucene API accessible from the command line, with results written to
stdout as XML.

I am working on an extensible class to index a MySQL database.

The command lines would look something like this:
java -cp /home/username/lucene/lucene.jar org.apache.lucene.IndexMySQL -h
[host] -u [username] -p [pass] -d [database] -T [tables] -F [fields]

java -cp /home/username/lucene/lucene.jar org.apache.lucene.SearchMySQL -h
[host] -u [username] -p [pass] -d [database] -T [tables] -F [fields] -q
[query] -s [start] -stop [stop] -resultkeys [the fields to be used in the
results]

The result of the above is an XML string/stream that is easy to parse in
order to show a results page.  The XML stream would look like this:

<?xml version="1.0"?>
<result table="band">
<field name="bandid" type="int">30</field>
<field name="bandname" type="varchar">Storm &amp; Stress</field>
<field name="desc" type="varchar">Like water in the city subway going into the ocean quickly</field>
</result>

Then I can build a link in PHP: <a href="/index.php?page=band&id=30">Storm
&amp; Stress</a>

I am far from having this working, but from what I have so far this seems to
be the most extensible approach.  In terms of indexing and searching outside
of a Java application server environment, it's just a matter of writing your
results to stdout in some predictable format.  If you build the search class
to output XML, then chances are you'll be able to parse a proper result set
easily in any language you deal with.
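
As a rough sketch of the output side only, the class below writes one result
in the format shown above, escaping characters such as & so the stream stays
well-formed. XmlResultWriter is a made-up name, the type attribute is left
out for brevity, and the field values are hard-coded where the real tool
would take them from a search hit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class XmlResultWriter {
    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders one result as a <result> element with one <field> per entry. */
    static String toXml(String table, Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n");
        xml.append("<result table=\"").append(escape(table)).append("\">\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(field.getKey())).append("\">")
               .append(escape(field.getValue()))
               .append("</field>\n");
        }
        xml.append("</result>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("bandid", "30");
        fields.put("bandname", "Storm & Stress");
        System.out.print(toXml("band", fields));
    }
}
```

Escaping in one place matters here because the PHP side will hand the stream
straight to an XML parser, and a single bare & would make the whole result
unparsable.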

Ian



Re: Standalone Lucene server

Posted by James Cooper <pi...@bitmechanic.com>.
On Fri, 31 May 2002, Christian Ubbesen wrote:

> Since Lucene is essentially a java-library today, I'm wondering if
> anyone have wrapped it up as standalone search engine with some neat
> interface (keywords: TCP, HTTP, XML-RPC, SOAP, whatever really...)? 

Hi,

Yeah, I agree.  Lucene is definitely useful outside of Java
applications.  I'm currently using it as the search engine for a PHP-based
web site.

I'm not doing anything super-smart: I just use exec() to fork a JVM that
runs the search and prints the results to STDOUT, which I then parse in PHP.
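
The Java half of such a setup can be very small. The sketch below is an
assumption about how one might do it, not the code running on my site: it
prints one hit per line as score, id and title separated by tabs, and strips
tabs and line breaks from the values so the calling script can split each
line safely. StdoutResultLines is a made-up name and the hit in main() is a
placeholder.

```java
final class StdoutResultLines {
    /** Replaces tabs and line breaks so a value cannot break the line format. */
    static String clean(String value) {
        return value.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ');
    }

    /** Formats one hit as score, id and title separated by tab characters. */
    static String formatLine(float score, String id, String title) {
        return score + "\t" + clean(id) + "\t" + clean(title);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java StdoutResultLines <query>");
            System.exit(2); // non-zero exit lets the caller detect misuse
        }
        // Placeholder hit: a real tool would run args[0] against the index here.
        System.out.println(formatLine(1.0f, "30", "Storm & Stress"));
    }
}
```

The calling script then only has to split each output line on tabs, and can
check the exit status to tell a failed run from an empty result.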

I could see a standard XML format for search results being
useful.  The only issue is that the format will likely need to vary
depending on the structure of your index.

cheers

-- James

