Posted to java-user@lucene.apache.org by Peter Carlson <ca...@bookandhammer.com> on 2002/10/30 20:12:24 UTC

Lucene Site Updated

The Lucene Website has been updated with some new content.

The Lucene File Format document, originally just an attachment from Doug
Cutting, was converted to xml/html by Otis Gospodnetic. It is available
in the left-hand nav area.

Also, the LARM project now has html documentation. Originally written by
Clemens Marschner in PDF and .doc format, it was converted to xml/html
by Otis Gospodnetic and is available under lucene-sandbox/larm in the
left-hand nav area.

Please let us know if there are any issues with the site.

Thanks

--Peter


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Lucene Site Updated

Posted by Clemens Marschner <cm...@lanlab.de>.
Thanks, Peter, for the update.

For those of you who have read the doc on the LARM crawler, I added some new
sections that reflect the changes in the CVS version of the crawler:

- Command line options were extended so that a list of URLs can now be
passed, not just one.
- URL normalization was added. See the new section on that.
- The HostResolver was separated from the HostManager (to make it
more reusable) and can now be configured from the command line.

The crawler is still, in fact, a framework with a "reference implementation"
in the class FetcherMain. Many options can be changed only by editing this
main class. It also contains an example of how Lucene can be used as storage
for the crawler.

--Clemens


Re: Lucene Site Updated

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Clemens did all the work; I just applied some patches and put stuff in
CVS.

Otis
