Posted to java-user@lucene.apache.org by "Zhang, Lisheng" <Li...@broadvision.com> on 2004/09/29 19:38:26 UTC
Free software to crawl internet site?
Hi,
Does anyone know of free software for crawling a website
(a web crawler)? According to the official Lucene FAQ, Lucene
does not currently have this feature.
Thanks very much for any help,
Lisheng
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Free software to crawl internet site?
Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
Nutch has a crawler. So does Egothor (the crawler is called Capek). If
you type "web crawler" in Google you'll get tons of projects.
Dawid
Zhang, Lisheng wrote:
> Hi,
>
> Does anyone know of free software for crawling a website
> (a web crawler)? According to the official Lucene FAQ, Lucene
> does not currently have this feature.
>
> Thanks very much for any help,
>
> Lisheng
Re: Free software to crawl internet site?
Posted by Otis Gospodnetic <ot...@yahoo.com>.
LARM is no longer being maintained, unfortunately. Use Nutch -
http://www.nutch.org/
Otis
--- Bo Gundersen <bg...@atira.dk> wrote:
> Zhang, Lisheng wrote:
>
> > Hi,
> >
> > Does anyone know of free software for crawling a website
> > (a web crawler)? According to the official Lucene FAQ, Lucene
> > does not currently have this feature.
> >
> > Thanks very much for any help,
>
> In the Lucene sandbox there is a pretty advanced crawler called LARM;
> you can check it out at this URL:
> http://jakarta.apache.org/lucene/docs/lucene-sandbox/ (right at the
> bottom).
>
> --
> Bo Gundersen
> DBA/Software Developer
> M.Sc.CS.
> www.atira.dk
Re: Free software to crawl internet site?
Posted by Bo Gundersen <bg...@atira.dk>.
Zhang, Lisheng wrote:
> Hi,
>
> Does anyone know of free software for crawling a website
> (a web crawler)? According to the official Lucene FAQ, Lucene
> does not currently have this feature.
>
> Thanks very much for any help,
In the Lucene sandbox there is a pretty advanced crawler called LARM;
you can check it out at this URL:
http://jakarta.apache.org/lucene/docs/lucene-sandbox/ (right at the bottom).
--
Bo Gundersen
DBA/Software Developer
M.Sc.CS.
www.atira.dk