Posted to java-user@lucene.apache.org by spamsucks <sp...@rhoderunner.com> on 2007/02/01 22:10:43 UTC

Looking for crawler recommendations.

Has anyone successfully integrated a crawler with Lucene?  I 
cannot use Nutch, since 60% of our searchable content is stored in a 
database.  I need a hybrid of database indexing and website 
crawling.  The crawl would cover only one domain, restricted to a given 
set of directories.

I found this list of crawlers, but nothing on it quite fits my needs. 
A couple of the libraries might work, but one problem is that they use a 
GNU license.
http://www.manageability.org/blog/stuff/open-source-web-crawlers-java/view

Thanks.


