Posted to user@nutch.apache.org by Victor Lee <vi...@yahoo.com> on 2005/10/30 05:20:06 UTC

How to Use Nutch to Crawl Database?

Hi,
  How do I use Nutch to crawl an internal database
instead of a web server?  Does Nutch even support this
option?

Many thanks.  


		

Re: How to Use Nutch to Crawl Database?

Posted by sub paul <su...@gmail.com>.
Hi Victor,

You probably want to use just Lucene to index your database.

http://www.dbsight.net/ (a commercial product) seems to do that very well.

You can also take a look at the Compass framework,
http://compassframework.org/display/SITE/Home
It seems very promising too, but I have not tried it.

They both use Lucene to index the database.
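
To illustrate what "index the database directly" means, here is a minimal sketch in Python. It does not use Lucene or either product above; it builds a toy in-memory inverted index over rows read from SQLite, only to show the shape of the approach (read rows, tokenize the text columns, map terms to row ids). The table name `articles`, its columns, the sample rows, and the helper names are all hypothetical.

```python
import re
import sqlite3
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(conn):
    """Map each term to the set of row ids whose title or body contains it."""
    index = defaultdict(set)
    for row_id, title, body in conn.execute("SELECT id, title, body FROM articles"):
        for term in tokenize(f"{title} {body}"):
            index[term].add(row_id)
    return index


def search(index, query):
    """Return the sorted ids of rows that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)


# Hypothetical sample data in an in-memory database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles (id, title, body) VALUES (?, ?, ?)",
    [
        (1, "Crawling basics", "Nutch fetches pages from web servers."),
        (2, "Indexing rows", "Lucene can index text pulled from a database."),
        (3, "Search tips", "Query the index instead of the database."),
    ],
)

index = build_index(conn)
```

With this sample data, `search(index, "lucene database")` looks up both terms and returns only the ids of rows containing both. A real setup would presumably hand each row to Lucene as a document instead of keeping a dictionary in memory.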

Nutch also uses Lucene for indexing. However, if your database
has a web interface, Nutch might be able to crawl it just fine.
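
As a rough idea of what such a web interface could look like, here is a minimal sketch in Python using only the standard library. It is an assumption about one possible setup, not something Nutch provides: a listing page links to one HTML page per row, so any link-following crawler can discover the records. The `records` table, its columns, the URL layout (`/` and `/record/<id>`), and all function names are hypothetical.

```python
import html
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_db():
    """Create a hypothetical in-memory table standing in for the internal database."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO records (id, title, body) VALUES (?, ?, ?)",
        [
            (1, "Customers", "Names and addresses."),
            (2, "Q&A", "Support questions & answers."),
        ],
    )
    return conn


def render_listing(conn):
    """Return an HTML page linking to every record so a crawler can find them."""
    links = "".join(
        f'<li><a href="/record/{row_id}">{html.escape(title)}</a></li>'
        for row_id, title in conn.execute("SELECT id, title FROM records ORDER BY id")
    )
    return f"<html><body><ul>{links}</ul></body></html>"


def render_record(conn, row_id):
    """Return an HTML page for one record, or None if the id does not exist."""
    row = conn.execute(
        "SELECT title, body FROM records WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        return None
    title, body = (html.escape(value) for value in row)
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><p>{body}</p></body></html>"
    )


def make_handler(conn):
    """Build a request handler class bound to the given database connection."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            page = None
            if self.path == "/":
                page = render_listing(conn)
            elif self.path.startswith("/record/"):
                try:
                    page = render_record(conn, int(self.path[len("/record/"):]))
                except ValueError:
                    page = None  # non-numeric id: fall through to 404
            if page is None:
                self.send_error(404)
                return
            data = page.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler


def serve(conn, port=8000):
    """Serve the pages on localhost until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), make_handler(conn)).serve_forever()


conn = make_db()
```

Calling `serve(conn)` would start the server on port 8000; the listing URL would then presumably be the seed you give the crawler.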

HTH,
Paul





Re: How to Use Nutch to Crawl Database?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Please don't cross-post unless there is a good reason.
I think you should ask on the Nutch list (I see you already did).

Here are a few good starting points:
  http://www.google.com/search?q=lucene%20database

Otis




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org