Posted to dev@nutch.apache.org by Shanthoosh PV <sh...@flipkart.com> on 2010/09/01 10:14:21 UTC

crawling webpage results

Hi,

I want to crawl the results returned by a user-defined keyword search in a
search engine. Is this possible in Nutch? Please share any useful insights;
I searched this forum and Google but found nothing helpful.

The user would provide a search engine, such as google.com, along with a
keyword to search for in that search engine. The results of that search
should then be crawled. Is it possible to do this in Nutch by providing just
the search engine URL and the keyword?



Shanthoosh

Re: crawling webpage results

Posted by Alex McLintock <al...@gmail.com>.

This is really a question for the user list, not the dev list. But
what the heck.

The first thing which comes to mind is to do the search yourself and
provide the results of that search as seed pages.
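
As a rough illustration of the seed-page idea, here is a minimal Python
sketch. It assumes you have already collected the result URLs yourself (by
hand or by any other means) and only writes them out as a seed list, one URL
per line. The function name write_seed_file and the default "urls" directory
and "seed.txt" file name are hypothetical choices for this example, not
anything Nutch prescribes.

```python
from pathlib import Path
from urllib.parse import urlparse


def write_seed_file(result_urls, seed_dir="urls", filename="seed.txt"):
    """Write de-duplicated http(s) URLs, one per line, to seed_dir/filename.

    result_urls is assumed to be an iterable of strings that you gathered
    from your own search; this function does not query any search engine.
    Returns the list of URLs that were actually written.
    """
    seen = set()
    kept = []
    for raw in result_urls:
        url = raw.strip()
        parsed = urlparse(url)
        # Skip anything that is not an absolute http or https URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        # Skip exact duplicates while preserving the original order.
        if url in seen:
            continue
        seen.add(url)
        kept.append(url)

    out_dir = Path(seed_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    out_path.write_text("".join(u + "\n" for u in kept), encoding="utf-8")
    return kept


# Example usage (the URLs are placeholders):
# write_seed_file(["http://example.com/page1", "https://example.org/"])
```

The resulting directory can then be handed to Nutch as the seed URL
directory, for example when injecting URLs into the crawl database.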

But since you asked on the dev mailing list, you could possibly write
something which actually queried Google itself through its API - but
Nutch doesn't do that itself. If you do write it then consider
submitting it as a patch.

Good luck

Alex
