Posted to user@nutch.apache.org by David Podunavac <da...@wyona.com> on 2006/09/04 15:43:53 UTC

several url to search for [multiple url]

Hi all,

I found out how to restrict a search to one particular URL, even if
crawl-urlfilter.txt lists more than one. Simply add one of the following
at the end of the query:

- <searchTerm> url:file:/<somePathForLocalSearchPurposes>
- url:someUrl.com
- site:www.someUrl.com

All of these return results, and Nutch then searches only that URL.

What is weird is that the following do not return any results:
- url:http://www.sameUrl.com
- site:www.sameUrls.com

The problem is that I cannot add multiple URLs to search. Is there a
trick for doing that?

Any hints appreciated.

david


Re: several url to search for [multiple url]

Posted by og...@yahoo.com.
I'm not sure if "site" is a valid field, but if it is, then try:

  site:siteOneHere site:siteTwoHere .....
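
If you build the query string in code, a minimal sketch of the idea could
look like the one below. It only assembles the text of the query. The
function name build_query, the search term "lucene" and the second host
www.otherUrl.com are made up for illustration, and whether Nutch treats
several site: clauses as alternatives or requires all of them to match is
an assumption that still has to be checked against your index.

```python
def build_query(search_term, sites):
    """Append one site: clause per host to the search term.

    Hypothetical helper: it only builds the query string and does not
    talk to Nutch in any way.
    """
    clauses = [f"site:{host}" for host in sites]
    # Join with single spaces; strip() covers an empty search term.
    return " ".join([search_term, *clauses]).strip()


if __name__ == "__main__":
    print(build_query("lucene", ["www.someUrl.com", "www.otherUrl.com"]))
```

If the combined query comes back empty while each site: clause works on
its own, that would suggest the clauses are being ANDed together.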

Otis
P.S.
No need to mail nutch-dev with this, just use nutch-user.

----- Original Message ----
From: David Podunavac <da...@wyona.com>
To: nutch-user@lucene.apache.org; nutch-dev@lucene.apache.org
Sent: Monday, September 4, 2006 9:43:53 AM
Subject: several url to search for [multiple url]
