Posted to user@nutch.apache.org by "Mr. Udatny" <ru...@rosa.com> on 2005/08/24 12:13:11 UTC

different RegexUrlFilter configurations possible?

hi there,

How can I provide several different regex definitions for RegexUrlFilter
within the same installation?

I saw that I cannot pass any additional (dynamic) configuration
parameters to the plugin.

The idea is to have several independent webdbs (let's say one for every
"RootUrl"), with a different set of regex instructions applying to each
webdb instance.

How can this be accomplished?

Is this mail targeted correctly to the nutch-user mailing list?

greetings,

ud


Re: different RegexUrlFilter configurations possible?

Posted by "Mr. Udatny" <ru...@rosa.com>.
Thanks for the input. I'll try that out. But then I would still need to
create as many Nutch configuration files and URL regex filter files as I
have webdbs (or regex configs), right?

I'd like to pass the regex definitions as a parameter, for example to the
UpdateDatabaseTool, since the regex filter definitions are maintained/managed
"somewhere else". Any ideas how that could be accomplished? Or, if I
were to implement it myself, what is the suggested (proper) way to do it?

kind regards,

ud
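
If the regex definitions are managed elsewhere, one option is to generate the per-webdb filter files from that source instead of maintaining them by hand. The following is only a minimal sketch: it assumes the filter file uses the usual `+regex` / `-regex` line format and is called regex-urlfilter.txt. The file name, the `rules` structure and the helper name `write_filter_file` are illustrative assumptions, not part of Nutch.

```python
import os
import re
import tempfile


def write_filter_file(conf_dir, rules, filename="regex-urlfilter.txt"):
    """Write one regex filter file into conf_dir and return its path.

    rules is a list of (include, pattern) pairs: include=True becomes a
    '+pattern' line, include=False a '-pattern' line. Order is preserved.
    """
    os.makedirs(conf_dir, exist_ok=True)
    lines = []
    for include, pattern in rules:
        # Rough sanity check only: Python and Java regex dialects differ.
        re.compile(pattern)
        lines.append(("+" if include else "-") + pattern)
    path = os.path.join(conf_dir, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    # Hypothetical rule set for one webdb, as it might come from an external store.
    rules = [(True, r"^http://([a-z0-9]*\.)*example\.org/"), (False, r".")]
    with tempfile.TemporaryDirectory() as root:
        path = write_filter_file(os.path.join(root, "site-a"), rules)
        with open(path, encoding="utf-8") as handle:
            print(handle.read())
```

Each webdb would get its own directory, and a script like this could be rerun whenever the externally managed definitions change.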



Piotr Kosiorowski wrote:

>You can use different NUTCH_CONF_DIR environment variables for each
>instance and keep configuration in separate directories. I have not
>done it myself - but I think it should work.
>Regards
>Piotr

Re: different RegexUrlFilter configurations possible?

Posted by Piotr Kosiorowski <pk...@gmail.com>.
You can set NUTCH_CONF_DIR to a different value for each instance and
keep each instance's configuration in its own directory. I have not
done it myself - but I think it should work.
Regards
Piotr
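
The suggestion above could be sketched roughly as follows, assuming the command being launched reads NUTCH_CONF_DIR from its environment. The helper names, the directory layout (one subdirectory per instance under `conf_root`) and the instance names are hypothetical.

```python
import os
import subprocess


def env_for_instance(conf_root, instance, base_env=None):
    """Return a copy of the environment with NUTCH_CONF_DIR set for one instance."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed layout: <conf_root>/<instance>/ holds that instance's config files.
    env["NUTCH_CONF_DIR"] = os.path.join(conf_root, instance)
    return env


def run_for_instance(command, conf_root, instance):
    """Run command (a list of arguments) with one instance's configuration."""
    return subprocess.run(
        command, env=env_for_instance(conf_root, instance), check=True
    )
```

Here `command` would be whatever Nutch invocation is already in use for a single webdb; only the environment changes from one instance to the next, so the installation itself stays shared.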
