Posted to dev@nutch.apache.org by Tobias Zahn <To...@arcor.de> on 2007/02/06 20:31:06 UTC

api.RegexURLFilterBase - Configuration Resources

Hello!
I have written a new plugin extending the IndexingFilter and using the
RegexURLFilterBase class.
In the log there is this message:

FATAL api.RegexURLFilterBase - Can't find resource: null

I don't know how to handle those Configuration objects (setConf() etc.).
What should I do to avoid that error? Where does the
Configuration object come from?

TIA
Tobias Zahn

Re: api.RegexURLFilterBase - Configuration Resources

Posted by Tobias Zahn <To...@arcor.de>.
Again, thank you for your help.
In the end, the configuration for my plugin was slightly wrong, but now it
seems to work. However, Nutch no longer prints any output on the command
line, so I can't check whether everything is actually correct (readdb -stats).

I don't know why that is - I haven't changed anything.
It would be great if someone had an idea what to do now!

My Nutch version is 0.8.


Best regards,
Tobias Zahn

Re: api.RegexURLFilterBase - Configuration Resources

Posted by Renaud Richardet <re...@oslutions.com>.
Tobias Zahn wrote:
> Hello!
> I have written a new plugin extending the IndexingFilter and using the
> RegexURLFilterBase class.
> In the log there is this message:
>
> FATAL api.RegexURLFilterBase - Can't find resource: null
>   
In your new class CustomIndexingFilter, create a field Configuration conf
and implement setConf and getConf like this:

// Field holding the Configuration passed in through setConf()
private Configuration conf;

public void setConf(Configuration conf) {
  this.conf = conf;
}

public Configuration getConf() {
  return this.conf;
}

Then pass the conf object to the RegexURLFilterBase instance before calling it:

RegexURLFilterBase r = new RegexURLFilter();
r.setConf(conf);
r.filter("sometext");

This should do the trick.
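
To make the idea concrete, here is a minimal, self-contained sketch of the same pattern. It does not use the Nutch classes: Conf, Configurable, PrefixUrlFilter and the key "sketch.url.prefix" are all made-up stand-ins, so treat it as an illustration of how the conf object is handed from the plugin to the filter it uses, not as the Nutch API.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfPassingSketch {

    /** Hypothetical stand-in for a configuration object: a plain key/value store. */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Hypothetical stand-in for the setConf/getConf contract. */
    interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    /** Toy URL filter that reads its single rule from the configuration. */
    static class PrefixUrlFilter implements Configurable {
        private Conf conf;
        private String prefix;

        public void setConf(Conf conf) {
            this.conf = conf;
            // Made-up key; a missing value is reported right away.
            this.prefix = conf.get("sketch.url.prefix");
            if (prefix == null) {
                throw new IllegalStateException("Can't find setting: sketch.url.prefix");
            }
        }

        public Conf getConf() { return conf; }

        /** Returns the URL if it is accepted, or null if it is rejected. */
        String filter(String url) {
            if (prefix == null) {
                throw new IllegalStateException("setConf() was never called");
            }
            return url.startsWith(prefix) ? url : null;
        }
    }

    /** Toy plugin: stores the conf it receives and hands it on to its filter. */
    static class CustomIndexingFilter implements Configurable {
        private Conf conf;
        private PrefixUrlFilter urlFilter;

        public void setConf(Conf conf) {
            this.conf = conf;
            urlFilter = new PrefixUrlFilter();
            urlFilter.setConf(conf); // pass the conf on before the filter is used
        }

        public Conf getConf() { return conf; }

        String accept(String url) { return urlFilter.filter(url); }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("sketch.url.prefix", "http://");

        CustomIndexingFilter plugin = new CustomIndexingFilter();
        plugin.setConf(conf);

        System.out.println(plugin.accept("http://example.org/")); // prints http://example.org/
        System.out.println(plugin.accept("ftp://example.org/"));  // prints null
    }
}
```

In this sketch, a conf that lacks the expected key fails immediately inside setConf, so checking that the setting is really present in the configuration is a sensible first step when a "Can't find ..." message shows up.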

I assume you have set up the build configuration of your plugin
correctly; that was tricky for me ;-)

build.xml

<!-- Build compilation dependencies -->
<target name="deps-jar">
  ......
  <ant target="jar" inheritall="false" dir="../urlfilter-regex"/>
  <ant target="jar" inheritall="false" dir="../lib-regex-filter"/>
</target>

<!-- Add compilation dependencies to classpath -->
<path id="plugin.deps">
  <fileset dir="${nutch.root}/build">
    .......
    <include name="**/urlfilter-regex/*.jar" />
    <include name="**/lib-regex-filter/*.jar" />
  </fileset>
</path>

and plugin.xml

<requires>
  <import plugin="nutch-extensionpoints"/>
  ......
  <import plugin="urlfilter-regex"/>
  <import plugin="lib-regex-filter"/>
</requires>

HTH,
Renaud



> I don't know how to handle that Configuration-Objects (setConf() etc.)
> What should I do to avoid that error? Where does the
> Configuration-Object come from?
>
> TIA
> Tobias Zahn
>
>   


-- 
Renaud Richardet                                      +1 617 230 9112
my email is my first name at apache.org      http://www.oslutions.com