Posted to dev@nutch.apache.org by Bartosz Gadzimski <ba...@o2.pl> on 2009/02/24 14:28:37 UTC

NutchAnalysis.java STOP_WORDS not configurable?

Hello,

I know you are all busy getting Nutch to 1.0, but I found that the query 
stop words are compiled into NutchAnalysis.java.

Can we set them outside the code, or do we have to recompile whenever we 
need to change them?

public class NutchAnalysis implements NutchAnalysisConstants {

  private static final String[] STOP_WORDS = ......


Thanks,
Bartosz

Re: NutchAnalysis.java STOP_WORDS not configurable?

Posted by Otis Gospodnetic <og...@yahoo.com>.
I believe Lucene has (in contrib/analyzers) a class called WordLoader or something like that.  Perhaps you can use that to load stopwords from a file (like Solr does) and submit that as a patch?

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
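
To make the "load stopwords from a file" idea concrete, here is a minimal sketch using only the Java standard library. It assumes a UTF-8 text file with one stop word per line, where blank lines and lines starting with '#' are ignored. The class name StopWordLoader and its load method are hypothetical, not part of Nutch or Lucene, and the sketch uses java.nio.file, which is newer than the JDKs Nutch targeted in 2009. Wiring the result into NutchAnalysis would still need a patch, since STOP_WORDS there is a private static final array.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not an existing Nutch or Lucene class.
final class StopWordLoader {

  private StopWordLoader() {}

  /**
   * Reads stop words from a UTF-8 text file, one word per line.
   * Words are trimmed and lower-cased; blank lines and lines
   * starting with '#' are skipped (an assumed file format).
   */
  static Set<String> load(Path file) throws IOException {
    Set<String> words = new LinkedHashSet<>();
    try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null) {
        String word = line.trim().toLowerCase(Locale.ROOT);
        if (word.isEmpty() || word.startsWith("#")) {
          continue;
        }
        words.add(word);
      }
    }
    return Collections.unmodifiableSet(words);
  }

  public static void main(String[] args) throws IOException {
    // Write a small sample file, then load it back.
    Path file = Files.createTempFile("stopwords", ".txt");
    Files.write(file, Arrays.asList("# sample list", "the", "", "And ", "the"),
        StandardCharsets.UTF_8);
    System.out.println(load(file));
  }
}
```

Returning a Set rather than a String[] removes duplicates and gives constant-time lookups; code that needs an array can call toArray(new String[0]) on the result.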


