Posted to dev@nutch.apache.org by Marko Bauhardt <mb...@media-style.com> on 2006/05/17 11:18:36 UTC

Query Boosting

Hi,
I used nutch-0.8, revision 405566.
I have a question about the query-basic plugin. A simple way to test
the parsed query is to run the main method of
org.apache.nutch.searcher.Query.
If I run Query and type the query "foo", I get the following output:

Translated: +(url:foo^0.0 anchor:foo^0.0 content:foo title:foo^0.0 host:foo^0.0)

This output means that the boost is 0.0 for every field except content.
But what about the default boosts from nutch-default.xml? That
configuration is not applied.
If I change the BasicQueryFilter (see the patch below), I get output
with the boost values from nutch-default.xml:

Translated: +(url:foo^4.0 anchor:foo^2.0 content:foo title:foo^1.5 host:foo^2.0)


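I think FIELD_BOOSTS must be re-initialized whenever new boost values
are set. The array is filled when the filter is constructed, before the
boost fields are read from the configuration, so it keeps the initial
0.0 values.
Here is a simple patch to show what I mean: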


Index: src/plugin/query-basic/src/java/org/apache/nutch/searcher/basic/BasicQueryFilter.java
===================================================================
--- src/plugin/query-basic/src/java/org/apache/nutch/searcher/basic/BasicQueryFilter.java       (revision 405566)
+++ src/plugin/query-basic/src/java/org/apache/nutch/searcher/basic/BasicQueryFilter.java       (working copy)
@@ -48,7 +48,7 @@
    private static final String[] FIELDS =
    { "url", "anchor", "content", "title", "host" };
-  private final float[] FIELD_BOOSTS =
+  private float[] FIELD_BOOSTS =
    { URL_BOOST, ANCHOR_BOOST, 1.0f, TITLE_BOOST, HOST_BOOST };
    /**
@@ -178,6 +178,7 @@
      this.TITLE_BOOST = conf.getFloat("query.title.boost", 1.5f);
      this.HOST_BOOST = conf.getFloat("query.host.boost", 2.0f);
      this.PHRASE_BOOST = conf.getFloat("query.phrase.boost", 1.0f);
+    FIELD_BOOSTS = new float[]{ URL_BOOST, ANCHOR_BOOST, 1.0f, TITLE_BOOST, HOST_BOOST };
    }
    public Configuration getConf() {
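
To show the effect outside of Nutch, here is a minimal standalone
sketch of the same initialization-order problem. It is not the Nutch
code: the class names (BoostInitDemo, StaleFilter, FixedFilter) and the
configure() method are made up for illustration, and only two fields
are used. The array initializer copies the float value that the field
holds at construction time, so a later assignment to the field does not
change the array unless the array is rebuilt.

```java
import java.util.Arrays;

public class BoostInitDemo {

    // Mirrors the reported behaviour: the array is built once, at construction.
    static class StaleFilter {
        private float urlBoost; // Java default: 0.0f until configure() runs
        // Copies the current value of urlBoost (0.0f), not a reference to the field.
        private final float[] fieldBoosts = { urlBoost, 1.0f };

        void configure(float url) {
            this.urlBoost = url; // fieldBoosts still holds the old 0.0f
        }

        float[] boosts() {
            return fieldBoosts;
        }
    }

    // Mirrors the idea of the patch: rebuild the array after the boosts are set.
    static class FixedFilter {
        private float urlBoost;
        private float[] fieldBoosts = { urlBoost, 1.0f };

        void configure(float url) {
            this.urlBoost = url;
            this.fieldBoosts = new float[] { urlBoost, 1.0f };
        }

        float[] boosts() {
            return fieldBoosts;
        }
    }

    public static void main(String[] args) {
        StaleFilter stale = new StaleFilter();
        stale.configure(4.0f);
        FixedFilter fixed = new FixedFilter();
        fixed.configure(4.0f);

        System.out.println("stale: " + Arrays.toString(stale.boosts())); // stale: [0.0, 1.0]
        System.out.println("fixed: " + Arrays.toString(fixed.boosts())); // fixed: [4.0, 1.0]
    }
}
```

The second slot stays at 1.0 in both cases because it is a literal, which
matches the content field in the output above.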


Or am I overlooking something?

Marko