
Posted to java-user@lucene.apache.org by escher2k <es...@yahoo.com> on 2007/04/30 21:33:01 UTC

Modifying norms...

I want to modify the norms to only include values between 0 and 100.
Currently, I have a custom implementation of the default similarity. Is it
sufficient to override the encodeNorm and decodeNorm methods from the base
implementation in my custom Similarity class ? Please let me know if there
are any performance implications to this.
-- 
View this message in context: http://www.nabble.com/Modifying-norms...-tf3671499.html#a10259327
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Modifying norms...

Posted by Chris Hostetter <ho...@fucit.org>.

: Essentially what I am trying to do is boost every document by a certain
: factor, so that the boost is between 1.0 and 2.0. After this, we are trying
: to do a search across multiple fields and have a computation based purely
: on tf. Example -

it sounds like you are trying to place too much stock in the precise score
values you get back from a query.  if it's really important to you i
would suggest playing with the boost values you use and your tf/idf
functions so they work with the current boost/norm encoding instead of
trying to change how the norms are encoded.  that way you won't have to
worry about hacking the static encoding funcs in Similarity.


-Hoss



Re: Modifying norms...

Posted by escher2k <es...@yahoo.com>.

Essentially what I am trying to do is boost every document by a certain
factor, so that the boost is between 1.0 and 2.0. After this, we are trying
to do a search across multiple fields and have a computation based purely
on tf. Example -
if (field1)
  tf = some function
else if (field2)
  tf = some other function
...

Now the boost is getting rounded to 1.0, 1.25, 1.5, 1.75 or 2.0 due to the
way the norm is stored, whereas I want more precision (e.g. 1.31, 1.45 etc).
The boost is used for ranking documents.
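
For readers who want to see the rounding described above, here is a minimal standalone sketch. It assumes the norm byte uses the same layout as SmallFloat's byte315 encoding and re-implements that layout with plain bit arithmetic so it runs without Lucene; the class name NormQuantization and its helper methods are made up for this illustration and are not part of the library.

```java
// Standalone sketch: mimics the single-byte norm encoding with plain bit
// arithmetic. Names here are hypothetical, not Lucene API.
class NormQuantization {

    // Offset between the float exponent and the byte's exponent field
    // (assumed to match the "315" variant: zero exponent at 15).
    private static final int ZERO_EXP_OFFSET = (63 - 15) << 3;

    /** Squeezes a float into one byte by dropping the low mantissa bits. */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;
        if (small < ZERO_EXP_OFFSET) {
            // Zero or negative input maps to 0, tiny positives to the smallest step.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_EXP_OFFSET + 0x100) {
            return (byte) -1; // too large: clamp to the biggest representable value
        }
        return (byte) (small - ZERO_EXP_OFFSET);
    }

    /** Expands the byte back into a float. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    /** What a boost looks like after being stored and read back. */
    static float roundTrip(float boost) {
        return decode(encode(boost));
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.31f, 1.45f, 1.5f, 1.8f, 2.0f};
        for (float b : boosts) {
            System.out.println(b + " -> " + roundTrip(b));
        }
    }
}
```

Under these assumptions both 1.31 and 1.45 come back as 1.25, because the encoding truncates to steps of 0.25 between 1.0 and 2.0.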

Thanks.


Chris Hostetter wrote:
> 
> 
> : Thanks Hoss. Suppose, I go ahead and modify Similarity.java from
> 	...
> : Should this work ?
> 
> it depends on your definition of "work" ... if that code is what you want
> it to do, then yes: it will do what you want it to do.
> 
> : P.S. This is a very custom implementation. For the specific problem
> : that I have, the lengthNorm is set to 1 (independent of numTerms).
> 
> if your length norm is always 1, why do you care what the norm values are?
> are you using document and field boosts? ... if "no" then none of this
> should matter.  if "yes" then why not just change the boost values you use
> to get the behavior you want instead of modifying the encoding mechanism?
> 
> 
> 
> 
> -Hoss
> 


Re: Modifying norms...

Posted by Chris Hostetter <ho...@fucit.org>.

: Thanks Hoss. Suppose, I go ahead and modify Similarity.java from
	...
: Should this work ?

it depends on your definition of "work" ... if that code is what you want
it to do, then yes: it will do what you want it to do.

: P.S. This is a very custom implementation. For the specific problem that I
: have, the lengthNorm
: is set to 1 (independent of numTerms).

if your length norm is always 1, why do you care what the norm values are?
are you using document and field boosts? ... if "no" then none of this
should matter.  if "yes" then why not just change the boost values you use
to get the behavior you want instead of modifying the encoding mechanism?




-Hoss



Re: Modifying norms...

Posted by escher2k <es...@yahoo.com>.

Thanks Hoss. Suppose, I go ahead and modify Similarity.java from 
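  static {
    for (int i = 0; i < 256; i++)
      NORM_TABLE[i] = SmallFloat.byte315ToFloat((byte)i);
  }
to
  static {
    // Linear table: byte i decodes to i * 100 / 256, i.e. 0.0 up to ~99.6.
    // The cast covers the whole expression; a double cannot be assigned
    // to a float element without it.
    for (int i = 0; i < 256; i++)
      NORM_TABLE[i] = (float) (i * 100.0 / 256.0);
  }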

Should this work ?
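
A rough sketch of what such a linear table would decode to, for anyone weighing this change. Two things are assumptions here: that replacing NORM_TABLE only changes the decoding side, so the encoding side would need a matching change, and the hypothetical encode method below, which is one possible matching encoder and not anything in Lucene. The class name LinearNormTable is also made up.

```java
// Standalone sketch of a linear 0..100 norm table. Not Lucene code.
class LinearNormTable {

    static final float[] NORM_TABLE = new float[256];

    static {
        // Same table as proposed above: evenly spaced steps of 100/256.
        for (int i = 0; i < 256; i++) {
            NORM_TABLE[i] = (float) (i * 100.0 / 256.0);
        }
    }

    /** Hypothetical encoder matching the linear table: nearest step, clamped. */
    static byte encode(float f) {
        int i = Math.round(f * 256.0f / 100.0f);
        i = Math.max(0, Math.min(255, i));
        return (byte) i;
    }

    /** Table lookup, treating the byte as unsigned. */
    static float decode(byte b) {
        return NORM_TABLE[b & 0xff];
    }

    public static void main(String[] args) {
        System.out.println("step size = " + NORM_TABLE[1]);
        float[] boosts = {1.0f, 1.31f, 1.45f, 2.0f};
        for (float b : boosts) {
            System.out.println(b + " -> " + decode(encode(b)));
        }
    }
}
```

With a step of 100/256 (about 0.39), only three table entries fall between 1.0 and 2.0, so a linear table spread over 0 to 100 resolves values in that range more coarsely than the stock encoding does.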

Thanks.

P.S. This is a very custom implementation. For the specific problem that I
have, the lengthNorm
is set to 1 (independent of numTerms). 


Chris Hostetter wrote:
> 
> 
> : I want to modify the norms to only include values between 0 and 100.
> : Currently, I have a custom implementation of the default similarity. Is it
> : sufficient to override the encodeNorm and decodeNorm methods from the base
> : implementation in my custom Similarity class ? Please let me know if there
> : are any performance implications to this.
> 
> those methods are static, so it's not possible to override them.  if you
> are not using doc or field boosts, overriding lengthNorm is suitable for
> your goal.
> 
> -Hoss
> 
> 


Re: Modifying norms...

Posted by Chris Hostetter <ho...@fucit.org>.

: I want to modify the norms to only include values between 0 and 100.
: Currently, I have a custom implementation of the default similarity. Is it
: sufficient to override the encodeNorm and decodeNorm methods from the base
: implementation in my custom Similarity class ? Please let me know if there
: are any performance implications to this.

those methods are static, so it's not possible to override them.  if you
are not using doc or field boosts, overriding lengthNorm is suitable for
your goal.
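
To illustrate the point about static methods in plain Java: a subclass can declare a static method with the same signature, but that only hides the base method and callers that go through the base class never see it, while an instance method such as lengthNorm is dispatched on the actual object. The classes below are stand-ins written for this sketch, not the real Similarity API, and the returned values are arbitrary.

```java
// Standalone sketch of static hiding versus instance overriding.
class StaticHidingDemo {

    /** Stand-in for a library base class. */
    static class BaseSimilarity {
        static float decodeNorm(byte b) {
            return 1.0f; // placeholder for the library's decoding
        }

        float lengthNorm(String fieldName, int numTerms) {
            return (float) (1.0 / Math.sqrt(numTerms));
        }
    }

    /** Stand-in for a custom subclass. */
    static class CustomSimilarity extends BaseSimilarity {
        // Hides BaseSimilarity.decodeNorm; it does not override it.
        static float decodeNorm(byte b) {
            return 42.0f;
        }

        // A real override: length is ignored, as in the poster's setup.
        @Override
        float lengthNorm(String fieldName, int numTerms) {
            return 1.0f;
        }
    }

    /** Library-style code: the static call is bound to the base class. */
    static float frameworkDecode(byte b) {
        return BaseSimilarity.decodeNorm(b);
    }

    /** Library-style code: the instance call is dispatched at run time. */
    static float frameworkLengthNorm(BaseSimilarity sim, String field, int numTerms) {
        return sim.lengthNorm(field, numTerms);
    }

    public static void main(String[] args) {
        BaseSimilarity sim = new CustomSimilarity();
        System.out.println("decodeNorm via library: " + frameworkDecode((byte) 0));
        System.out.println("lengthNorm via library: " + frameworkLengthNorm(sim, "body", 100));
    }
}
```

Here the library-side decode still returns the base value even though a CustomSimilarity is in use, whereas the overridden lengthNorm takes effect.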

-Hoss

