Posted to user@mahout.apache.org by Neil Chaudhuri <nc...@potomacfusion.com> on 2011/12/12 19:04:20 UTC

DictionaryVectorizer Parameters

Can I get an explanation of what the following means and how the values affect outcomes?

normPower - L_p norm to be computed
logNormalize - whether to use log normalization

Thanks.

Re: DictionaryVectorizer Parameters

Posted by Suneel Marthi <su...@yahoo.com>.
normPower - Normalization is used to reduce the effect of features that would otherwise skew results disproportionately.

    Mathematically, the p-norm of a vector [x, y, z] is

        norm_p = (|x|^p + |y|^p + |z|^p)^(1/p)

    and the normalized vector is [x/norm_p, y/norm_p, z/norm_p].

When p = 1 it's the 1-norm or Manhattan norm.
When p = 2 it's the Euclidean norm, and so on.
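
To make the effect of p concrete, here is a minimal sketch in plain Python. It is not Mahout's code, and the function name lp_normalize is made up for illustration. It divides each component by the L_p norm, (|x|^p + |y|^p + |z|^p)^(1/p), and treats p = infinity as the largest absolute component.

```python
import math


def lp_normalize(vec, p):
    """Return vec with each component divided by its L_p norm.

    Hypothetical helper for illustration only; not part of Mahout.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    if math.isinf(p):
        # Infinity norm: the largest absolute component.
        norm = max(abs(v) for v in vec)
    else:
        norm = sum(abs(v) ** p for v in vec) ** (1.0 / p)
    if norm == 0:
        # An all-zero vector has nothing to scale.
        return list(vec)
    return [v / norm for v in vec]


v = [3.0, 4.0]
print(lp_normalize(v, 1))         # components now sum to 1
print(lp_normalize(v, 2))         # vector now has Euclidean length 1
print(lp_normalize(v, math.inf))  # largest component now has magnitude 1
```

With p = 1 the weights of a document sum to 1, with p = 2 every document vector has unit length, and larger p lets the largest component dominate the scaling more and more.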

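logNormalize - whether to log-scale the vector entries when normalizing, instead of applying plain L_p normalization. (The log-likelihood measure of whether an NGram is a significant unit is controlled by a separate parameter, minLLR.)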

Both of the above terms are described in detail in Chapter 8 of the Mahout in Action book.


