Posted to user@mahout.apache.org by Neil Chaudhuri <nc...@potomacfusion.com> on 2011/12/12 19:04:20 UTC
DictionaryVectorizer Parameters
Can I get an explanation of what the following parameters mean and how their values affect outcomes?
normPower - L_p norm to be computed
logNormalize - whether to use log normalization
Thanks.
Re: DictionaryVectorizer Parameters
Posted by Suneel Marthi <su...@yahoo.com>.
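normPower - Normalization is used to keep a few large feature values (for example, the counts from a very long document) from skewing results disproportionately. normPower is the p of the L_p norm that each vector is divided by.
Mathematically, the p-norm of a vector [x, y, z] is norm = (|x|^p + |y|^p + |z|^p)^(1/p), and the normalized vector is [x/norm, y/norm, z/norm].
When p = 1 it is the 1-norm, or Manhattan norm.
When p = 2 it is the Euclidean norm, and so on. As p grows, the norm is increasingly dominated by the largest component.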
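To make the effect of p concrete, here is a minimal sketch of standard L_p normalization in plain Python. It is not Mahout code; the function names p_norm and normalize are made up for this illustration.

```python
def p_norm(vector, p):
    """Return the L_p norm: (sum of |v|^p) ** (1/p)."""
    return sum(abs(v) ** p for v in vector) ** (1.0 / p)


def normalize(vector, p):
    """Divide every component by the vector's L_p norm."""
    norm = p_norm(vector, p)
    if norm == 0:
        # An all-zero vector has nothing to scale; return it unchanged.
        return list(vector)
    return [v / norm for v in vector]


counts = [3.0, 4.0, 0.0]      # hypothetical term counts for one document
l1 = normalize(counts, 1)     # absolute values of the components sum to 1
l2 = normalize(counts, 2)     # the vector has Euclidean length 1
```

With p = 1 the components become proportions of the total count; with p = 2 the vector is scaled to unit length, which is the usual choice when documents are compared by cosine similarity.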
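logNormalize - whether to log-scale each value in the vector as part of normalization, which dampens the effect of very large counts. It should not be confused with the log-likelihood measure of whether an NGram is a significant unit; that is controlled by a separate parameter (minLLR).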
Both of the above terms are described in detail in Chapter 8 of the Mahout in Action book.