Posted to user@mahout.apache.org by Tamas Jambor <ja...@gmail.com> on 2013/03/01 15:30:18 UTC

standardize data for PCA

Hi,

just wondering if there is a way to standardize data for SSVD (i.e.
(x-mean)/sd)? I understand that using the PCA option would center the
matrix by column mean, but it seems that there is no option to standardize
it.

thanks,
Tamas

Re: standardize data for PCA

Posted by Tamas Jambor <ja...@gmail.com>.
I'd like to normalize columns, as the variables are on different scales.

I'm more interested in analyzing the correlation than the covariance.
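
To make the correlation-versus-covariance point concrete, here is a minimal
sketch of column standardization done as a separate preprocessing step before
the decomposition. It is plain Python using only the standard library, not
Mahout's API; the function names and the sample data are hypothetical, and it
assumes the sample standard deviation (n - 1) is the one wanted. The check at
the end illustrates that the covariance of standardized columns equals the
correlation of the original columns.

```python
from statistics import mean, stdev


def standardize_columns(rows):
    """Return rows with each column shifted to mean 0 and scaled to sd 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample sd, divides by n - 1
    if any(sd == 0 for sd in sds):
        raise ValueError("a constant column cannot be standardized")
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


# Hypothetical data: two variables on very different scales.
data = [[1.0, 2000.0], [2.0, 3500.0], [3.0, 3000.0], [4.0, 5000.0]]
z = standardize_columns(data)

x0, x1 = zip(*data)
z0, z1 = zip(*z)

# Covariance after standardization matches correlation before it.
print(covariance(z0, z1), correlation(x0, x1))
```

With the columns standardized up front, a PCA that only centers by column
mean would then be working on the correlation structure rather than the
covariance structure.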

On Sat, Mar 2, 2013 at 2:12 AM, Ted Dunning <te...@gmail.com> wrote:

> Are you normalizing rows or columns?
>
> On Fri, Mar 1, 2013 at 6:30 AM, Tamas Jambor <ja...@gmail.com> wrote:
>
> > Hi,
> >
> > just wondering if there is a way to standardize data for SSVD (ie
> > (x-mean)/sd)? I understand that using the PCA option would centralize the
> > matrix per column mean, but it seems that there is no option to standardize
> > it.
> >
> > thanks,
> > Tamas
> >
>

Re: standardize data for PCA

Posted by Ted Dunning <te...@gmail.com>.
Are you normalizing rows or columns?

On Fri, Mar 1, 2013 at 6:30 AM, Tamas Jambor <ja...@gmail.com> wrote:

> Hi,
>
> just wondering if there is a way to standardize data for SSVD (ie
> (x-mean)/sd)? I understand that using the PCA option would centralize the
> matrix per column mean, but it seems that there is no option to standardize
> it.
>
> thanks,
> Tamas
>

Re: standardize data for PCA

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
No, there's no option to pre-scale it w.r.t. variance. But (I think) the
definition of PCA doesn't require it. Moreover, in many cases one wants to
preserve the proportions of Euclidean distances between data points in
the dimensionally reduced output, in which case it wouldn't make sense to
scale it.

Can you perhaps elaborate a little on your reason for scaling input for PCA?
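
As a small illustration of the point about Euclidean distance proportions,
the sketch below rescales one column and compares the ratio of two pairwise
distances before and after. It is plain Python, not Mahout code; the points
and the scaling factors are made up for the example, and dividing a column
by a fixed factor stands in for dividing it by its standard deviation.

```python
import math


def scale_columns(rows, factors):
    """Divide each column by its factor (as standardization does with the sd)."""
    return [[x / f for x, f in zip(row, factors)] for row in rows]


# Hypothetical points a, b, c in two dimensions.
points = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
a, b, c = points

# Ratio of distance(a, b) to distance(a, c) in the original space.
ratio_before = math.dist(a, b) / math.dist(a, c)

# Shrink only the second column, which has the larger spread.
sa, sb, sc = scale_columns(points, [1.0, 4.0])
ratio_after = math.dist(sa, sb) / math.dist(sa, sc)

# The two ratios differ, so per-column scaling changes distance proportions.
print(ratio_before, ratio_after)
```

Whether that change is wanted depends on the use case: it is the price of
putting variables with different units on a comparable footing.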


On Fri, Mar 1, 2013 at 6:30 AM, Tamas Jambor <ja...@gmail.com> wrote:

> Hi,
>
> just wondering if there is a way to standardize data for SSVD (ie
> (x-mean)/sd)? I understand that using the PCA option would centralize the
> matrix per column mean, but it seems that there is no option to standardize
> it.
>
> thanks,
> Tamas
>