Posted to commits@mahout.apache.org by co...@apache.org on 2008/02/03 16:21:00 UTC
[CONF] Apache Lucene Mahout: Principal Components Analysis (page created)
Principal Components Analysis (MAHOUT) created by Isabel Drost
http://cwiki.apache.org/confluence/display/MAHOUT/Principal+Components+Analysis
Content:
---------------------------------------------------------------------
h1. Principal Components Analysis
Principal components analysis (PCA) reduces a high-dimensional data set to fewer dimensions. It can be used to identify patterns in data and to express the data in a lower-dimensional space, which highlights similarities and differences between data points. Common applications include face recognition and image compression.
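The reduction described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not Mahout code: the function names (`center`, `covariance`, `first_component`, `project`) and the sample points are hypothetical, and power iteration is swapped in for a full eigendecomposition, so only the first principal component is computed.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (dims x dims) of already-centered rows."""
    n = len(centered)
    dims = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def first_component(cov, iterations=200, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]  # strictly positive start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Multiply by the covariance matrix, then rescale to unit length.
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: there is no direction of variance
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Coordinate of each centered row along the given unit vector."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Hypothetical sample: 2-D points lying close to the line y = 2x.
points = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
centered = center(points)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)  # one number per point instead of two
```

Here each two-dimensional point is replaced by a single score along `pc1`. Because the points lie almost on a line, that one coordinate retains nearly all of the variance in the data.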
There are several limitations to be aware of when working with PCA:
* Linearity assumption - the data is assumed to be a linear combination of some basis vectors. Non-linear methods such as kernel PCA alleviate this problem.
* Principal components are assumed to be orthogonal. Independent component analysis (ICA) tries to cope with this limitation.
* Mean and covariance are assumed to describe the data sufficiently. This holds exactly only for Gaussian-distributed data.
* Directions of large variance are assumed to carry the important structure of the data.
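These assumptions can be read off the usual formulation of PCA. The following is a sketch in standard notation; the symbols ($x_i$, $\bar{x}$, $C$, $V$, $\Lambda$, $V_k$, $y_i$) are introduced here for illustration and do not appear on the original page.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \\
C &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}, \\
C &= V \Lambda V^{\top}, \quad V^{\top} V = I, \quad
     \Lambda = \operatorname{diag}(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0), \\
y_i &= V_k^{\top} (x_i - \bar{x}), \quad k < d.
\end{aligned}
```

Only the mean $\bar{x}$ and the covariance $C$ enter the computation, the columns of $V$ are orthogonal by construction, the projection $y_i$ is linear in $x_i$, and $V_k$ keeps the $k$ columns of $V$ with the largest eigenvalues, that is, the directions of largest variance.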
h2. Parallelization strategy
h2. Design of packages
---------------------------------------------------------------------