Posted to user@mahout.apache.org by Lance Norskog <go...@gmail.com> on 2011/06/09 00:25:54 UTC

Vector truncation for visualization

I've used multi-dimensional scaling (MDS) in another toolkit to
down-project high-dim vectors to 2d and 3d. What tools for this are
available in Mahout? Random Projection down to 2 dimensions is easy,
but seems unsound.
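
For reference, a minimal sketch of what "random projection down to 2
dimensions" could look like in plain Python. This is not Mahout code: the
function name `random_projection`, the Gaussian entries and the 1/sqrt(k)
scaling are assumptions made for illustration. The distance-preservation
guarantees usually quoted for random projection weaken as the target
dimension shrinks, which may be why k = 2 feels unsound.

```python
import random

def random_projection(rows, k=2, seed=42):
    """Project each row onto k random directions (hypothetical helper)."""
    rng = random.Random(seed)
    d = len(rows[0])
    # d x k matrix of independent standard normal entries, scaled by 1/sqrt(k).
    proj = [[rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(k)] for _ in range(d)]
    # Multiply every row vector by the random matrix.
    return [[sum(r[i] * proj[i][j] for i in range(d)) for j in range(k)]
            for r in rows]
```

Calling `random_projection(vectors)` on a list of equal-length vectors
returns one 2-element list per input vector; a different `seed` gives a
different, equally arbitrary, picture.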

-- 
Lance Norskog
goksron@gmail.com

Re: Vector truncation for visualization

Posted by Lance Norskog <go...@gmail.com>.

For the singular vectors technique:

1) Build a matrix whose rows are all of the input vectors.
2) Subtract the global mean vector from each row.
3) Do an SVD and get the singular vectors, ordered by singular value.
3a) Truncate this set and keep the first two singular vectors.
3b) Thus a truncated SVD, which computes only the leading singular
vectors, is enough.
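
The steps above can be sketched in plain Python. This is only an
illustration under stated assumptions, not Mahout's implementation: all
function names are made up, and instead of a full SVD it runs power
iteration with deflation on X^T X, whose leading eigenvectors are the
leading right singular vectors of the centered matrix X. That shortcut is
reasonable for small dense examples only.

```python
import math
import random

def center(rows):
    """Step 2: subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows]

def gram(x):
    """Return the d x d matrix X^T X for the row matrix X."""
    d = len(x[0])
    return [[sum(r[i] * r[j] for r in x) for j in range(d)] for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(m, iters=500, seed=0):
    """Power iteration for a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in m]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        w = matvec(m, v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # m is the zero matrix; keep the current unit vector
            break
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, matvec(m, v)))
    return v, lam

def project(rows, k=2):
    """Steps 1-3a: center, find the top-k directions, project onto them."""
    x = center(rows)
    m = gram(x)
    d = len(m)
    components = []
    for _ in range(k):
        v, lam = leading_eigenpair(m)
        components.append(v)
        # Deflate: remove the direction just found so the next one surfaces.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(a * b for a, b in zip(r, c)) for c in components] for r in x]
```

`project(vectors, 2)` returns one (x, y) pair per input vector, ready to
plot. The sign of each axis is arbitrary, as it is with any SVD.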

For the affinity matrix, does the variable 'r_ij' mean 'random matrix
i,j'? And is this used for a direct projection onto 2 dimensions?

For another opinion: Lanczos Vectors versus Singular Vectors for
Effective Dimension Reduction

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.158.779&rep=rep1&type=pdf

On Wed, Jun 8, 2011 at 11:20 PM, Ted Dunning <te...@gmail.com> wrote:
> On Thu, Jun 9, 2011 at 2:27 AM, Lance Norskog <go...@gmail.com> wrote:
>> Projecting to the first "two" singular vectors?
>
> Yes.
>
>> Do an SVD on a random matrix, and use the first 2 (or three) singular
>> vectors as a matrix?
>
> Not a random matrix.  A matrix of positions shifted to have
> zero mean (aka PCA).
>
>>
>> What goes into the affinity matrix?
>
> exp(-r_ij ^ 2 / \sigma) is a common choice.  sigma is chosen so that the
> affinity matrix is fairly sparse.  r_ij is the distance from i to j.
>

-- 
Lance Norskog
goksron@gmail.com

Re: Vector truncation for visualization

Posted by Ted Dunning <te...@gmail.com>.

On Thu, Jun 9, 2011 at 2:27 AM, Lance Norskog <go...@gmail.com> wrote:
> Projecting to the first "two" singular vectors?

Yes.

> Do an SVD on a random matrix, and use the first 2 (or three) singular
> vectors as a matrix?

Not a random matrix.  A matrix of positions shifted to have
zero mean (aka PCA).

>
> What goes into the affinity matrix?

exp(-r_ij ^ 2 / \sigma) is a common choice.  sigma is chosen so that the
affinity matrix is fairly sparse.  r_ij is the distance from i to j.
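
A minimal sketch of that affinity formula in plain Python, for
illustration only. The function name `affinity_matrix` is made up, and
Euclidean distance is an assumption; the message only says "distance from
i to j".

```python
import math

def affinity_matrix(points, sigma):
    """a[i][j] = exp(-r_ij^2 / sigma), with r_ij the Euclidean distance."""
    n = len(points)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared distance between points i and j.
            r2 = sum((p - q) ** 2 for p, q in zip(points[i], points[j]))
            a[i][j] = math.exp(-r2 / sigma)
    return a
```

A smaller sigma pushes the entries for distant pairs toward zero, which is
what makes the matrix effectively sparse; entries below some cutoff could
then be dropped.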

Re: Vector truncation for visualization

Posted by Lance Norskog <go...@gmail.com>.

Projecting to the first "two" singular vectors?
Do an SVD on a random matrix, and use the first 2 (or three) singular
vectors as a matrix?

What goes into the affinity matrix?

On Wed, Jun 8, 2011 at 4:24 PM, Ted Dunning <te...@gmail.com> wrote:
> Projecting to the first two singular vectors is better.
>
> Forming an affinity (rather than distance) matrix and projecting to
> those coordinates is very interesting.
>
> On Thu, Jun 9, 2011 at 12:25 AM, Lance Norskog <go...@gmail.com> wrote:
>> I've used multi-dimensional scaling (MDS) in another toolkit to
>> down-project high-dim vectors to 2d and 3d. What tools for this are
>> available in Mahout? Random Projection down to 2 dimensions is easy,
>> but seems unsound.
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>



-- 
Lance Norskog
goksron@gmail.com

Re: Vector truncation for visualization

Posted by Ted Dunning <te...@gmail.com>.

Projecting to the first two singular vectors is better.

Forming an affinity (rather than distance) matrix and projecting to
those coordinates is very interesting.

On Thu, Jun 9, 2011 at 12:25 AM, Lance Norskog <go...@gmail.com> wrote:
> I've used multi-dimensional scaling (MDS) in another toolkit to
> down-project high-dim vectors to 2d and 3d. What tools for this are
> available in Mahout? Random Projection down to 2 dimensions is easy,
> but seems unsound.
>
> --
> Lance Norskog
> goksron@gmail.com
>