Posted to user@mahout.apache.org by Chui-Hui Chiu <cc...@tigers.lsu.edu> on 2012/11/21 18:17:01 UTC
Reading the vector files
Hello, all,
I ran the K-Means clustering sample and got the output files. How do I
convert the output Mahout vector files to a human-readable format? Is
there any documentation about that?
Thanks,
Chiu
Re: Mahout svd command question
Posted by kuba <pa...@interia.pl>.
Thanks for the info!
I also found the documentation for ssvd:
https://cwiki.apache.org/MAHOUT/stochastic-singular-value-decomposition.html
That should definitely solve my problem completely.
Big thanks again!
On 22.11.2012 22:00, Ted Dunning wrote:
> That implementation is deprecated. The SSVD implement should be used
> instead.
Re: Mahout svd command question
Posted by Ted Dunning <te...@gmail.com>.
That implementation is deprecated. The SSVD implementation should be used
instead.
On Thu, Nov 22, 2012 at 9:58 AM, Abramov Pavel <p....@rambler-co.ru>wrote:
> Hi,
>
> Here is step by step manual for Lanczos implementation:
>
> https://cwiki.apache.org/MAHOUT/dimensional-reduction.html
>
> Pavel
Re: Mahout svd command question
Posted by Abramov Pavel <p....@rambler-co.ru>.
Hi,
Here is a step-by-step manual for the Lanczos implementation:
https://cwiki.apache.org/MAHOUT/dimensional-reduction.html
Pavel
Mahout svd command question
Posted by kuba <pa...@interia.pl>.
Hi,
I'm new to Hadoop, Mahout, and language processing.
I'm trying to do LSA (Latent Semantic Analysis) in Mahout.
I've made my own version of the tf-idf matrix builder (I know there are
seqdirectory and seq2sparse, which can do it for me, but I needed some
modifications).
I've run 'mahout svd' and got output, but I don't know how to
interpret it.
According to the books I've read, SVD should return three matrices:
M = U * Sigma * (Vt),
but 'mahout svd' returns only one. I can't find any documentation. Which
one does it return? Is it U?
Do I have to transpose my tf-idf matrix and compute the SVD again to get
the second matrix (V)?
Also, I've found people using:
mahout cleansvd
What is it for? Is there any good documentation?
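On the question of whether a second decomposition is needed: in general it is not. If M = U * S * Vt, where S is the diagonal matrix of singular values, then U = M * V * S^-1 and likewise V = Mt * U * S^-1, so either factor can be recovered from the other. The sketch below checks this on a toy 2x2 matrix in plain Python. It assumes the single matrix a solver returns holds the right singular vectors V; the thread does not confirm which factor 'mahout svd' writes, and the matrix, the values, and the helper names here are made up for illustration only.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def recover_left(m, v, sigma):
    """Return U = M * V * S^-1, given right singular vectors (columns of v)
    and the matching non-zero singular values."""
    mv = matmul(m, v)
    return [[mv[i][j] / sigma[j] for j in range(len(sigma))]
            for i in range(len(mv))]

# Toy matrix (hypothetical data). Mt*M = [[25, 20], [20, 25]] has
# eigenvalues 45 and 5 with eigenvectors (1, 1) and (1, -1).
M = [[3.0, 0.0],
     [4.0, 5.0]]
r = 1.0 / math.sqrt(2.0)
V = [[r, r],
     [r, -r]]                              # columns = right singular vectors
sigma = [math.sqrt(45.0), math.sqrt(5.0)]  # singular values

U = recover_left(M, V, sigma)

# Rebuild M as (U * S) * Vt to confirm the three factors are consistent.
US = [[U[i][j] * sigma[j] for j in range(2)] for i in range(2)]
M_rebuilt = matmul(US, transpose(V))
```

The division by sigma[j] only works for non-zero singular values, so a rank-deficient matrix needs the zero ones dropped first.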
Re: Reading the vector files
Posted by DAN HELM <da...@verizon.net>.
See: http://amgadmadkour.blogspot.com/2012/07/kmeans-clustering-using-apache-mahout.html
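For the original question about turning the k-means output into readable text, Mahout ships two command-line dumpers, seqdumper and clusterdump. The sketch below is a small Python helper that only assembles and prints the command lines, and runs them on request. The directory names (output/clusters-10-final, output/clusteredPoints) and the helper functions are assumptions for illustration; the actual paths depend on how the sample was run, and the flags should be checked against the help output of the installed Mahout version.

```python
import shlex
import subprocess

def seqdumper_command(seq_path, out_file):
    """Command that dumps a SequenceFile as text (-i input, -o output)."""
    return ["mahout", "seqdumper", "-i", seq_path, "-o", out_file]

def clusterdump_command(clusters_dir, points_dir, out_file):
    """Command that dumps cluster centers and, via -p, the clustered points."""
    return ["mahout", "clusterdump",
            "-i", clusters_dir, "-p", points_dir, "-o", out_file]

def dump(cmd, execute=False):
    """Print the command; run it only when execute=True (needs Mahout on PATH)."""
    line = " ".join(shlex.quote(part) for part in cmd)
    print(line)
    if execute:
        subprocess.run(cmd, check=True)
    return line

# Hypothetical paths; replace with the directories your k-means run produced.
dump(clusterdump_command("output/clusters-10-final",
                         "output/clusteredPoints",
                         "clusters.txt"))
dump(seqdumper_command("output/clusteredPoints/part-m-00000", "points.txt"))
```

Calling dump(..., execute=True) runs the tool; by default the helper is a dry run, so the commands can be reviewed first.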