Posted to user@mahout.apache.org by yunming zhang <yz...@rice.edu> on 2013/02/07 05:49:28 UTC

Question about latent dirichlet allocation

Hi,

I am trying to get Latent Dirichlet Allocation (LDA) to work.

I was following the instructions on this page:
https://cwiki.apache.org/MAHOUT/latent-dirichlet-allocation.html

I have two questions.
1) Is LDA a different algorithm from the Dirichlet clustering algorithm?
I ask because when I ran cluster-reuters.sh under
Mahout/trunk/examples/bin, it offered a choice between kmeans, fuzzy
kmeans, meanshift and dirichlet, which makes it seem as if LDA and
Dirichlet clustering are the same thing.

2) I tried to run the script cluster-reuters.sh and got this error:

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
SLF4J: Your binding is version 1.5.5 or earlier.
SLF4J: Upgrade your binding to version 1.6.x.
Exception in thread "main" java.lang.NoSuchMethodError:
org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;


I am not sure what is going on there.

Thanks

Yunming

Re: Question about latent dirichlet allocation

Posted by Yutaka Mandai <20...@gmail.com>.
Yunming,
I noticed a typo in my original reply:

'rowed' should read 'rowid'.
Regards,
Y.Mandai

Sent from iPhone

On 2013/02/08, at 6:08, David LaBarbera <da...@localresponse.com> wrote:

> What version of Mahout are you running, and what command did you use?
> In Mahout 0.8, if you type
>     mahout lda
> you get
>     Try the new Collapsed Variation Bayes LDA, try bin/mahout cvb or bin/mahout cvb0_local
> 
> The exception you have looks like a classpath error.
> 
> David

Re: Question about latent dirichlet allocation

Posted by David LaBarbera <da...@localresponse.com>.
What version of Mahout are you running, and what command did you use?
In Mahout 0.8, if you type
    mahout lda
you get
    Try the new Collapsed Variation Bayes LDA, try bin/mahout cvb or bin/mahout cvb0_local

The exception you have looks like a classpath error.

David
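
One way to check for the classpath problem suggested above is to list every SLF4J jar that can end up on the classpath. The log in the original question complains that slf4j-api 1.6.x is paired with a binding from 1.5.5 or earlier, so two different version numbers in the listing would be consistent with that. A minimal sketch, assuming the jars live under directories such as $MAHOUT_HOME/lib and $HADOOP_HOME/lib (these variable names and locations are assumptions, not taken from the thread); list_slf4j_jars is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: print every SLF4J jar found under the given
# directories, one path per line, sorted and de-duplicated.
# Directories that do not exist are skipped silently.
list_slf4j_jars() {
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    find "$dir" -type f -name 'slf4j-*.jar'
  done | sort -u
}

# Assumed locations; adjust to wherever your Mahout and Hadoop jars live.
list_slf4j_jars "${MAHOUT_HOME:-.}/lib" "${HADOOP_HOME:-.}/lib"
```

If the listing shows an slf4j-api jar and a binding jar (for example an slf4j-log4j12 jar) with different versions, making the two versions match is the usual remedy for this NoSuchMethodError.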

On Feb 7, 2013, at 4:01 PM, yunming zhang <zh...@gmail.com> wrote:

> Hi Yutaka,
> 
> Thanks for the reply!
> 
> So I think lda ($MAHOUT lda) is still in the Mahout package? Are you saying
> 1) it no longer does latent Dirichlet allocation and has been replaced by
> CVB, so I can only use cvb?
> or
> 2) it is based on CVB, so I could use either cvb or lda?
> 
> Thanks
> 
> Yunming


Re: Question about latent dirichlet allocation

Posted by yunming zhang <zh...@gmail.com>.
Hi Yutaka,

Thanks for the reply!

So I think lda ($MAHOUT lda) is still in the Mahout package? Are you saying
1) it no longer does latent Dirichlet allocation and has been replaced by
CVB, so I can only use cvb?
or
2) it is based on CVB, so I could use either cvb or lda?

Thanks

Yunming


On Thu, Feb 7, 2013 at 3:41 AM, Yutaka Mandai <20...@gmail.com>wrote:

> Yunming,
> Hi there!
> I believe Dirichlet clustering and LDA are different things.
> Dirichlet clustering is a model-based clustering algorithm in which the
> data is assumed to fit a Gaussian distribution, while LDA is essentially
> based on CVB (collapsed variational Bayes).
>
> LDA used to be available as LDA in Mahout up to 0.5, but since then it
> has been implemented as CVB.
>
> You might have to run rowed first on your original term vectors and then
> feed the generated matrix to CVB.
>
> Hope this gives you some idea.
> Regards,
> Y.Mandai
>
> Sent from iPhone

Re: Question about latent dirichlet allocation

Posted by Yutaka Mandai <20...@gmail.com>.
Yunming,
Hi there!
I believe Dirichlet clustering and LDA are different things.
Dirichlet clustering is a model-based clustering algorithm in which the data is assumed to fit a Gaussian distribution, while LDA is essentially based on CVB (collapsed variational Bayes).

LDA used to be available as LDA in Mahout up to 0.5, but since then it has been implemented as CVB.

You might have to run rowed first on your original term vectors and then feed the generated matrix to CVB.

Hope this gives you some idea.
Regards,
Y.Mandai

Sent from iPhone
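
The two steps described above (rowid, written as 'rowed' in this message and corrected by the author in a follow-up, then CVB) might be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: the WORK_DIR layout and file names are hypothetical, the topic and iteration counts are arbitrary, and the flags are assumed to match those used in the cluster-reuters.sh example shipped with Mahout 0.8, so check them against `mahout rowid --help` and `mahout cvb --help` for your version. By default the script only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# All paths below are hypothetical placeholders.
MAHOUT="${MAHOUT:-bin/mahout}"
WORK_DIR="${WORK_DIR:-/tmp/mahout-work}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

lda_via_cvb() {
  # Step 1: rowid re-keys the term vectors with integer row ids.
  # Assumption: it writes a "matrix" file under the output directory.
  run "$MAHOUT" rowid -i "$WORK_DIR/term-vectors" -o "$WORK_DIR/matrix"

  # Step 2: feed the generated matrix to CVB.
  # -k (number of topics) and -x (max iterations) are arbitrary examples.
  run "$MAHOUT" cvb -i "$WORK_DIR/matrix/matrix" -o "$WORK_DIR/lda" \
    -k 20 -x 20 \
    -dict "$WORK_DIR/dictionary.file-0" \
    -dt "$WORK_DIR/lda-topics" -mt "$WORK_DIR/lda-model"
}

lda_via_cvb
```

Setting DRY_RUN=0 runs the commands instead of printing them, which requires a working Mahout installation and term vectors already generated at the assumed location.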

On 2013/02/07, at 13:49, yunming zhang <yz...@rice.edu> wrote:

> Hi,
> 
> I am trying to get Latent Dirichlet Allocation (LDA) to work.
> 
> I was following the instructions on this page:
> https://cwiki.apache.org/MAHOUT/latent-dirichlet-allocation.html
> 
> I have two questions.
> 1) Is LDA a different algorithm from the Dirichlet clustering algorithm?
> I ask because when I ran cluster-reuters.sh under
> Mahout/trunk/examples/bin, it offered a choice between kmeans, fuzzy
> kmeans, meanshift and dirichlet, which makes it seem as if LDA and
> Dirichlet clustering are the same thing.
> 
> 2) I tried to run the script cluster-reuters.sh and got this error:
> 
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
> SLF4J: Your binding is version 1.5.5 or earlier.
> SLF4J: Upgrade your binding to version 1.6.x.
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
> 
> 
> I am not sure what is going on there.
> 
> Thanks
> 
> Yunming