Posted to user@mahout.apache.org by Siddharth Tiwari <si...@live.com> on 2012/09/01 11:54:19 UTC
Using mahout for classifying tweets
Hi Users,
I am a novice at using Mahout. Can anybody guide me on how to use Mahout for classifying text into different classes? In my case there are 5 classes and the text is tweets. Is there any tutorial on how to create a training model for Mahout, how to use it for training, how to prepare the dataset for classification (how to make it compatible with Mahout), and how to interpret the output after classification?
I am sorry if my questions seem dumb, but it's only because I have very little knowledge of Mahout and I am trying to get a grip on it. Thank you so much.
*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."
"Maybe other people will try to limit me but I don't limit myself"
Re: Using mahout for classifying tweets
Posted by Paritosh Ranjan <pr...@xebia.com>.
I am also a novice at Mahout classification, but I will give it a shot
in the hope that someone will correct me or improve the answer.
First, the text data (tweets) needs to be converted into Vectors. In
Mahout terms, this is known as vector encoding. It can be done in three
ways: one Vector cell per word, category, or continuous value;
representing Vectors implicitly as bags of words; or feature hashing.
Look at the ContinuousValueEncoder, AdaptiveWordValueEncoder,
StaticWordValueEncoder and FeatureVectorEncoder classes, or the
seqdirectory and seq2encoded commands.
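To make the feature hashing idea concrete, here is a minimal sketch in plain Python. It is not Mahout code and does not use the encoder classes named above; the names tokenize, hash_encode and NUM_FEATURES are made up for illustration, and the vector size of 32 is an arbitrary assumption (real text models use far larger vectors).

```python
import re
import zlib

NUM_FEATURES = 32  # hypothetical vector size, chosen only for this sketch


def tokenize(text):
    """Lowercase a tweet and split it into word-like tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def hash_encode(text, num_features=NUM_FEATURES):
    """Add 1.0 to the cell that each token hashes to (feature hashing)."""
    vector = [0.0] * num_features
    for token in tokenize(text):
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        index = zlib.crc32(token.encode("utf-8")) % num_features
        vector[index] += 1.0
    return vector


tweet_vector = hash_encode("Loving the new phone, loving the camera too")
print(len(tweet_vector), sum(tweet_vector))
```

The vector length stays fixed no matter how many distinct words appear, so no dictionary has to be built first. The cost is that two different words can collide in the same cell.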
Then you can use the OnlineLogisticRegression, CrossFoldLearner and
AdaptiveLogisticRegression classes, or the trainnb, testnb,
trainlogistic, runlogistic, trainAdaptiveLogistic,
validateAdaptiveLogistic and runAdaptiveLogistic commands, for
training and testing classifiers.
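As a rough illustration of what an online (one example at a time) logistic regression does with such vectors, and of how to read its output, here is a small plain-Python sketch. It is not the Mahout implementation; the class name OnlineSoftmaxClassifier, its methods, the learning rate and the toy data are all assumptions made for this example. It uses 3 classes to stay short; the 5-class tweet case would just use num_classes=5.

```python
import math


def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class OnlineSoftmaxClassifier:
    """Multi-class logistic regression trained with one SGD step per example."""

    def __init__(self, num_classes, num_features, learning_rate=0.5):
        self.weights = [[0.0] * num_features for _ in range(num_classes)]
        self.learning_rate = learning_rate

    def probabilities(self, vector):
        """Return one probability per class for a feature vector."""
        scores = [sum(w * x for w, x in zip(row, vector)) for row in self.weights]
        return softmax(scores)

    def train(self, label, vector):
        """Nudge the weights toward the correct class for this example."""
        probs = self.probabilities(vector)
        for k, row in enumerate(self.weights):
            error = (1.0 if k == label else 0.0) - probs[k]
            for j, x in enumerate(vector):
                row[j] += self.learning_rate * error * x


# Toy data: 3 classes, 3 features, each class lights up one feature.
examples = [(0, [1.0, 0.0, 0.0]), (1, [0.0, 1.0, 0.0]), (2, [0.0, 0.0, 1.0])]
model = OnlineSoftmaxClassifier(num_classes=3, num_features=3)
for _ in range(50):
    for label, vector in examples:
        model.train(label, vector)

probs = model.probabilities([0.0, 1.0, 0.0])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

The output is one probability per class. The predicted class is the index with the highest probability, and the size of that probability gives a rough sense of how confident the model is.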
HTH,
Paritosh
Re: Using mahout for classifying tweets
Posted by Salman Mahmood <sa...@influestor.com>.
Buy the book "Mahout in Action". It gives you in-depth knowledge of
how classification is done in Mahout. I don't know of any tutorial
links, but you can start by downloading the source code and examining
the example code for classification. Still, I really recommend the book
since you are a new user.