Posted to user@mahout.apache.org by Tharindu Rusira <th...@gmail.com> on 2013/12/24 08:02:29 UTC

SVM in Mahout

Hi all,
Do we have an SVM implementation in Mahout (either sequential or MapReduce)?
I was searching in JIRA: MAHOUT-14 [1] proposes an SVM implementation, and
MAHOUT-334 [2] and MAHOUT-232 [3] have patches available for SVM.
Is any of this code available in a Mahout release? The comments on
these issues suggest otherwise.

[1]
https://issues.apache.org/jira/browse/MAHOUT-14?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22
[2]
https://issues.apache.org/jira/browse/MAHOUT-334?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22
[3]
https://issues.apache.org/jira/browse/MAHOUT-232?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22

Thanks,

-- 
M.P. Tharindu Rusira Kumara

Department of Computer Science and Engineering,
University of Moratuwa,
Sri Lanka.
+94757033733
www.tharindu-rusira.blogspot.com

Re: SVM in Mahout

Posted by Ted Dunning <te...@gmail.com>.
Logistic regression with L1 regularization is generally at least as good as
SVM.  The problem with SVM is that it uses radially symmetric (L2)
regularization, which doesn't learn sparse solutions very well.  L1
regularization is much better for that.
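
To make the sparsity point concrete, here is a minimal sketch in plain
Python (not Mahout code). It fits logistic regression by batch gradient
descent twice, once with an L1 penalty applied through soft-thresholding
and once with an L2 penalty, and counts how many weights end up exactly
zero. The synthetic data, the function names, and the values of lam, lr
and iters are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, t):
    # Proximal step for the L1 penalty: shrink toward zero, clip at zero.
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0


def fit(rows, labels, penalty, lam=0.1, lr=0.5, iters=100):
    """Batch gradient descent for logistic regression, penalty 'l1' or 'l2'."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(d):
                grad[j] += err * x[j] / n
        if penalty == "l1":
            w = [soft_threshold(w[j] - lr * grad[j], lr * lam) for j in range(d)]
        else:
            w = [w[j] - lr * (grad[j] + lam * w[j]) for j in range(d)]
    return w


def make_data(n=400, d=20, seed=1):
    # Hypothetical data: only the first two features carry signal.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        rows.append(x)
        labels.append(1 if x[0] - x[1] > 0 else 0)
    return rows, labels


def count_zeros(w):
    return sum(1 for wj in w if wj == 0.0)


if __name__ == "__main__":
    rows, labels = make_data()
    for penalty in ("l1", "l2"):
        w = fit(rows, labels, penalty)
        print(penalty, "exact zeros:", count_zeros(w), "of", len(w))
```

The L2 update only rescales weights, so they rarely land on exactly zero,
while the soft-threshold step sets small weights to zero outright.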


On Tue, Dec 24, 2013 at 10:06 AM, Steven Bourke <sb...@gmail.com> wrote:

> Just test libsvm against logistic regression on a sample of your data to
> get an understanding of the upsides and downsides for your particular problem.

Re: SVM in Mahout

Posted by Steven Bourke <sb...@gmail.com>.
Just test libsvm against logistic regression on a sample of your data to get an understanding of the upsides and downsides for your particular problem.
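
A rough sketch of such a comparison, in plain Python. Note the swap: it
does not call libsvm. It trains two linear models with the same SGD loop,
one on hinge loss (a linear SVM stand-in) and one on logistic loss, and
reports held-out accuracy for each. The data generator, function names
and hyperparameters are assumptions for illustration only.

```python
import math
import random


def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def neg_sigmoid(margin):
    # Stable evaluation of 1 / (1 + exp(margin)).
    if margin >= 0:
        e = math.exp(-margin)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(margin))


def train_sgd(rows, labels, loss, lam=0.01, lr=0.1, epochs=20, seed=0):
    """SGD on an L2-regularized linear model; labels are +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(rows[0])
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = rows[i], labels[i]
            margin = y * dot(w, x)
            if loss == "hinge":
                scale = 1.0 if margin < 1.0 else 0.0
            else:  # logistic loss
                scale = neg_sigmoid(margin)
            w = [(1.0 - lr * lam) * wj + lr * scale * y * xj
                 for wj, xj in zip(w, x)]
    return w


def accuracy(w, rows, labels):
    hits = sum(1 for x, y in zip(rows, labels)
               if (1 if dot(w, x) >= 0 else -1) == y)
    return hits / len(rows)


def holdout_compare(rows, labels, test_fraction=0.25, seed=0):
    # Shuffle once, hold out a fraction, train both models on the rest.
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1.0 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    tr_x, tr_y = [rows[i] for i in train], [labels[i] for i in train]
    te_x, te_y = [rows[i] for i in test], [labels[i] for i in test]
    return {loss: accuracy(train_sgd(tr_x, tr_y, loss), te_x, te_y)
            for loss in ("hinge", "logistic")}


def make_data(n=400, d=5, seed=3):
    # Hypothetical sample; the trailing 1.0 acts as a bias column.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)] + [1.0]
        rows.append(x)
        labels.append(1 if x[0] - x[1] > 0 else -1)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_data()
    print(holdout_compare(rows, labels))
```

On real data you would replace make_data with a sample of your own
vectors and repeat the split a few times before drawing conclusions.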


> On 24 Dec 2013, at 15:55, Tharindu Rusira <th...@gmail.com> wrote:
> 
> Thanks all for the words of wisdom :) ,
> 
> @Ted, I'm coming from a text mining background. Many textbooks recommend
> SVM because of its impressive performance on high-dimensional vectors
> (large cardinality), which is the usual case when dealing with text
> documents. Do you think logistic regression would perform as well as SVM
> for text mining applications?
> 
> Thanks

Re: SVM in Mahout

Posted by Tharindu Rusira <th...@gmail.com>.
Thanks all for the words of wisdom :) ,

@Ted, I'm coming from a text mining background. Many textbooks recommend
SVM because of its impressive performance on high-dimensional vectors
(large cardinality), which is the usual case when dealing with text
documents. Do you think logistic regression would perform as well as SVM
for text mining applications?

Thanks


On Tue, Dec 24, 2013 at 3:47 PM, unmesha sreeveni <un...@gmail.com> wrote:

> You can parallelize SVM using the same equations (with slight differences)
> explained in
>
> http://books.google.co.in/books/about/DATA_MINING.html?id=IYc2muhCbmEC&redir_esc=y
>
> But I don't guarantee the performance: for about 100 MB of data it takes
> 10 minutes to train.

Re: SVM in Mahout

Posted by unmesha sreeveni <un...@gmail.com>.
You can parallelize SVM using the same equations (with slight differences)
explained in
http://books.google.co.in/books/about/DATA_MINING.html?id=IYc2muhCbmEC&redir_esc=y

But I don't guarantee the performance: for about 100 MB of data it takes
10 minutes to train.


On Tue, Dec 24, 2013 at 3:30 PM, tuku <ut...@gmail.com> wrote:

> Someone tried to implement SVM in a Google Summer of Code project, but it
> turned out that a MapReduce version of SVM is too difficult to implement,
> and they dropped the project.
> I bet you can train via libsvm and run just the classification part with
> MapReduce, but given the choice I prefer logistic regression too.



-- 
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*
http://www.unmeshasreeveni.blogspot.in/

Re: SVM in Mahout

Posted by tuku <ut...@gmail.com>.
Someone tried to implement SVM in a Google Summer of Code project, but it
turned out that a MapReduce version of SVM is too difficult to implement,
and they dropped the project.
I bet you can train via libsvm and run just the classification part with
MapReduce, but given the choice I prefer logistic regression too.
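
As a rough illustration of the "train offline, classify in the map step"
idea, here is a sketch in plain Python. It assumes a linear kernel, so the
trained model reduces to a weight vector plus a bias, and it assumes a
LIBSVM-like sparse input line whose first token is a record id (real
LIBSVM files put the label first). All names and the example weights are
hypothetical; this is not libsvm or Hadoop API code.

```python
def parse_sparse(line):
    """Parse 'id idx:val idx:val ...' into (id, {idx: val})."""
    parts = line.split()
    features = {}
    for token in parts[1:]:
        idx, val = token.split(":")
        features[int(idx)] = float(val)
    return parts[0], features


def score(features, weights, bias):
    # Sparse dot product: features missing from the model contribute zero.
    return bias + sum(weights.get(i, 0.0) * v for i, v in features.items())


def map_classify(lines, weights, bias):
    """Map-only step: emit (record id, predicted label) for each input line.

    Every record is scored independently, so no reduce phase is needed.
    """
    for line in lines:
        if not line.strip():
            continue
        rec_id, features = parse_sparse(line)
        yield rec_id, (1 if score(features, weights, bias) >= 0 else -1)


if __name__ == "__main__":
    # Hypothetical weights, as if exported from an offline training run.
    weights = {1: 2.0, 3: -1.0}
    bias = -0.5
    split = ["a 1:1.0 3:0.5", "b 3:2.0", "c 2:5.0"]
    for rec_id, label in map_classify(split, weights, bias):
        print(rec_id, label)
```

With a non-linear kernel the mapper would need the support vectors and
kernel parameters instead of a single weight vector.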

~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~--~
If you don't know where you're going, any road will get you there
"Not all those who wander are lost." - J.R.R. Tolkien
"Fish don't know they're in water."
"Smile, breathe and go slowly." - Thich Nhat Hanh, Zen Buddhist monk
"Those who spend their time earning and hoarding money eventually realize
that the things they want most cannot be bought."
"And in the end, it's not the years in your life that count. It's the life
in your years."
"In 20 years, you will be more disappointed by what you didn't do than what
you did."
"If you want to go fast, go alone. If you want to go far, go with others."
"Remember, happiness is a way of travel not a destination"
"A good traveller has no fixed plans, and is not intent on arriving."


On 24 December 2013 11:11, Ted Dunning <te...@gmail.com> wrote:

> You might try logistic regression with regularization for a very similar
> result.

Re: SVM in Mahout

Posted by Ted Dunning <te...@gmail.com>.
You might try logistic regression with regularization for a very similar
result.
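
One way to see why the results tend to be similar: both methods fit a
linear model and differ mainly in the per-example loss applied to the
margin m = y * (w . x). The sketch below, in plain Python with
illustrative function names of my own choosing, evaluates the SVM hinge
loss and the logistic loss at a few margins.

```python
import math


def hinge_loss(margin):
    # SVM loss: zero once the example is beyond the margin of 1.
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    # log(1 + exp(-margin)), evaluated in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    for m in (-2.0, -1.0, 0.0, 1.0, 2.0, 4.0):
        print(f"margin={m:+.1f}  hinge={hinge_loss(m):.3f}  "
              f"logistic={logistic_loss(m):.3f}")
```

Both losses grow roughly linearly for badly misclassified examples and
fall toward zero for confidently correct ones. The hinge loss reaches
exactly zero, while the logistic loss only approaches it.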


On Mon, Dec 23, 2013 at 11:57 PM, Sebastian Schelter <
ssc.open@googlemail.com> wrote:

> Hi Tharindu,
>
> There is no SVM implementation in an official release.
>
> --sebastian

Re: SVM in Mahout

Posted by Sebastian Schelter <ss...@googlemail.com>.
Hi Tharindu,

There is no SVM implementation in an official release.

--sebastian

On 24.12.2013 08:02, Tharindu Rusira wrote:
> Hi all,
> Do we have an SVM implementation in Mahout (either sequential or MapReduce)?
> I was searching in JIRA: MAHOUT-14 [1] proposes an SVM implementation, and
> MAHOUT-334 [2] and MAHOUT-232 [3] have patches available for SVM.
> Is any of this code available in a Mahout release? The comments on
> these issues suggest otherwise.
> 
> [1]
> https://issues.apache.org/jira/browse/MAHOUT-14?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22
> [2]
> https://issues.apache.org/jira/browse/MAHOUT-334?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22
> [3]
> https://issues.apache.org/jira/browse/MAHOUT-232?jql=project%20%3D%20MAHOUT%20AND%20text%20~%20%22svm%22
> 
> Thanks,
>