Posted to dev@mahout.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2014/10/22 01:12:49 UTC

Any idea which approaches to non-linear SVM are easily parallelizable?

in particular, from libSVM --
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?

thanks.
-d

Re: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by peng <pc...@uowmail.edu.au>.
And I agree with Ted: the non-linearity induced by most kernel functions 
is overly complex and can easily overfit. Deep learning is a more reliable 
abstraction.

On 10/22/2014 01:42 PM, peng wrote:
> Is the kernel projection referring to online/incremental incomplete 
> Cholesky decomposition? Sorry I haven't used SVM for a long time and 
> didn't keep up with SotA.
>
> If that's true, I haven't found an out-of-the-box implementation, but 
> this should be easy.
>
> Yours Peng
>
> On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
>> Andrew, thanks a bunch for the pointers!
>>
>>
>>
>> On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> 
>> wrote:
>>
>>> If you do want to stick with SVM-
>>>
>>> This is a question that I keep coming back to myself, and unfortunately
>>> I have forgotten more (and lost more literature) than I’ve retained.
>>> I believe that the most easily parallelizable sections of libSVM for
>>> small datasets are (for C-SVC(R), RBF and polynomial kernels):
>>>
>>>      1.    The Kernel projections
>>>      2.    The Hyper-Parameter grid search for C, \gamma (I believe this
>>> is now included in LibSVM- I haven’t looked at it in a while)
>>>      3.    For multi-class SVC: the concurrent computation of each SVM
>>> for each one-against-one class vote.
>>>
>>> I’m unfamiliar with any easily parallelizable method for the QP itself.
>>> Unfortunately, (2) and (3) involve broadcasting the entire dataset out
>>> to each node of a cluster (or working in a shared-memory environment),
>>> so they may not be practical depending on the size of your data set.
>>> I’ve only ever implemented (2), for relatively small datasets, using
>>> MPI and with a pure Java socket implementation.
>>>
>>> Other approaches (further from simple LibSVM), which are more
>>> applicable to large datasets (I’m less familiar with these):
>>>
>>>      4.    Divide and conquer the QP/SMO problem and solve (as I’ve
>>> said, I’m unfamiliar with this and don’t know of any standard approach)
>>>
>>>      5.    Break the training set into subsets and solve.
>>>
>>> For (5) there are several approaches; two that I know of are ensemble
>>> approaches and those that accumulate Support Vectors from each
>>> partition and heuristically keep/reject them until the model converges.
>>> I’ve also just read some research on implementing this in a MapReduce
>>> style [2].
>>>
>>> I came across this paper [1] last night, which you may find interesting
>>> as well; it is an interesting comparison of some SVM parallelization
>>> strategies. In particular, it discusses (1) for a shared-memory
>>> environment and for offloading work to GPUs (using OpenMP and CUDA).
>>> It also cites several other nice papers discussing SVM parallelization
>>> strategies, especially for (5), and then goes on to discuss a more
>>> purely linear-algebra approach to optimizing SVMs (sec. 5).
>>>
>>> Also regarding (5), you may be interested in [2] (something I’ve only
>>> looked over briefly).
>>>
>>> [1] http://arxiv.org/pdf/1404.1066v1.pdf
>>> [2] http://arxiv.org/pdf/1301.0082.pdf
>>>
>>>
>>>> From: ted.dunning@gmail.com
>>>> Date: Tue, 21 Oct 2014 17:32:22 -0700
>>>> Subject: Re: Any idea which approaches to non-linear SVM are easily
>>> parallelizable?
>>>> To: dev@mahout.apache.org
>>>>
>>>> Last I heard, the best methods pre-project and do linear SVM.
>>>>
>>>> Beyond that, I would guess that deep learning techniques would subsume
>>>> non-linear SVM pretty easily.  The best parallel implementation I know
>>> for
>>>> that is in H2O.
>>>>
>>>>
>>>>
>>>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>> wrote:
>>>>> in particular, from libSVM --
>>>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
>>>>>
>>>>> thanks.
>>>>> -d
>>>>>
>


RE: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by Andrew Palumbo <ap...@outlook.com>.


I was simply referring to parallelizing the iterative computation of the 
Kernel matrix in the LibSVM code.  I hadn't proposed any decomposition.

Ok, I see- you're talking about the approach taken in PSVM [1], correct? I'm not familiar with this.

I do agree with you and Ted that traditional non-linear SVMs are not really suitable for large datasets- I've had small datasets (~5000 x ~300) take days on a single machine.  The fact that accuracy (and training time) vary so greatly w.r.t. the parameter pair makes a grid search essential, so multiply that time by the granularity of your search and...  However, I don't necessarily agree that they tend to overfit- I guess this depends on the nature of the problem.  In my cases I've often found them highly accurate in out-of-sample classification.

As you and Ted have mentioned, Kernel approximation and projection to a linear problem trained e.g. via SGD seems to be a good and scalable alternative for large sets... though I have no experience with this and have no idea how the accuracy compares.
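
For concreteness, a rough and untested sketch of one such projection- random Fourier features approximating an RBF kernel (Rahimi & Recht)- is below. After the projection, any linear learner (e.g. SGD on hinge loss) stands in for the kernelized SVM; all class and variable names here are just mine for illustration:

    import java.util.Random;

    /** Sketch: random Fourier features z(x) with z(x).z(y) ~= exp(-gamma*||x-y||^2),
     *  so a linear model trained on z(x) approximates an RBF-kernel SVM. */
    public class RandomFourierFeatures {
        private final double[][] w;  // random directions, one row per output feature
        private final double[] b;    // random phases in [0, 2*pi)

        public RandomFourierFeatures(int inputDim, int numFeatures, double gamma, long seed) {
            Random rnd = new Random(seed);
            w = new double[numFeatures][inputDim];
            b = new double[numFeatures];
            for (int i = 0; i < numFeatures; i++) {
                // w_i ~ N(0, 2*gamma*I): the Fourier transform of the RBF kernel
                for (int j = 0; j < inputDim; j++)
                    w[i][j] = rnd.nextGaussian() * Math.sqrt(2.0 * gamma);
                b[i] = rnd.nextDouble() * 2.0 * Math.PI;
            }
        }

        /** Project one input vector; feed the result to any linear learner. */
        public double[] project(double[] x) {
            double[] z = new double[w.length];
            double scale = Math.sqrt(2.0 / w.length);
            for (int i = 0; i < w.length; i++) {
                double dot = 0.0;
                for (int j = 0; j < x.length; j++) dot += w[i][j] * x[j];
                z[i] = scale * Math.cos(dot + b[i]);
            }
            return z;
        }
    }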

[1] http://idning.googlecode.com/svn-history/r711/trunk/paper/reference/google/NIPS2007_0435.pdf


> Date: Thu, 23 Oct 2014 01:58:32 -0400
> From: pc175@uowmail.edu.au
> To: dev@mahout.apache.org
> CC: dlieu.7@gmail.com; ap.dev@outlook.com
> Subject: Re: Any idea which approaches to non-linear SVM are easily parallelizable?
> 
> Yes I am. In fact, my question is just about whether approximation is 
> used to make the total workload of computing the matrix sub-quadratic in 
> the training set size.
> 
> On 10/22/2014 02:21 PM, Andrew Palumbo wrote:
> > Peng, I'm not sure if you were referring to what I wrote:
> >
> >>>>       1.    The Kernel projections
> > if so- I was talking about parallelizing the computation of e.g. the RBF Kernels
> >
> >> Date: Wed, 22 Oct 2014 13:42:45 -0400
> >> From: pc175@uowmail.edu.au
> >> To: dev@mahout.apache.org
> >> CC: dlieu.7@gmail.com
>> Subject: Re: Any idea which approaches to non-linear SVM are easily parallelizable?
> >>
> >> Is the kernel projection referring to online/incremental incomplete
> >> Cholesky decomposition? Sorry I haven't used SVM for a long time and
> >> didn't keep up with SotA.
> >>
> >> If that's true, I haven't found an out-of-the-box implementation, but
> >> this should be easy.
> >>
> >> Yours Peng
> >>
> >> On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
> >>> Andrew, thanks a bunch for the pointers!
> >>>
> >>>
> >>>
> >>> On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> wrote:
> >>>
> >>>> If you do want to stick with SVM-
> >>>>
> >>>> This is a question that I keep coming back to myself, and
> >>>> unfortunately I have forgotten more (and lost more literature) than
> >>>> I’ve retained.
> >>>> I believe that the most easily parallelizable sections of libSVM for
> >>>> small datasets are (for C-SVC(R), RBF and polynomial kernels):
> >>>>
> >>>>       2.    The Hyper-Parameter grid search for C, \gamma (I believe
> >>>> this is now included in LibSVM- I haven’t looked at it in a while)
> >>>>       3.    For multi-class SVC: the concurrent computation of each
> >>>> SVM for each one-against-one class vote.
> >>>>
> >>>> I’m unfamiliar with any easily parallelizable method for the QP
> >>>> itself. Unfortunately, (2) and (3) involve broadcasting the entire
> >>>> dataset out to each node of a cluster (or working in a shared-memory
> >>>> environment), so they may not be practical depending on the size of
> >>>> your data set.  I’ve only ever implemented (2), for relatively small
> >>>> datasets, using MPI and with a pure Java socket implementation.
> >>>>
> >>>> Other approaches (further from simple LibSVM), which are more
> >>>> applicable to large datasets (I’m less familiar with these):
> >>>>
> >>>>       4.    Divide and conquer the QP/SMO problem and solve (as I’ve
> >>>> said, I’m unfamiliar with this and don’t know of any standard
> >>>> approach)
> >>>>
> >>>>       5.    Break the training set into subsets and solve.
> >>>>
> >>>> For (5) there are several approaches; two that I know of are ensemble
> >>>> approaches and those that accumulate Support Vectors from each
> >>>> partition and heuristically keep/reject them until the model
> >>>> converges.  I’ve also just read some research on implementing this in
> >>>> a MapReduce style [2].
> >>>>
> >>>> I came across this paper [1] last night, which you may find
> >>>> interesting as well; it is an interesting comparison of some SVM
> >>>> parallelization strategies. In particular, it discusses (1) for a
> >>>> shared-memory environment and for offloading work to GPUs (using
> >>>> OpenMP and CUDA). It also cites several other nice papers discussing
> >>>> SVM parallelization strategies, especially for (5), and then goes on
> >>>> to discuss a more purely linear-algebra approach to optimizing SVMs
> >>>> (sec. 5).
> >>>>
> >>>> Also regarding (5), you may be interested in [2] (something I’ve only
> >>>> looked over briefly).
> >>>>
> >>>> [1] http://arxiv.org/pdf/1404.1066v1.pdf
> >>>> [2] http://arxiv.org/pdf/1301.0082.pdf
> >>>>
> >>>>
> >>>>> From: ted.dunning@gmail.com
> >>>>> Date: Tue, 21 Oct 2014 17:32:22 -0700
> >>>>> Subject: Re: Any idea which approaches to non-linear SVM are easily
> >>>> parallelizable?
> >>>>> To: dev@mahout.apache.org
> >>>>>
> >>>>> Last I heard, the best methods pre-project and do linear SVM.
> >>>>>
> >>>>> Beyond that, I would guess that deep learning techniques would subsume
> >>>>> non-linear SVM pretty easily.  The best parallel implementation I know
> >>>> for
> >>>>> that is in H2O.
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
> >>>> wrote:
> >>>>>> in particular, from libSVM --
> >>>>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
> >>>>>>
> >>>>>> thanks.
> >>>>>> -d
> >>>>>>
>


Re: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by peng <pc...@uowmail.edu.au>.
Yes I am. In fact, my question is just about whether approximation is 
used to make the total workload of computing the matrix sub-quadratic in 
the training set size.
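
To make that concrete: the Nystrom approximation, for example, evaluates the kernel only against m << n landmark points, so the kernel work is O(n*m) plus a single O(m^3) factorization- sub-quadratic in n for fixed m. A rough, untested sketch (all names are mine):

    /** Sketch of the Nystrom idea: factor the m x m landmark kernel block once,
     *  then map each point to an m-dimensional feature vector phi(x) with
     *  phi(x).phi(y) ~= k(x,y). */
    public class NystromSketch {
        static double rbf(double[] x, double[] z, double gamma) {
            double s = 0.0;
            for (int i = 0; i < x.length; i++) { double d = x[i] - z[i]; s += d * d; }
            return Math.exp(-gamma * s);
        }

        /** Lower-triangular Cholesky factor of a (with a small diagonal jitter). */
        static double[][] cholesky(double[][] a) {
            int m = a.length;
            double[][] l = new double[m][m];
            for (int i = 0; i < m; i++)
                for (int j = 0; j <= i; j++) {
                    double s = a[i][j];
                    for (int k = 0; k < j; k++) s -= l[i][k] * l[j][k];
                    l[i][j] = (i == j) ? Math.sqrt(s + 1e-10) : s / l[j][j];
                }
            return l;
        }

        /** phi(x) = L^{-1} k_m(x), where K_mm = L L^T over the landmarks. */
        static double[] features(double[] x, double[][] landmarks, double[][] l, double gamma) {
            int m = landmarks.length;
            double[] k = new double[m];
            for (int i = 0; i < m; i++) k[i] = rbf(x, landmarks[i], gamma);
            double[] phi = new double[m];
            for (int i = 0; i < m; i++) {           // forward-substitute L * phi = k
                double s = k[i];
                for (int j = 0; j < i; j++) s -= l[i][j] * phi[j];
                phi[i] = s / l[i][i];
            }
            return phi;
        }
    }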

On 10/22/2014 02:21 PM, Andrew Palumbo wrote:
> Peng, I'm not sure if you were referring to what I wrote:
>
>>>>       1.    The Kernel projections
> if so- I was talking about parallelizing the computation of e.g. the RBF Kernels
>
>> Date: Wed, 22 Oct 2014 13:42:45 -0400
>> From: pc175@uowmail.edu.au
>> To: dev@mahout.apache.org
>> CC: dlieu.7@gmail.com
>> Subject: Re: Any idea which approaches to non-linear SVM are easily parallelizable?
>>
>> Is the kernel projection referring to online/incremental incomplete
>> Cholesky decomposition? Sorry I haven't used SVM for a long time and
>> didn't keep up with SotA.
>>
>> If that's true, I haven't found an out-of-the-box implementation, but
>> this should be easy.
>>
>> Yours Peng
>>
>> On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
>>> Andrew, thanks a bunch for the pointers!
>>>
>>>
>>>
>>> On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> wrote:
>>>
>>>> If you do want to stick with SVM-
>>>>
>>>> This is a question that I keep coming back to myself, and
>>>> unfortunately I have forgotten more (and lost more literature) than
>>>> I’ve retained.
>>>> I believe that the most easily parallelizable sections of libSVM for
>>>> small datasets are (for C-SVC(R), RBF and polynomial kernels):
>>>>
>>>>       2.    The Hyper-Parameter grid search for C, \gamma (I believe
>>>> this is now included in LibSVM- I haven’t looked at it in a while)
>>>>       3.    For multi-class SVC: the concurrent computation of each
>>>> SVM for each one-against-one class vote.
>>>>
>>>> I’m unfamiliar with any easily parallelizable method for the QP
>>>> itself. Unfortunately, (2) and (3) involve broadcasting the entire
>>>> dataset out to each node of a cluster (or working in a shared-memory
>>>> environment), so they may not be practical depending on the size of
>>>> your data set.  I’ve only ever implemented (2), for relatively small
>>>> datasets, using MPI and with a pure Java socket implementation.
>>>>
>>>> Other approaches (further from simple LibSVM), which are more
>>>> applicable to large datasets (I’m less familiar with these):
>>>>
>>>>       4.    Divide and conquer the QP/SMO problem and solve (as I’ve
>>>> said, I’m unfamiliar with this and don’t know of any standard
>>>> approach)
>>>>
>>>>       5.    Break the training set into subsets and solve.
>>>>
>>>> For (5) there are several approaches; two that I know of are ensemble
>>>> approaches and those that accumulate Support Vectors from each
>>>> partition and heuristically keep/reject them until the model
>>>> converges.  I’ve also just read some research on implementing this in
>>>> a MapReduce style [2].
>>>>
>>>> I came across this paper [1] last night, which you may find
>>>> interesting as well; it is an interesting comparison of some SVM
>>>> parallelization strategies. In particular, it discusses (1) for a
>>>> shared-memory environment and for offloading work to GPUs (using
>>>> OpenMP and CUDA). It also cites several other nice papers discussing
>>>> SVM parallelization strategies, especially for (5), and then goes on
>>>> to discuss a more purely linear-algebra approach to optimizing SVMs
>>>> (sec. 5).
>>>>
>>>> Also regarding (5), you may be interested in [2] (something I’ve only
>>>> looked over briefly).
>>>>
>>>> [1] http://arxiv.org/pdf/1404.1066v1.pdf
>>>> [2] http://arxiv.org/pdf/1301.0082.pdf
>>>>
>>>>
>>>>> From: ted.dunning@gmail.com
>>>>> Date: Tue, 21 Oct 2014 17:32:22 -0700
>>>>> Subject: Re: Any idea which approaches to non-linear SVM are easily
>>>> parallelizable?
>>>>> To: dev@mahout.apache.org
>>>>>
>>>>> Last I heard, the best methods pre-project and do linear SVM.
>>>>>
>>>>> Beyond that, I would guess that deep learning techniques would subsume
>>>>> non-linear SVM pretty easily.  The best parallel implementation I know
>>>> for
>>>>> that is in H2O.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
>>>> wrote:
>>>>>> in particular, from libSVM --
>>>>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
>>>>>>
>>>>>> thanks.
>>>>>> -d
>>>>>>
>


RE: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by Andrew Palumbo <ap...@outlook.com>.
Peng, I'm not sure if you were referring to what I wrote:

> >>      1.    The Kernel projections

if so- I was talking about parallelizing the computation of e.g. the RBF Kernels
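
Concretely, something like the rough single-machine sketch below (untested; Java 8 parallel streams- on a cluster each node would compute a block of rows against a broadcast copy of the data):

    import java.util.stream.IntStream;

    /** Sketch: rows of the Gram matrix K[i][j] = exp(-gamma*||x_i - x_j||^2)
     *  are independent, so they can be farmed out to threads or nodes. */
    public class ParallelRbfKernel {
        public static double[][] gram(double[][] x, double gamma) {
            int n = x.length;
            double[][] k = new double[n][n];
            IntStream.range(0, n).parallel().forEach(i -> {
                for (int j = 0; j <= i; j++) {
                    double s = 0.0;
                    for (int d = 0; d < x[i].length; d++) {
                        double diff = x[i][d] - x[j][d];
                        s += diff * diff;
                    }
                    k[i][j] = Math.exp(-gamma * s);
                    k[j][i] = k[i][j];  // symmetric; each cell is written by exactly one row task
                }
            });
            return k;
        }
    }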

> Date: Wed, 22 Oct 2014 13:42:45 -0400
> From: pc175@uowmail.edu.au
> To: dev@mahout.apache.org
> CC: dlieu.7@gmail.com
> Subject: Re: Any idea which approaches to non-linear SVM are easily parallelizable?
> 
> Is the kernel projection referring to online/incremental incomplete 
> Cholesky decomposition? Sorry I haven't used SVM for a long time and 
> didn't keep up with SotA.
> 
> If that's true, I haven't found an out-of-the-box implementation, but 
> this should be easy.
> 
> Yours Peng
> 
> On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
> > Andrew, thanks a bunch for the pointers!
> >
> >
> >
> > On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> wrote:
> >
> >> If you do want to stick with SVM-
> >>
> >> This is a question that I keep coming back to myself, and
> >> unfortunately I have forgotten more (and lost more literature) than
> >> I’ve retained.
> >> I believe that the most easily parallelizable sections of libSVM for
> >> small datasets are (for C-SVC(R), RBF and polynomial kernels):
> >>
> >>      2.    The Hyper-Parameter grid search for C, \gamma (I believe
> >> this is now included in LibSVM- I haven’t looked at it in a while)
> >>      3.    For multi-class SVC: the concurrent computation of each SVM
> >> for each one-against-one class vote.
> >>
> >> I’m unfamiliar with any easily parallelizable method for the QP
> >> itself. Unfortunately, (2) and (3) involve broadcasting the entire
> >> dataset out to each node of a cluster (or working in a shared-memory
> >> environment), so they may not be practical depending on the size of
> >> your data set.  I’ve only ever implemented (2), for relatively small
> >> datasets, using MPI and with a pure Java socket implementation.
> >>
> >> Other approaches (further from simple LibSVM), which are more
> >> applicable to large datasets (I’m less familiar with these):
> >>
> >>      4.    Divide and conquer the QP/SMO problem and solve (as I’ve
> >> said, I’m unfamiliar with this and don’t know of any standard
> >> approach)
> >>
> >>      5.    Break the training set into subsets and solve.
> >>
> >> For (5) there are several approaches; two that I know of are ensemble
> >> approaches and those that accumulate Support Vectors from each
> >> partition and heuristically keep/reject them until the model
> >> converges.  I’ve also just read some research on implementing this in
> >> a MapReduce style [2].
> >>
> >> I came across this paper [1] last night, which you may find
> >> interesting as well; it is an interesting comparison of some SVM
> >> parallelization strategies. In particular, it discusses (1) for a
> >> shared-memory environment and for offloading work to GPUs (using
> >> OpenMP and CUDA). It also cites several other nice papers discussing
> >> SVM parallelization strategies, especially for (5), and then goes on
> >> to discuss a more purely linear-algebra approach to optimizing SVMs
> >> (sec. 5).
> >>
> >> Also regarding (5), you may be interested in [2] (something I’ve only
> >> looked over briefly).
> >>
> >> [1] http://arxiv.org/pdf/1404.1066v1.pdf
> >> [2] http://arxiv.org/pdf/1301.0082.pdf
> >>
> >>
> >>> From: ted.dunning@gmail.com
> >>> Date: Tue, 21 Oct 2014 17:32:22 -0700
> >>> Subject: Re: Any idea which approaches to non-linear SVM are easily
> >> parallelizable?
> >>> To: dev@mahout.apache.org
> >>>
> >>> Last I heard, the best methods pre-project and do linear SVM.
> >>>
> >>> Beyond that, I would guess that deep learning techniques would subsume
> >>> non-linear SVM pretty easily.  The best parallel implementation I know
> >> for
> >>> that is in H2O.
> >>>
> >>>
> >>>
> >>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
> >> wrote:
> >>>> in particular, from libSVM --
> >>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
> >>>>
> >>>> thanks.
> >>>> -d
> >>>>
>

Re: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by peng <pc...@uowmail.edu.au>.
Is the kernel projection referring to online/incremental incomplete 
Cholesky decomposition? Sorry I haven't used SVM for a long time and 
didn't keep up with SotA.

If that's true, I haven't found an out-of-the-box implementation, but 
this should be easy.
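
Roughly what I have in mind- an untested sketch of the pivoted batch variant rather than a truly online one; it evaluates only the kernel entries for the columns it actually keeps, and all names are mine:

    /** Sketch of pivoted incomplete Cholesky: K ~= L * L^T, built one column
     *  at a time and stopped when the trace of the residual falls below tol. */
    public class IncompleteCholesky {
        interface Kernel { double eval(int i, int j); }

        /** Returns an n x maxRank lower factor L (unused columns stay zero). */
        public static double[][] factor(int n, Kernel kern, int maxRank, double tol) {
            double[][] l = new double[n][maxRank];
            double[] diag = new double[n];
            int[] perm = new int[n];
            double residual = 0.0;
            for (int i = 0; i < n; i++) {
                diag[i] = kern.eval(i, i);
                perm[i] = i;
                residual += diag[i];
            }
            for (int rank = 0; rank < maxRank && residual > tol; rank++) {
                int piv = rank;  // pivot = largest remaining diagonal residual
                for (int i = rank + 1; i < n; i++)
                    if (diag[perm[i]] > diag[perm[piv]]) piv = i;
                int tmp = perm[rank]; perm[rank] = perm[piv]; perm[piv] = tmp;

                int p = perm[rank];
                double pivVal = Math.sqrt(Math.max(diag[p], 0.0));
                if (pivVal < 1e-8) break;            // numerically converged
                l[p][rank] = pivVal;
                residual = 0.0;
                for (int i = rank + 1; i < n; i++) {
                    int q = perm[i];
                    double s = kern.eval(q, p);      // only this column's kernel entries
                    for (int j = 0; j < rank; j++) s -= l[q][j] * l[p][j];
                    l[q][rank] = s / pivVal;
                    diag[q] -= l[q][rank] * l[q][rank];
                    residual += diag[q];
                }
            }
            return l;
        }
    }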

Yours Peng

On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
> Andrew, thanks a bunch for the pointers!
>
>
>
> On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> wrote:
>
>> If you do want to stick with SVM-
>>
>> This is a question that I keep coming back to myself, and unfortunately
>> I have forgotten more (and lost more literature) than I’ve retained.
>> I believe that the most easily parallelizable sections of libSVM for
>> small datasets are (for C-SVC(R), RBF and polynomial kernels):
>>
>>      1.    The Kernel projections
>>      2.    The Hyper-Parameter grid search for C, \gamma (I believe this
>> is now included in LibSVM- I haven’t looked at it in a while)
>>      3.    For multi-class SVC: the concurrent computation of each SVM
>> for each one-against-one class vote.
>>
>> I’m unfamiliar with any easily parallelizable method for the QP itself.
>> Unfortunately, (2) and (3) involve broadcasting the entire dataset out
>> to each node of a cluster (or working in a shared-memory environment),
>> so they may not be practical depending on the size of your data set.
>> I’ve only ever implemented (2), for relatively small datasets, using
>> MPI and with a pure Java socket implementation.
>>
>> Other approaches (further from simple LibSVM), which are more
>> applicable to large datasets (I’m less familiar with these):
>>
>>      4.    Divide and conquer the QP/SMO problem and solve (as I’ve
>> said, I’m unfamiliar with this and don’t know of any standard approach)
>>
>>      5.    Break the training set into subsets and solve.
>>
>> For (5) there are several approaches; two that I know of are ensemble
>> approaches and those that accumulate Support Vectors from each
>> partition and heuristically keep/reject them until the model converges.
>> I’ve also just read some research on implementing this in a MapReduce
>> style [2].
>>
>> I came across this paper [1] last night, which you may find interesting
>> as well; it is an interesting comparison of some SVM parallelization
>> strategies. In particular, it discusses (1) for a shared-memory
>> environment and for offloading work to GPUs (using OpenMP and CUDA).
>> It also cites several other nice papers discussing SVM parallelization
>> strategies, especially for (5), and then goes on to discuss a more
>> purely linear-algebra approach to optimizing SVMs (sec. 5).
>>
>> Also regarding (5), you may be interested in [2] (something I’ve only
>> looked over briefly).
>>
>> [1] http://arxiv.org/pdf/1404.1066v1.pdf
>> [2] http://arxiv.org/pdf/1301.0082.pdf
>>
>>
>>> From: ted.dunning@gmail.com
>>> Date: Tue, 21 Oct 2014 17:32:22 -0700
>>> Subject: Re: Any idea which approaches to non-linear SVM are easily
>> parallelizable?
>>> To: dev@mahout.apache.org
>>>
>>> Last I heard, the best methods pre-project and do linear SVM.
>>>
>>> Beyond that, I would guess that deep learning techniques would subsume
>>> non-linear SVM pretty easily.  The best parallel implementation I know
>> for
>>> that is in H2O.
>>>
>>>
>>>
>>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>>>> in particular, from libSVM --
>>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
>>>>
>>>> thanks.
>>>> -d
>>>>


Re: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Andrew, thanks a bunch for the pointers!



On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap...@outlook.com> wrote:

> If you do want to stick with SVM-
>
> This is a question that I keep coming back to myself, and unfortunately
> I have forgotten more (and lost more literature) than I’ve retained.
> I believe that the most easily parallelizable sections of libSVM for
> small datasets are (for C-SVC(R), RBF and polynomial kernels):
>
>     1.    The Kernel projections
>     2.    The Hyper-Parameter grid search for C, \gamma (I believe this
> is now included in LibSVM- I haven’t looked at it in a while)
>     3.    For multi-class SVC: the concurrent computation of each SVM
> for each one-against-one class vote.
>
> I’m unfamiliar with any easily parallelizable method for the QP itself.
> Unfortunately, (2) and (3) involve broadcasting the entire dataset out
> to each node of a cluster (or working in a shared-memory environment),
> so they may not be practical depending on the size of your data set.
> I’ve only ever implemented (2), for relatively small datasets, using
> MPI and with a pure Java socket implementation.
>
> Other approaches (further from simple LibSVM), which are more
> applicable to large datasets (I’m less familiar with these):
>
>     4.    Divide and conquer the QP/SMO problem and solve (as I’ve
> said, I’m unfamiliar with this and don’t know of any standard approach)
>
>     5.    Break the training set into subsets and solve.
>
> For (5) there are several approaches; two that I know of are ensemble
> approaches and those that accumulate Support Vectors from each
> partition and heuristically keep/reject them until the model converges.
> I’ve also just read some research on implementing this in a MapReduce
> style [2].
>
> I came across this paper [1] last night, which you may find interesting
> as well; it is an interesting comparison of some SVM parallelization
> strategies. In particular, it discusses (1) for a shared-memory
> environment and for offloading work to GPUs (using OpenMP and CUDA).
> It also cites several other nice papers discussing SVM parallelization
> strategies, especially for (5), and then goes on to discuss a more
> purely linear-algebra approach to optimizing SVMs (sec. 5).
>
> Also regarding (5), you may be interested in [2] (something I’ve only
> looked over briefly).
>
> [1] http://arxiv.org/pdf/1404.1066v1.pdf
> [2] http://arxiv.org/pdf/1301.0082.pdf
>
>
> > From: ted.dunning@gmail.com
> > Date: Tue, 21 Oct 2014 17:32:22 -0700
> > Subject: Re: Any idea which approaches to non-linear SVM are easily
> parallelizable?
> > To: dev@mahout.apache.org
> >
> > Last I heard, the best methods pre-project and do linear SVM.
> >
> > Beyond that, I would guess that deep learning techniques would subsume
> > non-linear SVM pretty easily.  The best parallel implementation I know
> for
> > that is in H2O.
> >
> >
> >
> > On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >
> > > in particular, from libSVM --
> > > http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
> > >
> > > thanks.
> > > -d
> > >
>
>

RE: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by Andrew Palumbo <ap...@outlook.com>.
If you do want to stick with SVM-

This is a question that I keep coming back to myself, and unfortunately I have forgotten more (and lost more literature) than I’ve retained.
I believe that the most easily parallelizable sections of libSVM for small datasets are (for C-SVC(R), RBF and polynomial kernels):

    1.    The Kernel projections
    2.    The Hyper-Parameter grid search for C, \gamma (I believe this is now included in LibSVM- I haven’t looked at it in a while)
    3.    For multi-class SVC: the concurrent computation of each SVM for each one-against-one class vote.

I’m unfamiliar with any easily parallelizable method for the QP itself.  Unfortunately, (2) and (3) involve broadcasting the entire dataset out to each node of a cluster (or working in a shared-memory environment), so they may not be practical depending on the size of your data set.  I’ve only ever implemented (2), for relatively small datasets, using MPI and with a pure Java socket implementation.
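
For concreteness, (2) can be parallelized along the lines of the rough, untested sketch below, where crossValidate() is just a placeholder for whatever single-node trainer gets wrapped (e.g. a LibSVM binding):

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    /** Sketch: every (C, gamma) cell of the grid is an independent
     *  train/validate job, provided each worker sees the whole dataset. */
    public class GridSearchSketch {
        static class Cell {
            final double c, gamma, accuracy;
            Cell(double c, double gamma, double accuracy) {
                this.c = c; this.gamma = gamma; this.accuracy = accuracy;
            }
        }

        // placeholder: train with (c, gamma) on the broadcast data, return CV accuracy
        static double crossValidate(double c, double gamma) {
            throw new UnsupportedOperationException("plug in a single-node trainer");
        }

        public static Cell search(double[] cs, double[] gammas) {
            List<Cell> grid = new ArrayList<>();
            for (double c : cs)
                for (double g : gammas)
                    grid.add(new Cell(c, g, Double.NaN));
            return grid.parallelStream()  // or ship each pair to a cluster node
                       .map(cell -> new Cell(cell.c, cell.gamma,
                                             crossValidate(cell.c, cell.gamma)))
                       .max(Comparator.comparingDouble((Cell cell) -> cell.accuracy))
                       .get();
        }
    }

The same pattern covers (3): the one-against-one trainings are independent in exactly the same way.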

Other approaches (further from simple LibSVM), which are more applicable to large datasets (I’m less familiar with these):

    4.    Divide and conquer the QP/SMO problem and solve (as I’ve said, I’m unfamiliar with this and don’t know of any standard approach)

    5.    Break the training set into subsets and solve.

For (5) there are several approaches; two that I know of are ensemble approaches and those that accumulate Support Vectors from each partition and heuristically keep/reject them until the model converges.  I’ve also just read some research on implementing this in a MapReduce style [2].
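
A rough, untested sketch of one round of the Support-Vector-accumulation flavor (trainAndGetSVs() is a placeholder for a single-node solver such as a LibSVM call):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    /** Sketch: solve each partition independently, pool the resulting
     *  support vectors, and retrain on the (much smaller) pool, repeating
     *  rounds until the SV set stabilizes. */
    public class SubsetSvmSketch {
        // placeholder: solve the QP on one chunk, return the rows kept as SVs
        static List<double[]> trainAndGetSVs(List<double[]> chunk) {
            throw new UnsupportedOperationException("plug in a single-node solver");
        }

        static List<double[]> round(List<List<double[]>> partitions) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
            List<Future<List<double[]>>> jobs = new ArrayList<>();
            for (List<double[]> part : partitions)
                jobs.add(pool.submit(() -> trainAndGetSVs(part)));  // one solver per chunk
            List<double[]> pooled = new ArrayList<>();
            for (Future<List<double[]>> job : jobs) pooled.addAll(job.get());
            pool.shutdown();
            return pooled;  // feed into the next round until the SV set stabilizes
        }
    }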

I came across this paper [1] last night, which you may find interesting as well; it is an interesting comparison of some SVM parallelization strategies. In particular, it discusses (1) for a shared-memory environment and for offloading work to GPUs (using OpenMP and CUDA). It also cites several other nice papers discussing SVM parallelization strategies, especially for (5), and then goes on to discuss a more purely linear-algebra approach to optimizing SVMs (sec. 5).

Also regarding (5), you may be interested in [2] (something I’ve only looked over briefly).

[1] http://arxiv.org/pdf/1404.1066v1.pdf
[2] http://arxiv.org/pdf/1301.0082.pdf


> From: ted.dunning@gmail.com
> Date: Tue, 21 Oct 2014 17:32:22 -0700
> Subject: Re: Any idea which approaches to non-linear SVM are easily parallelizable?
> To: dev@mahout.apache.org
> 
> Last I heard, the best methods pre-project and do linear SVM.
> 
> Beyond that, I would guess that deep learning techniques would subsume
> non-linear SVM pretty easily.  The best parallel implementation I know for
> that is in H2O.
> 
> 
> 
> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> 
> > in particular, from libSVM --
> > http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
> >
> > thanks.
> > -d
> >

Re: Any idea which approaches to non-linear SVM are easily parallelizable?

Posted by Ted Dunning <te...@gmail.com>.
Last I heard, the best methods pre-project and do linear SVM.

Beyond that, I would guess that deep learning techniques would subsume
non-linear SVM pretty easily.  The best parallel implementation I know for
that is in H2O.



On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> in particular, from libSVM --
> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
>
> thanks.
> -d
>