Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/03/09 01:17:40 UTC

[jira] [Updated] (MADLIB-604) SVM Regression Performance : Several data sets handling is much slower than libsvm

     [ https://issues.apache.org/jira/browse/MADLIB-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan updated MADLIB-604:
-----------------------------------
    Attachment: svm-regression-benchmarking.jpg

New SVM regression algorithm benchmark attached.

> SVM Regression Performance : Several data sets handling is much slower than libsvm
> ----------------------------------------------------------------------------------
>
>                 Key: MADLIB-604
>                 URL: https://issues.apache.org/jira/browse/MADLIB-604
>             Project: Apache MADlib
>          Issue Type: Bug
>            Reporter: Jiali Yao
>            Assignee: Rahul Iyer
>             Fix For: v1.9
>
>         Attachments: svm-regression-benchmarking.jpg
>
>
> For several data sets, MADlib is slower than libsvm.
> 1. Time difference
> {code}
> Kernel is dot
> Data Sets   MADlib(Para=true)   MADlib(Para=false)   libsvm   MADlib/libsvm
> cadata      874.15              277.35               2        138.68
> etfidf      501.61              1844.26              32       15.68
> Kernel is polynomial
> cadata      932.13              8979.85              2761     0.34
> etfidf      2269.23             3175.87              33       68.76
> space       139.12              238.26               1        139.12
> Kernel is Gaussian
> cadata      900.83              9130.2               1        900.83
> cpusmall    390.57              196.13               1        196.13
> {code}
> 2. Test case example:
> {code}
> SELECT madlib.svm_regression
>                         ( 'madlibtestdata.svm_cadata'::text     --input_table
>                         , 'madlibtestresult.reg_model_table'::text    --model_table
>                         , 'false'::boolean       --parallel
>                         , 'madlibtestdata.svm_polynomial'::text    --kernel_func
>                         , 'false'::boolean        --verbose
>                         , '0.1'::float8            --eta
>                         , '0.005'::float8             --nu
>                         , '0.05'::float8        --slambda
>                    ) AS q;
> {code}
> 3. Data sets
> 4. Parameter settings
> MADlib parameter: default value
> R parameter:
> {code}
> svm-train  -s 4 -t 0 -c $cost -n 0.005 
> eunite2001 0.0595
> E2006 0.0012 
> {code}
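
The MADlib/libsvm column in the timing table is not defined in the issue. The reported values are consistent with dividing the faster of the two MADlib runs (Para=true or Para=false) by the libsvm time, so the following is a minimal sketch under that assumption. The function name `slowdown` and the row layout are illustrative only, and the timings are copied from the table as given (units are not stated there).

```python
# Timings copied from the issue's table; units are not stated in the issue.
# Row layout (hypothetical): kernel, data set, MADlib Para=true, MADlib Para=false, libsvm
ROWS = [
    ("dot", "cadata", 874.15, 277.35, 2),
    ("dot", "etfidf", 501.61, 1844.26, 32),
    ("polynomial", "cadata", 932.13, 8979.85, 2761),
    ("polynomial", "etfidf", 2269.23, 3175.87, 33),
    ("polynomial", "space", 139.12, 238.26, 1),
    ("gaussian", "cadata", 900.83, 9130.2, 1),
    ("gaussian", "cpusmall", 390.57, 196.13, 1),
]


def slowdown(para_true, para_false, libsvm):
    """Assumed definition: faster MADlib run divided by the libsvm time."""
    return min(para_true, para_false) / libsvm


if __name__ == "__main__":
    for kernel, name, t_true, t_false, t_libsvm in ROWS:
        ratio = slowdown(t_true, t_false, t_libsvm)
        print(f"{kernel:<10} {name:<8} {ratio:8.2f}")
```

A ratio above 1 means MADlib took longer than libsvm on that data set; the polynomial-kernel cadata row is the only one in the table where the ratio falls below 1.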


