You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@madlib.apache.org by mktal <gi...@git.apache.org> on 2016/02/23 07:46:30 UTC

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

GitHub user mktal opened a pull request:

    https://github.com/apache/incubator-madlib/pull/22

    SVM: Add Polynomial Kernel

    JIRA: MADLIB-938
    
    The features are transformed using Kar and Karnick's appoach, which applies to any positive definite
    kernel where the kernel function admits a Maclaurin expansion. This commit implements the special case
    when the kernel function is polynomial.
    
    @orhankislal Please review and comment

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mktal/incubator-madlib feature/svm_poly

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/22.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22
    
----
commit b1d6c642fc37a353adefd9bb728a20e68ea16621
Author: Xiaocheng Tang <xi...@gmail.com>
Date:   2015-12-22T20:36:23Z

    SVM: Add Polynomial Kernel
    
    JIRA: MADLIB-938
    
    The features are transformed using Kar and Karnick's appoach, which applies to any positive definite
    kernel where the kernel function admits a Maclaurin expansion. This commit implements the special case
    when the kernel function is polynomial.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/22#issuecomment-189446520
  
    @iyerr3 Please review and merge


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/22#issuecomment-190332127
  
    I merged this (commit: 3e576c3) but forgot to close the PR in the commit message. 
    
    @mktal Please close this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54290851
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    Agree with @decibel . Since array_mult/prod is not being used in SVM, we can remove the change from here and have a separate discussion for it.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54285600
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    Generally I try to avoid treating NULL as `zero` in multiplication because it produces zero even though only one entry of your data is NULL. Having said that, in numpy `np.nan*2 = np.nan`. But  `np.nan + 2` also results in np.nan.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by decibel <gi...@git.apache.org>.
Github user decibel commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54286578
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    This is very much against the database view of how NULLs work. Almost anything you do with NULL should result in NULL.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal closed the pull request at:

    https://github.com/apache/incubator-madlib/pull/22


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54282752
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    Not clear why this change is needed - this is saying that NULL * X = X, which indicates that NULL is treated as identity. The database does not make that assumption. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54284271
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    This change is made so that array_mult can be used as SFUNC in UDA, e.g., madlib.prod() to make sure that the null state is initialized with the first tuple in the table. This is consistent with array_add that NULL + X = X


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/22#discussion_r54284483
  
    --- Diff: methods/array_ops/src/pg_gp/array_ops.c ---
    @@ -702,8 +702,14 @@ array_sub(PG_FUNCTION_ARGS){
     PG_FUNCTION_INFO_V1(array_mult);
     Datum
     array_mult(PG_FUNCTION_ARGS){
    -    if (PG_ARGISNULL(0)) { PG_RETURN_NULL(); }
    -    if (PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    +    if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) { PG_RETURN_NULL(); }
    --- End diff --
    
    Well, I agree that NULL being treated as identity is a little weird. In that sense it is not consistent with array_add


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/22#issuecomment-189421035
  
    Well, I agree that NULL being treated as identity is a little weird. In that sense it is not consistent with array_add


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/22#issuecomment-189025801
  
    This should fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-madlib pull request: SVM: Add Polynomial Kernel

Posted by mktal <gi...@git.apache.org>.
Github user mktal commented on the pull request:

    https://github.com/apache/incubator-madlib/pull/22#issuecomment-189412184
  
    @iyerr3 Please review and merge


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---