
Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/04/13 22:42:25 UTC

[jira] [Created] (MADLIB-990) SVM - novelty detection using 1-class SVM

Frank McQuillan created MADLIB-990:
--------------------------------------

             Summary: SVM - novelty detection using 1-class SVM
                 Key: MADLIB-990
                 URL: https://issues.apache.org/jira/browse/MADLIB-990
             Project: Apache MADlib
          Issue Type: New Feature
            Reporter: Frank McQuillan


Story

As a data scientist, I want to use a one-class SVM so that I can decide whether a new observation belongs to the same distribution as existing observations (an inlier) or should be considered different (an outlier).

Acceptance

1) One-class SVM implemented with all supported kernel types (linear, Gaussian, polynomial).
2) Output a T/F for not-novel/novel.

Note

a) Similar to the e1071 R package [1] with type=one-classification (for novelty detection).

b) There is an important distinction between novelty detection (this story) and outlier detection for cleaning training data.  From reference [2]:

* novelty detection:  the training data is not polluted by outliers, and we are interested in detecting anomalies in new observations. <- this story
* outlier detection:  the training data contains outliers, and we need to fit the central mode of the training data, ignoring the deviant observations. <- we are *not* trying to solve this unsupervised learning problem in this story.
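For illustration only, the novelty-detection workflow described above can be sketched with scikit-learn's OneClassSVM (the library behind reference [2]). This is not the MADlib interface, which this story has yet to define. The synthetic data, the nu value, and the names is_novel and results are assumptions made for the example.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Training data assumed to be clean (no outliers), as in novelty detection.
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# New observations to classify: one near the training mass, one far away.
X_new = np.array([[0.1, -0.2], [6.0, 6.0]])


def is_novel(kernel):
    """Fit a one-class SVM on X_train and flag each row of X_new.

    Returns a boolean array: True means novel, False means not novel.
    nu=0.05 is an arbitrary choice for this sketch.
    """
    model = OneClassSVM(kernel=kernel, nu=0.05, gamma="scale")
    model.fit(X_train)
    # predict() returns +1 for inliers and -1 for outliers.
    return model.predict(X_new) == -1


# scikit-learn names the Gaussian kernel "rbf" and the polynomial kernel "poly".
results = {kernel: is_novel(kernel) for kernel in ("linear", "rbf", "poly")}

for kernel, flags in results.items():
    print(kernel, flags.tolist())
```

With the Gaussian kernel, the far-away point is flagged as novel and the point near the training mass is not. The linear and polynomial kernels are included only to mirror the acceptance criteria; their decisions on this data are not claimed here.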

References

[1] e1071 R package
https://cran.r-project.org/web/packages/e1071/index.html

[2] Difference between novelty and outlier detection
http://scikit-learn.org/stable/modules/outlier_detection.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)