Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2017/08/30 23:08:00 UTC

[jira] [Created] (MADLIB-1156) Improve and promote sampling algorithms to top level modules

Frank McQuillan created MADLIB-1156:
---------------------------------------

             Summary: Improve and promote sampling algorithms to top level modules
                 Key: MADLIB-1156
                 URL: https://issues.apache.org/jira/browse/MADLIB-1156
             Project: Apache MADlib
          Issue Type: Improvement
          Components: Module: Sampling
            Reporter: Frank McQuillan
             Fix For: v2.0


Story

As a MADlib user, I want to sample a data table using the different techniques and distributions described in references [1] and [2], so that I can build models from the sampled data sets. I also want these algorithms to be properly documented and tested.

Candidate for 2.0, since this may involve interface changes.

Acceptance

1) Are these algorithms ready to promote from a software quality perspective?
2) Define and implement any required interface changes.
3) Define and implement any required IC and Tinc tests.
4) Write documentation and provide examples.

References

[1] Existing MADlib sample function
http://madlib.incubator.apache.org/docs/latest/group__grp__sample.html

[2] Other MADlib sample functions
http://madlib.incubator.apache.org/docs/latest/sample_8sql__in.html


