Posted to dev@pig.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2011/03/22 17:53:05 UTC
[jira] [Commented] (PIG-1926) Sample/Limit should take scalar
[ https://issues.apache.org/jira/browse/PIG-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009714#comment-13009714 ]
Gianmarco De Francisci Morales commented on PIG-1926:
-----------------------------------------------------
This looks quite useful.
If Ciemiewicz is already working on Reservoir sampling, I think I will apply for this one.
I assume there will probably be some integration point between the two projects.
> Sample/Limit should take scalar
> -------------------------------
>
> Key: PIG-1926
> URL: https://issues.apache.org/jira/browse/PIG-1926
> Project: Pig
> Issue Type: Improvement
> Reporter: Daniel Dai
> Labels: gsoc2011
>
> Currently, Limit and Sample only take a constant. It would be better if we could use a scalar in place of the constant. E.g.:
> {code}
> a = load 'a.txt';
> b = group a all;
> c = foreach b generate COUNT(a) as sum;
> d = order a by $0;
> e = limit d c.sum/100;
> {code}
> This is a candidate project for Google Summer of Code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011