Posted to dev@pig.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2011/03/22 17:53:05 UTC

[jira] [Commented] (PIG-1926) Sample/Limit should take scalar

    [ https://issues.apache.org/jira/browse/PIG-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009714#comment-13009714 ] 

Gianmarco De Francisci Morales commented on PIG-1926:
-----------------------------------------------------

This looks quite useful.
If Ciemiewicz is already working on reservoir sampling, I think I will apply for this one.
I assume the two projects will share some integration points.

> Sample/Limit should take scalar
> -------------------------------
>
>                 Key: PIG-1926
>                 URL: https://issues.apache.org/jira/browse/PIG-1926
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>              Labels: gsoc2011
>
> Currently, Limit and Sample only take a constant. It would be better if we could use a scalar in place of the constant. E.g.:
> {code}
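> a = load 'a.txt';
> b = group a all;                         -- collect every record into a single group
> c = foreach b generate COUNT(a) as sum;  -- one-row relation, usable as a scalar
> d = order a by $0;
> e = limit d c.sum/100;                   -- keep roughly the first 1% of the sorted records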
> {code}
> This is a candidate project for Google Summer of Code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011
