Posted to dev@pig.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2008/04/04 10:51:24 UTC

[jira] Resolved: (PIG-185) Using cached data does not give me the expected result

     [ https://issues.apache.org/jira/browse/PIG-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved PIG-185.
-------------------------------

    Resolution: Invalid

Xu, cache() doesn't work like ship().

You need to pass cache(<filename>#<linkname>), and then you can use ./<linkname> in your streaming command.
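
Applied to the script from the report, the define statement would look roughly like the sketch below. This is untested; the link name studenttab10k is an arbitrary choice for illustration, and any name should work as long as the command refers to the same ./<linkname>. The rest of the script stays unchanged.

{code}
-- '#studenttab10k' names the link for the cached file (link name chosen here for illustration);
-- the streaming command then reads it as ./studenttab10k instead of the original path.
define X `perl -ne 'chomp $_; print "$_\n"' - ./studenttab10k` cache('/user/pig/tests/data/singlefile/studenttab10k#studenttab10k');
A = load '/user/pig/tests/data/singlefile/studenttab10k';
B = stream A through X as (name, age, gpa);
{code}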

> Using cached data does not give me the expected result
> ------------------------------------------------------
>
>                 Key: PIG-185
>                 URL: https://issues.apache.org/jira/browse/PIG-185
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Xu Zhang
>            Assignee: Arun C Murthy
>
> I was trying to run the following Pig script with the latest Pig code. Since I was essentially streaming two identical sets of data, I expected the final result, which is the count of the name field, to contain only even numbers. However, lots of odd numbers showed up in the actual result.
> {code}
> define X `perl -ne 'chomp $_; print "$_\n"' - ./user/pig/tests/data/singlefile/studenttab10k` cache('/user/pig/tests/data/singlefile/studenttab10k');
> A = load '/user/pig/tests/data/singlefile/studenttab10k';
> B = stream A through X as (name, age, gpa);
> C = group B by name;
> D = foreach C generate COUNT(B.$0);
> store D into 'results_22';
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.