Posted to dev@pig.apache.org by "Richard Ding (JIRA)" <ji...@apache.org> on 2010/08/02 21:14:16 UTC
[jira] Commented: (PIG-1434) Allow casting relations to scalars
[ https://issues.apache.org/jira/browse/PIG-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894657#action_12894657 ]
Richard Ding commented on PIG-1434:
-----------------------------------
It looks really good. Additional corner cases need to be considered:
* missing field in scalar file
* empty scalar file
* empty input directory
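
To make the three cases above concrete, here is a minimal Java sketch. It is not code from the patch: ScalarFileReader and readScalar are hypothetical names, and the sketch assumes the scalar is stored as one tab-delimited row in a directory of part files. Returning null in each corner case is only one possible behaviour (reporting an error would be another); the multi-row error follows the issue description.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: reads one field of a stored scalar.
final class ScalarFileReader {

    /**
     * Returns field `fieldIndex` of the single tab-delimited row found under `dir`.
     * Assumption: an empty or missing directory, an empty file, and a missing
     * field all yield null. More than one row is reported as an error.
     */
    static String readScalar(Path dir, int fieldIndex) throws IOException {
        if (!Files.isDirectory(dir)) {
            return null; // input directory does not exist
        }
        List<String> rows = new ArrayList<>();
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(dir)) {
            for (Path part : parts) {
                if (Files.isRegularFile(part)) {
                    rows.addAll(Files.readAllLines(part)); // an empty file adds nothing
                }
            }
        }
        if (rows.isEmpty()) {
            return null; // empty input directory or empty scalar file
        }
        if (rows.size() > 1) {
            throw new IllegalStateException(
                "Scalar has more than one row: " + rows.size());
        }
        String[] fields = rows.get(0).split("\t", -1);
        boolean present = fieldIndex >= 0 && fieldIndex < fields.length;
        return present ? fields[fieldIndex] : null; // missing field in scalar file
    }
}
```

Whatever behaviour is chosen for each case, a test per case would pin it down.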
Some minor issues:
* In PigServer, the FileLocalizer.getTemporaryPath call should be changed to use the non-deprecated method.
* In ScalarFinder, method isScalarPresent is not used.
* In MRCompiler, variable scalarPhyFinder should be local so that ScalarPhyFinder can be simplified.
Adding a test case that uses a scalar in a multi-query script would also be good.
> Allow casting relations to scalars
> ----------------------------------
>
> Key: PIG-1434
> URL: https://issues.apache.org/jira/browse/PIG-1434
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Aniket Mokashi
> Fix For: 0.8.0
>
> Attachments: scalarImpl.patch, ScalarImpl1.patch, ScalarImpl5.patch, ScalarImplFinale.patch
>
>
> This jira is to implement a simplified version of the functionality described in https://issues.apache.org/jira/browse/PIG-801.
> The proposal is to allow casting relations to scalar types in foreach.
> Example:
> A = load 'data' as (x, y, z);
> B = group A all;
> C = foreach B generate COUNT(A);
> .....
> X = ....
> Y = foreach X generate $1/(long) C;
> A couple of additional comments:
> (1) Only relations that contain a single value can be cast; otherwise an error will be reported.
> (2) Name resolution is needed, since relation X might have a field named C, in which case that field takes precedence.
> (3) Y will look for the C closest to it.
> Implementation thoughts:
> The idea is to store C into a file and then convert it into a scalar via a UDF. I believe we already have a UDF that Ben Reed contributed for this purpose. Most of the work would be to update the logical plan to
> (1) Store C
> (2) convert the cast to the UDF