Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2010/08/17 21:24:27 UTC

[jira] Commented: (PIG-1420) Make CONCAT act on all fields of a tuple, instead of just the first two fields of a tuple

    [ https://issues.apache.org/jira/browse/PIG-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899537#action_12899537 ] 

Olga Natkovich commented on PIG-1420:
-------------------------------------

I could not figure out how to re-open this issue. However, the code does not work in a Pig script. The main reason is that the code that selects which function to use does not yet handle a variable number of arguments.

grunt> A = load 'studentab10k' as (name: chararray, age: chararray, gpa: chararray);
grunt> B = foreach A generate CONCAT(name, age, gpa);
grunt> C = limit B 10;
grunt> dump C;
2010-08-17 12:17:41,635 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.CONCAT as multiple or none of them fit. Please use an explicit cast.
Details at logfile: /homes/olgan/pig_1282072550328.log
grunt>
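
Until function selection handles a variable number of arguments, the nested two-argument form quoted in the issue description below should presumably still resolve. A minimal sketch, reusing the same input and schema as the session above; it has not been run against this build, so no output is shown:

```
-- Workaround sketch (assumption: the two-argument CONCAT still matches
-- for chararray inputs, as it did before the patch).
A = load 'studentab10k' as (name: chararray, age: chararray, gpa: chararray);
-- Nest two-argument calls instead of passing three arguments at once.
B = foreach A generate CONCAT(CONCAT(name, age), gpa);
C = limit B 10;
dump C;
```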


> Make CONCAT act on all fields of a tuple, instead of just the first two fields of a tuple
> -----------------------------------------------------------------------------------------
>
>                 Key: PIG-1420
>                 URL: https://issues.apache.org/jira/browse/PIG-1420
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>             Fix For: 0.8.0
>
>         Attachments: addconcat2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.pig.builtin.CONCAT (which acts on DataByteArrays internally) and org.apache.pig.builtin.StringConcat (which acts on Strings internally) both act on only the first two fields of a tuple.  This results in ugly nested CONCAT calls like:
> CONCAT(CONCAT(A, ' '), B)
> The more desirable form is:
> CONCAT(A, ' ', B)
> This change will be backwards compatible, provided that no one was relying on the fact that CONCAT ignores fields after the first two in a tuple.  This seems a reasonable assumption to make; at worst, it is a small break in compatibility in exchange for a sizable improvement.
