Posted to dev@pig.apache.org by "Julien Le Dem (JIRA)" <ji...@apache.org> on 2012/12/06 23:42:35 UTC

[jira] [Created] (PIG-3082) outputSchema of a UDF allows two usages when describing a Tuple schema

Julien Le Dem created PIG-3082:
----------------------------------

             Summary: outputSchema of a UDF allows two usages when describing a Tuple schema
                 Key: PIG-3082
                 URL: https://issues.apache.org/jira/browse/PIG-3082
             Project: Pig
          Issue Type: Bug
            Reporter: Julien Le Dem


When defining an EvalFunc that returns a Tuple, there are two ways to implement outputSchema():
- The right way: return a schema containing a single Field that carries the type and schema of the UDF's return value.
- The unreliable way: return a schema containing more than one field. Pig interprets it as a tuple schema even though nothing specifies the type (the type is held by the Field class). This is particularly deceptive when the output schema is derived from the input schema and the output Tuple sometimes contains only one field: Pig treats the output schema as a tuple only if it has more than one field, so the same UDF sometimes works and sometimes does not.
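
To make the ambiguity concrete, here is a minimal, self-contained sketch. It does not use Pig's real classes: Type, Field, wrapped(), unwrapped() and interpret() are hypothetical stand-ins, and interpret() only models the behavior described above (more than one top-level field is read as a tuple, a single field is taken at face value).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: Type, Field and interpret() are hypothetical stand-ins,
// not Pig's real Schema/FieldSchema classes or its actual resolution code.
class SchemaAmbiguityDemo {
    enum Type { CHARARRAY, TUPLE }

    // A field carries its own type and, for tuples, a nested schema.
    static final class Field {
        final String alias;
        final Type type;
        final List<Field> inner;

        Field(String alias, Type type, List<Field> inner) {
            this.alias = alias;
            this.type = type;
            this.inner = inner;
        }

        static Field scalar(String alias) {
            return new Field(alias, Type.CHARARRAY, Collections.<Field>emptyList());
        }

        static Field tuple(String alias, List<Field> inner) {
            return new Field(alias, Type.TUPLE, inner);
        }
    }

    // The explicit style: always one top-level field of type TUPLE.
    static List<Field> wrapped(List<Field> derivedFields) {
        return Collections.singletonList(Field.tuple("out", derivedFields));
    }

    // The implicit style: hand back the derived fields directly.
    static List<Field> unwrapped(List<Field> derivedFields) {
        return derivedFields;
    }

    // Models the behavior described in the report: several top-level fields
    // are read as a tuple, while a single field is taken at face value.
    static String interpret(List<Field> outputSchema) {
        if (outputSchema.size() > 1) {
            return "tuple(" + outputSchema.size() + ")";
        }
        Field only = outputSchema.get(0);
        return only.type == Type.TUPLE ? "tuple(" + only.inner.size() + ")" : "scalar";
    }

    public static void main(String[] args) {
        List<Field> two = Arrays.asList(Field.scalar("a"), Field.scalar("b"));
        List<Field> one = Collections.singletonList(Field.scalar("a"));

        System.out.println(interpret(wrapped(two)));   // tuple(2)
        System.out.println(interpret(unwrapped(two))); // tuple(2)
        System.out.println(interpret(wrapped(one)));   // tuple(1)
        System.out.println(interpret(unwrapped(one))); // scalar
    }
}
```

In this model the two styles agree for two derived fields and diverge for one: the unwrapped schema reads as a scalar even though the UDF still returns a Tuple.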

We should at least issue a warning (for backward compatibility), if not throw an exception outright, when the output schema contains more than one Field.
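
One possible shape for such a check is sketched below. This is only an illustration under assumptions: OutputSchemaCheck, check() and the strict flag are hypothetical names, not existing Pig APIs, and the schema is reduced to a list of top-level field aliases.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Pig: flags an output schema that has
// more than one top-level field.
class OutputSchemaCheck {
    // Returns a warning message in lenient mode, or throws in strict mode.
    // An empty Optional means the schema has at most one top-level field.
    static Optional<String> check(List<String> topLevelAliases, boolean strict) {
        if (topLevelAliases.size() <= 1) {
            return Optional.empty();
        }
        String message = "outputSchema() returned " + topLevelAliases.size()
                + " top-level fields; expected a single field describing the return type";
        if (strict) {
            throw new IllegalArgumentException(message);
        }
        return Optional.of(message);
    }

    public static void main(String[] args) {
        // Lenient mode keeps existing UDFs working but surfaces the problem.
        check(Arrays.asList("a", "b"), false).ifPresent(System.err::println);
        // A single top-level field passes silently in either mode.
        System.out.println(check(Collections.singletonList("out"), true).isPresent()); // false
    }
}
```

The lenient mode corresponds to the backward-compatible warning, and the strict mode to the exception.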
