Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/02/19 20:22:02 UTC

[jira] Resolved: (PIG-683) Semantics of TOKENIZE are not clear

     [ https://issues.apache.org/jira/browse/PIG-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-683.
--------------------------------

    Resolution: Invalid

We can't change the semantics of an existing function. If users need a different interface, they can create another function that suits their needs.

> Semantics of TOKENIZE are not clear
> -----------------------------------
>
>                 Key: PIG-683
>                 URL: https://issues.apache.org/jira/browse/PIG-683
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>             Fix For: types_branch
>
>
> The semantics of TOKENIZE are not clear. In its current form, TOKENIZE takes a string as input and returns a bag. The bag contains one tuple per token, and each tuple in turn contains a single token. A better approach would be to return a tuple (instead of a bag) that contains as many elements as there are tokens.
> On a secondary note, the outputSchema method in TOKENIZE is broken. It should return a bag with a tuple that contains a string, not just a string.
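
To make the two return shapes in the report concrete, here is a minimal Python sketch. It is not Pig's implementation: the function names are hypothetical, bags and tuples are modeled with Python lists and tuples, and splitting on whitespace only is an assumption (the real TOKENIZE may use a different delimiter set).

```python
from typing import List, Tuple


def split_tokens(text: str) -> List[str]:
    # Assumption: tokens are separated by whitespace only.
    return text.split()


def tokenize_as_bag(text: str) -> List[Tuple[str]]:
    # Shape described as current behavior: a bag with one tuple per token,
    # each tuple holding a single token.
    return [(token,) for token in split_tokens(text)]


def tokenize_as_tuple(text: str) -> Tuple[str, ...]:
    # Shape proposed in the report: a single tuple with one element per token.
    return tuple(split_tokens(text))


if __name__ == "__main__":
    line = "hello pig world"
    print(tokenize_as_bag(line))
    print(tokenize_as_tuple(line))
```

In this sketch every tuple in the bag has exactly one field regardless of the input, while the width of the single tuple varies with the number of tokens.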
