Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2010/07/10 02:25:50 UTC

[jira] Updated: (PIG-768) Schema of a relation reported by DESCRIBE and allowed operations on the relation are not compatible

     [ https://issues.apache.org/jira/browse/PIG-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-768:
-------------------------------

    Fix Version/s: 0.9.0

> Schema of a relation reported by DESCRIBE and allowed operations on the relation are not compatible
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PIG-768
>                 URL: https://issues.apache.org/jira/browse/PIG-768
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: George Mavromatis
>             Fix For: 0.9.0
>
>
> The DESCRIBE command in the following script prints:
> {s: bytearray, pg: bytearray, wm: bytearray}
> However, the script later treats the s field of urlMap as a map rather than a bytearray, as shown by s#'Url'.
> Pig does not complain about this contradiction, and at execution time the s field is treated as a map, although it was reported as a bytearray at parse time.
> Pig should either not report s as a bytearray or exit with a parse error.
> Note that all of the above operations happen before the query executes on the cluster.
> register WebDataProcessing.jar;
> register opencrawl.jar;
> urlMap = LOAD '$input' USING opencrawl.pigudf.WebDataLoader() AS (s, pg, wm);
> DESCRIBE urlMap;
> -- In fact, the loader in WebDataProcessing.jar populates s and pg as s:map[], pg:bag{t1:(contents:bytearray)}
> -- and declares that in determineSchema(), but Pig's DESCRIBE ignores it!
> urlMap2 = LIMIT urlMap 20;
> urlList2 = FOREACH urlMap2 GENERATE s#'Url', pg;
> DESCRIBE urlList2;
> STORE urlList2 INTO 'output2' USING BinStorage();
