Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/05/09 20:32:03 UTC

[jira] [Resolved] (PIG-2045) Pig treating map values as String causing ClassCastException in CONCAT

     [ https://issues.apache.org/jira/browse/PIG-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-2045.
---------------------------------

    Resolution: Invalid

This is expected behavior, and map keys need a cast. Otherwise, there is a mismatch between the function selected (the one that handles bytearray) and the actual data, which produces strings.
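
The mismatch described above can be illustrated with a small self-contained Java sketch. It does not use Pig itself: ByteArrayValue is a hypothetical stand-in for org.apache.pig.data.DataByteArray, and concatBytes / concatStrings are assumed, simplified stand-ins for the bytearray and chararray flavors of CONCAT. It only shows the general mechanism, a function chosen from the declared type that casts its input without checking, and is not a description of Pig's actual implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Pig's DataByteArray; NOT the real class.
final class ByteArrayValue {
    final byte[] bytes;

    ByteArrayValue(String s) {
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

class ConcatMismatchSketch {
    // Assumed stand-in for the flavor selected when the inputs are declared
    // (or defaulted) as bytearray: it trusts that type and casts blindly.
    static ByteArrayValue concatBytes(Object a, Object b) {
        ByteArrayValue x = (ByteArrayValue) a; // ClassCastException if a is a String
        ByteArrayValue y = (ByteArrayValue) b;
        return new ByteArrayValue(x.toString() + y.toString());
    }

    // Assumed stand-in for the flavor selected once the inputs are
    // declared as chararray, e.g. through an explicit cast in the script.
    static String concatStrings(Object a, Object b) {
        return (String) a + (String) b;
    }

    public static void main(String[] args) {
        // The UDF in the report stores plain Java Strings as map values.
        Object v1 = "SomeString";
        Object v2 = "SomeOtherString";

        try {
            concatBytes(v1, v2);
        } catch (ClassCastException e) {
            System.out.println("bytearray flavor failed: " + e.getMessage());
        }

        // With the types declared as strings, the matching flavor is used.
        System.out.println(concatStrings(v1, v2));
    }
}
```

In the sketch, the failure comes from the gap between the type the function was selected for and the type of the object that actually arrives, which is why declaring the type with an explicit cast removes the error.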

> Pig treating map values as String  causing ClassCastException in CONCAT
> -----------------------------------------------------------------------
>
>                 Key: PIG-2045
>                 URL: https://issues.apache.org/jira/browse/PIG-2045
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Vivek Padmanabhan
>
> I have the script below:
> {code}
> register mymapudf.jar;
> a = load '4523893_1' as (f1);
> a1 = foreach a generate org.vivek.udfs.mToMapUDF(f1);
> a2 = foreach a1 generate mapout#'k1' as str1,mapout#'k3' as str2;
> b = load '4523893_2' as (f1,f2);
> c = join a2 by CONCAT(str1,str2) , b by CONCAT(f1,f2);
> dump c;
> {code}
> The UDF looks like this:
> {code}
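> package org.vivek.udfs;
>
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.DataType;
> import org.apache.pig.data.Tuple;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
>
> public class mToMapUDF extends EvalFunc<Map> {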
> 	@Override
> 	public Map<String, Object> exec(Tuple arg0) throws IOException {
> 		Map <String,Object> myMapTResult =  new HashMap<String, Object>();
> 		myMapTResult.put("k1", "SomeString");
> 		myMapTResult.put("k3", "SomeOtherString");
> 		return myMapTResult;
> 	}
> 	@Override
> 	public Schema outputSchema(Schema input) {
> 		// TODO Auto-generated method stub
> 		return new Schema(new Schema.FieldSchema("mapout",DataType.MAP));
> 	}
> }
> {code}
> The script fails with this exception:
>  java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataByteArray
> 	at org.apache.pig.builtin.CONCAT.exec(CONCAT.java:51)
> The values of the map output, i.e. str1 and str2, are automatically treated as String by Pig, and this causes the ClassCastException when they are used in subsequent UDFs.
> Since no explicit cast is done and no types are defined, Pig should treat the values as the default bytearray. This issue is also observed in 0.9.
> The workaround in this case is to explicitly cast all keys involved in the join to chararray.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira