Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/05/09 20:32:03 UTC
[jira] [Resolved] (PIG-2045) Pig treating map values as String causing ClassCastException in CONCAT
[ https://issues.apache.org/jira/browse/PIG-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-2045.
---------------------------------
Resolution: Invalid
This is expected behavior: the map values need an explicit cast. Otherwise, there is a mismatch between the function selected (the one that handles bytearray) and the actual data, which produces strings.
> Pig treating map values as String causing ClassCastException in CONCAT
> -----------------------------------------------------------------------
>
> Key: PIG-2045
> URL: https://issues.apache.org/jira/browse/PIG-2045
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
>
> I have the script below:
> {code}
> register mymapudf.jar;
> a = load '4523893_1' as (f1);
> a1 = foreach a generate org.vivek.udfs.mToMapUDF(f1);
> a2 = foreach a1 generate mapout#'k1' as str1,mapout#'k3' as str2;
> b = load '4523893_2' as (f1,f2);
> c = join a2 by CONCAT(str1,str2) , b by CONCAT(f1,f2);
> dump c;
> {code}
> The UDF looks like this:
> {code}
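> package org.vivek.udfs;
>
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
>
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.DataType;
> import org.apache.pig.data.Tuple;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
>
> public class mToMapUDF extends EvalFunc<Map> {
>     // Ignores its input and always returns a map holding two String values.
>     @Override
>     public Map<String, Object> exec(Tuple arg0) throws IOException {
>         Map<String, Object> myMapTResult = new HashMap<String, Object>();
>         myMapTResult.put("k1", "SomeString");
>         myMapTResult.put("k3", "SomeOtherString");
>         return myMapTResult;
>     }
>
>     // Declares the output as a map named "mapout" with no value type.
>     @Override
>     public Schema outputSchema(Schema input) {
>         return new Schema(new Schema.FieldSchema("mapout", DataType.MAP));
>     }
> }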
> {code}
> The script fails with the following exception:
> java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataByteArray
> at org.apache.pig.builtin.CONCAT.exec(CONCAT.java:51)
> The values of the map output, i.e. str1 and str2, are automatically treated as String by Pig, and this causes the ClassCastException when they are used in subsequent UDFs.
> Since no explicit casting is done and no types are defined, Pig should treat the values as the default bytearray. This issue is also observed in 0.9.
> The workaround in this case is to explicitly cast all keys involved in the join to chararray.
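The mismatch described above can be reproduced outside Pig. The following is a minimal sketch, not Pig's actual code: FakeByteArray is a hypothetical stand-in for org.apache.pig.data.DataByteArray, and concatAsByteArray and concatAsString are made-up names for the bytearray and chararray variants of a function such as CONCAT. It assumes, as the stack trace suggests, that the bytearray variant casts its arguments directly.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.pig.data.DataByteArray (NOT the real Pig class).
final class FakeByteArray {
    private final byte[] data;

    FakeByteArray(byte[] data) {
        this.data = data;
    }

    @Override
    public String toString() {
        return new String(data, StandardCharsets.UTF_8);
    }
}

class CastMismatchSketch {
    // Same map contents as the UDF in the report: the values are java.lang.String.
    static Map<String, Object> udfOutput() {
        Map<String, Object> m = new HashMap<String, Object>();
        m.put("k1", "SomeString");
        m.put("k3", "SomeOtherString");
        return m;
    }

    // Mimics a variant selected for bytearray input: it casts its arguments blindly.
    static String concatAsByteArray(Object a, Object b) {
        FakeByteArray x = (FakeByteArray) a; // ClassCastException if a is really a String
        FakeByteArray y = (FakeByteArray) b;
        return x.toString() + y.toString();
    }

    // Mimics the variant selected once the values are declared as chararray.
    static String concatAsString(Object a, Object b) {
        return (String) a + (String) b;
    }

    public static void main(String[] args) {
        Map<String, Object> m = udfOutput();
        try {
            concatAsByteArray(m.get("k1"), m.get("k3"));
        } catch (ClassCastException e) {
            System.out.println("bytearray variant failed: " + e.getClass().getSimpleName());
        }
        System.out.println(concatAsString(m.get("k1"), m.get("k3")));
    }
}
```

In this sketch the declared type (none, hence bytearray) decides which variant runs, while the runtime type of the map values decides whether the cast succeeds. Casting the values to chararray in the script is what makes the two agree.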