Posted to user@hive.apache.org by Ross Levin <ro...@simulmedia.com> on 2013/11/11 20:31:33 UTC

Developing a GenericUDAF

Hello,

I'm writing a generic UDAF that closely resembles SUM(), the main
difference being that it accepts an array parameter and returns an array
of the same type.

I've already done this successfully as a GenericUDF. I believe I am having
difficulty coding the proper ObjectInspectors for my parameter and return
objects, since I am getting ClassCastException errors casting a LongWritable
to an object array. I am using a hybrid of the GenericUDAFSum.java sample
and the GenericUDAFCollect sample from the Programming Hive book.

My parameter is a fixed-length array of longs and the return value is an
array of longs of the same length. As with SUM, I do not need to keep the
individual row values I collect; I can iterate over each row's array, sum
it element-wise into the container, and move on to the next row (roughly
as in the sketch below). With this in mind, I think I can disregard having
an internalMergeOI.
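
Concretely, here is a minimal sketch of the iterate step I have in mind.
It assumes inputOI is the list inspector captured in init() and that
reset() pre-fills the container with zeros to the fixed array length:

        @Override
        public void iterate(AggregationBuffer agg, Object[] parameters) throws HiveException
        {
            if (parameters[0] == null) {
                return;
            }
            BitmapAggregationBuffer myAgg = (BitmapAggregationBuffer) agg;
            int n = inputOI.getListLength(parameters[0]);
            for (int i = 0; i < n; i++) {
                // Read element i of this row's array as a long...
                long value = PrimitiveObjectInspectorUtils.getLong(
                        inputOI.getListElement(parameters[0], i),
                        (PrimitiveObjectInspector) inputOI.getListElementObjectInspector());
                // ...and SUM it into the matching slot of the container.
                myAgg.container.set(i, ((Long) myAgg.container.get(i)) + value);
            }
        }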

Any input is appreciated.

Thanks,
Ross


Here is the exception:
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:232)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to [Ljava.lang.Object;
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1137)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:199)
        ... 8 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to [Ljava.lang.Object;
        at org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getListLength(StandardListObjectInspector.java:83)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:418)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:438)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204)
        at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:245)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1066)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1118)
        ... 13 more

Here is the pertinent code:

        @Override
        public ObjectInspector init(Mode m, ObjectInspector[] parameters) throws HiveException
        {
            super.init(m, parameters);
            if (m == Mode.PARTIAL1)
            {
                System.out.println("1 - init() mode: " + m + " parameter[0]=" + parameters[0].toString());
                inputOI = (StandardListObjectInspector) parameters[0];
                return ObjectInspectorFactory.getStandardListObjectInspector(inputOI);
            }
            else
            {
                System.out.println("2 - init() mode: " + m + " parameter[0]=" + parameters[0].toString());
                JavaLongObjectInspector doi;
                doi = PrimitiveObjectInspectorFactory.javaLongObjectInspector;

                // Set up the list object inspector for the output, and return it
                ListObjectInspector loi;
                loi = ObjectInspectorFactory.getStandardListObjectInspector(doi);
                return loi;

//              inputOI = (StandardListObjectInspector) ObjectInspectorUtils.getStandardObjectInspector(parameters[0]);
//              return (StandardListObjectInspector) ObjectInspectorFactory.getStandardListObjectInspector(inputOI);
            }
        }

        static class BitmapAggregationBuffer implements AggregationBuffer {
            ArrayList<Object> container;
        }

        @Override
        public Object terminate(AggregationBuffer agg) throws HiveException
        {
            BitmapAggregationBuffer myAgg = (BitmapAggregationBuffer) agg;
            return myAgg.container;
        }
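
For reference, the merge step would then be roughly the following sketch.
It assumes the partial result is itself an array of longs (hence no
internalMergeOI), and that init() stashes the PARTIAL2/FINAL parameter
inspector in a field not shown here, say private ListObjectInspector partialOI;

        @Override
        public void merge(AggregationBuffer agg, Object partial) throws HiveException
        {
            if (partial == null) {
                return;
            }
            BitmapAggregationBuffer myAgg = (BitmapAggregationBuffer) agg;
            int n = partialOI.getListLength(partial);
            for (int i = 0; i < n; i++) {
                // Element-wise SUM of the incoming partial array into the container.
                long value = PrimitiveObjectInspectorUtils.getLong(
                        partialOI.getListElement(partial, i),
                        (PrimitiveObjectInspector) partialOI.getListElementObjectInspector());
                myAgg.container.set(i, ((Long) myAgg.container.get(i)) + value);
            }
        }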


-- 

Ross Levin
Principal Software Engineer
*Simulmedia* | *People Ads Want*
(m) 609.760.5027
670 Broadway, 2nd Floor, New York, NY 10012


*Check out our new data and tools for TV Advertising at OpenAccess* <http://www.simulmedia.com/OpenAccess/>

Re: Developing a GenericUDAF

Posted by Ross Levin <ro...@simulmedia.com>.
Thanks Navis, that got me past this exception!

Ross



Re: Developing a GenericUDAF

Posted by Navis류승우 <na...@nexr.com>.
In handling PARTIAL1,

inputOI = (StandardListObjectInspector) parameters[0];
return ObjectInspectorFactory.getStandardListObjectInspector(inputOI);

1.
inputOI is not guaranteed to be a StandardListObjectInspector.
Use ListObjectInspector instead.

2.
ObjectInspectorFactory.getStandardListObjectInspector(inputOI)

this is a list of lists. What you meant is

ObjectInspectorFactory.getStandardListObjectInspector(PrimitiveObjectInspectorFactory.javaLongObjectInspector)
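
Putting 1 and 2 together, init() could be something like this sketch
(assuming the field is declared as ListObjectInspector inputOI):

        @Override
        public ObjectInspector init(Mode m, ObjectInspector[] parameters) throws HiveException
        {
            super.init(m, parameters);
            // 1. Cast to the ListObjectInspector interface; the framework may
            //    pass a lazy or lazy-binary list inspector, not the Standard one.
            inputOI = (ListObjectInspector) parameters[0];
            // 2. Both the partial and the final result are a list of longs, so
            //    wrap a long inspector; wrapping inputOI would declare a list of lists.
            return ObjectInspectorFactory.getStandardListObjectInspector(
                    PrimitiveObjectInspectorFactory.javaLongObjectInspector);
        }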


