Posted to issues@hivemall.apache.org by takuti <gi...@git.apache.org> on 2017/08/07 07:32:05 UTC

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` UDAF w...

GitHub user takuti opened a pull request:

    https://github.com/apache/incubator-hivemall/pull/108

    [HIVEMALL-138] `to_ordered_map` UDAF with size limit

    ## What changes were proposed in this pull request?
    
    Implement `to_bounded_ordered_map` UDAF. The UDAF is an extended version of `to_ordered_map` that limits the map size.
    
    `to_bounded_ordered_map` UDAF can be used as an alternative to the `each_top_k` UDTF. The main difference is that the former actively utilizes mapper-side aggregation.
    
    ## What type of PR is it?
    
    Feature
    
    ## What is the Jira issue?
    
    https://issues.apache.org/jira/browse/HIVEMALL-138
    
    ## How was this patch tested?
    
    Manually tested locally and on EMR
    
    ## How to use this feature?
    
    ```
    to_bounded_ordered_map(key, value, size [, const boolean reverseOrder=false])  
    ```
    
    ```sql
    with t as (
        select 10 as key, 'apple' as value
        union all
        select 3 as key, 'banana' as value
        union all
        select 4 as key, 'candy' as value
    )
    select
        to_bounded_ordered_map(key, value, 1),
        to_bounded_ordered_map(key, value, 2),
        to_bounded_ordered_map(key, value, 3),
        to_bounded_ordered_map(key, value, 100),
        to_bounded_ordered_map(key, value, 1, true),
        to_bounded_ordered_map(key, value, 2, true),
        to_bounded_ordered_map(key, value, 3, true),
        to_bounded_ordered_map(key, value, 100, true)
    from t
    ;
    ```
    
    > {3:"banana"}    {3:"banana",4:"candy"}  {3:"banana",4:"candy",10:"apple"}       {3:"banana",4:"candy",10:"apple"}       {10:"apple"}    {10:"apple",4:"candy"}  {10:"apple",4:"candy",3:"banana"}    {10:"apple",4:"candy",3:"banana"}
    
    ## Checklist
    
    - [x] Did you apply source code formatter, i.e., `mvn formatter:format`, for your commit?
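
The size-limited behaviour shown in the usage example above can be sketched with a plain `TreeMap`. This is only an illustration under assumptions, not the evaluator in this pull request: the class name `BoundedOrderedMapSketch`, the method `aggregate`, and the choice to reject non-positive sizes are hypothetical.

```java
import java.util.Collections;
import java.util.TreeMap;

/** Hypothetical sketch of a size-limited ordered map; not the PR's evaluator. */
final class BoundedOrderedMapSketch {

    /**
     * Keeps at most `size` entries, ordered ascending by key,
     * or descending by key when reverseOrder is true.
     */
    static TreeMap<Integer, String> aggregate(int[] keys, String[] values, int size,
            boolean reverseOrder) {
        if (size <= 0) { // assumption: only positive sizes are valid in this sketch
            throw new IllegalArgumentException("Map size must be positive: " + size);
        }
        TreeMap<Integer, String> map = reverseOrder
                ? new TreeMap<Integer, String>(Collections.<Integer>reverseOrder())
                : new TreeMap<Integer, String>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values[i]);
            if (map.size() > size) {
                map.pollLastEntry(); // evict the entry that sorts last
            }
        }
        return map;
    }

    public static void main(String[] args) {
        int[] keys = {10, 3, 4};
        String[] values = {"apple", "banana", "candy"};
        System.out.println(aggregate(keys, values, 2, false)); // {3=banana, 4=candy}
        System.out.println(aggregate(keys, values, 2, true)); // {10=apple, 4=candy}
    }
}
```

Because the map never holds more than `size + 1` entries, the same routine could run on each mapper before partial results are merged, which is the property the description contrasts with `each_top_k`.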


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/takuti/incubator-hivemall topk-ordered-map

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hivemall/pull/108.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #108
    
----
commit 78403e8a3cb99b6bccdf2500ad5551d413345222
Author: Takuya Kitazawa <k....@gmail.com>
Date:   2017-08-07T05:26:13Z

    Fix typo

commit 46a23a2129ea74244e8a42b6aa5d9da9d5cf8ba1
Author: Takuya Kitazawa <k....@gmail.com>
Date:   2017-08-07T07:14:42Z

    Implement `to_bounded_ordered_map` UDAF

commit 3c029f9bd71adb70db8dfc48f6452362dacc164c
Author: Takuya Kitazawa <k....@gmail.com>
Date:   2017-08-07T07:24:15Z

    Throw an exception for invalid map size

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    @takuti `Do you have any other ideas?`
    
    There is no corresponding data structure. In that case, better not to use a data structure name; for example, `aggr_top_k(cmpkey, value)::values`.


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    @myui This is a PoC implementation. Could you give me feedback, especially on the interface? I will update the documentation once the interface has been settled.


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r138026120
  
    --- Diff: core/src/main/java/hivemall/tools/list/UDAFToOrderedList.java ---
    @@ -0,0 +1,535 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package hivemall.tools.list;
    +
    +import hivemall.utils.collections.BoundedPriorityQueue;
    +import hivemall.utils.hadoop.HiveUtils;
    +import hivemall.utils.lang.CommandLineUtils;
    +
    +import org.apache.commons.cli.CommandLine;
    +import org.apache.commons.cli.HelpFormatter;
    +import org.apache.commons.cli.Options;
    +import org.apache.hadoop.hive.ql.exec.Description;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
    +import org.apache.hadoop.hive.ql.metadata.HiveException;
    +import org.apache.hadoop.hive.ql.parse.SemanticException;
    +import org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo;
    +import org.apache.hadoop.hive.serde2.objectinspector.*;
    +import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.io.BooleanWritable;
    +import org.apache.hadoop.io.IntWritable;
    +
    +import javax.annotation.Nonnegative;
    +import javax.annotation.Nonnull;
    +import java.io.PrintWriter;
    +import java.io.StringWriter;
    +import java.util.*;
    +
    +/**
    + * Return list of values sorted by value itself or specific key.
    + */
    +@Description(
    +        name = "to_ordered_list",
    +        value = "_FUNC_(value [, key, const string options]) - Return list of values sorted by value itself or specific key")
    +public class UDAFToOrderedList extends AbstractGenericUDAFResolver {
    +
    +    @Override
    +    public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
    +            throws SemanticException {
    +        @SuppressWarnings("deprecation")
    +        TypeInfo[] typeInfo = info.getParameters();
    +        ObjectInspector[] argOIs = info.getParameterObjectInspectors();
    +        if ((typeInfo.length == 1) || (typeInfo.length == 2 && HiveUtils.isConstString(argOIs[1]))) {
    +            // sort values by value itself w/o key
    +            if (typeInfo[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
    +                throw new UDFArgumentTypeException(0,
    +                    "Only primitive type arguments are accepted for value but "
    +                            + typeInfo[0].getTypeName() + " was passed as the first parameter.");
    +            }
    +        } else if ((typeInfo.length == 2)
    +                || (typeInfo.length == 3 && HiveUtils.isConstString(argOIs[2]))) {
    +            // sort values by key
    +            if (typeInfo[1].getCategory() != ObjectInspector.Category.PRIMITIVE) {
    +                throw new UDFArgumentTypeException(1,
    +                    "Only primitive type arguments are accepted for key but "
    +                            + typeInfo[1].getTypeName() + " was passed as the second parameter.");
    +            }
    +        } else {
    +            throw new UDFArgumentTypeException(typeInfo.length - 1,
    +                "Number of arguments must be in [1, 3] including constant string for options: "
    +                        + typeInfo.length);
    +        }
    +        return new UDAFToOrderedListEvaluator();
    +    }
    +
    +    public static class UDAFToOrderedListEvaluator extends GenericUDAFEvaluator {
    +
    +        private ObjectInspector valueOI;
    +        private PrimitiveObjectInspector keyOI;
    +
    +        private ListObjectInspector valueListOI;
    +        private ListObjectInspector keyListOI;
    +
    +        private StructObjectInspector internalMergeOI;
    +
    +        private StructField valueListField;
    +        private StructField keyListField;
    +        private StructField sizeField;
    +        private StructField reverseOrderField;
    +
    +        @Nonnegative
    +        private int size;
    +        private boolean reverseOrder;
    +        private boolean sortByKey;
    +
    +        protected Options getOptions() {
    +            Options opts = new Options();
    +            opts.addOption("k", true, "To top-k (positive) or tail-k (negative) ordered queue");
    +            opts.addOption("reverse", "reverse_order", false,
    +                "Sort values by key in a reverse (e.g., descending) order [default: false]");
    +            return opts;
    +        }
    +
    +        @Nonnull
    +        protected final CommandLine parseOptions(String optionValue) throws UDFArgumentException {
    +            String[] args = optionValue.split("\\s+");
    +            Options opts = getOptions();
    +            opts.addOption("help", false, "Show function help");
    +            CommandLine cl = CommandLineUtils.parseOptions(args, opts);
    +
    +            if (cl.hasOption("help")) {
    +                Description funcDesc = getClass().getAnnotation(Description.class);
    +                final String cmdLineSyntax;
    +                if (funcDesc == null) {
    +                    cmdLineSyntax = getClass().getSimpleName();
    +                } else {
    +                    String funcName = funcDesc.name();
    +                    cmdLineSyntax = funcName == null ? getClass().getSimpleName()
    +                            : funcDesc.value().replace("_FUNC_", funcDesc.name());
    +                }
    +                StringWriter sw = new StringWriter();
    +                sw.write('\n');
    +                PrintWriter pw = new PrintWriter(sw);
    +                HelpFormatter formatter = new HelpFormatter();
    +                formatter.printHelp(pw, HelpFormatter.DEFAULT_WIDTH, cmdLineSyntax, null, opts,
    +                    HelpFormatter.DEFAULT_LEFT_PAD, HelpFormatter.DEFAULT_DESC_PAD, null, true);
    +                pw.flush();
    +                String helpMsg = sw.toString();
    +                throw new UDFArgumentException(helpMsg);
    +            }
    +
    +            return cl;
    +        }
    +
    +        protected CommandLine processOptions(ObjectInspector[] argOIs) throws UDFArgumentException {
    +            CommandLine cl = null;
    +
    +            int optionIndex = 1;
    +            if (sortByKey) {
    +                optionIndex = 2;
    +            }
    +
    +            int k = 0;
    +            boolean reverseOrder = false;
    +
    +            if (argOIs.length >= optionIndex + 1) {
    +                String rawArgs = HiveUtils.getConstString(argOIs[optionIndex]);
    +                cl = parseOptions(rawArgs);
    +
    +                reverseOrder = cl.hasOption("reverse_order");
    +
    +                if (cl.hasOption("k")) {
    +                    k = Integer.parseInt(cl.getOptionValue("k"));
    +                    if (k == 0) {
    +                        throw new UDFArgumentException("`k` must be nonzero: " + k);
    +                    }
    +                }
    +            }
    +
    +            this.size = Math.abs(k);
    +
    +            if ((k > 0 && reverseOrder) || (k < 0 && !reverseOrder) || (k == 0 && !reverseOrder)) {
    +                // reverse top-k, natural tail-k = ascending = natural order output = reverse order priority queue
    +                this.reverseOrder = true;
    +            } else { // (k > 0 && !reverseOrder) || (k < 0 && reverseOrder) || (k == 0 && reverseOrder)
    +                // natural top-k or reverse tail-k = descending = reverse order output = natural order priority queue
    +                this.reverseOrder = false;
    --- End diff --
    
    Why `k == 0 && reverseOrder` => `reverseOrder = false` ??
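
To make the question concrete, the branch quoted in the diff can be tabulated in isolation. The sketch below copies only the boolean condition and the `Math.abs(k)` size from the quoted code; the class and method names are hypothetical, and `k = 0` stands for the case where no `-k` option was given.

```java
/** Hypothetical helper that mirrors the condition quoted in the diff. */
final class TopKOptionSketch {

    /** Internal queue ordering flag derived from `k` and the `-reverse` option. */
    static boolean queueReverseOrder(int k, boolean reverseOption) {
        return (k > 0 && reverseOption) || (k < 0 && !reverseOption)
                || (k == 0 && !reverseOption);
    }

    /** Queue capacity as in the diff; 0 is the default when `-k` is absent. */
    static int queueSize(int k) {
        return Math.abs(k);
    }

    public static void main(String[] args) {
        int[] ks = {10, -10, 0};
        boolean[] reverseOptions = {false, true};
        for (int k : ks) {
            for (boolean reverseOption : reverseOptions) {
                System.out.printf("k=%d reverse=%b -> size=%d queueReverse=%b%n", k,
                    reverseOption, queueSize(k), queueReverseOrder(k, reverseOption));
            }
        }
    }
}
```

With `k = 0` the internal flag is simply the negation of the user-supplied option, the same as for negative `k`; that is the case the review comment asks about.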


---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    I tested `each_top_k`, `to_ordered_map` and `to_ordered_list` on the same MovieLens 1M data. As expected, `to_ordered_map` collapses duplicated keys, so only 3 ratings are returned even though we requested a top-10 aggregation.
    
    ```sql
    with topk as (
        select
            each_top_k(
                10, userid, rating,
                userid, movieid
            ) as (rank, rating, userid, movieid)
        from (
            select
                userid, movieid, rating
            from ratings
            cluster by userid
        ) t
    )
    select 
        count(1), collect_list(array(movieid, rating))
    from 
        topk 
    where 
        userid = 1
    ;
    ```
    
    > 10      [[527.0,5.0],[3105.0,5.0],[1270.0,5.0],[48.0,5.0],[1035.0,5.0],[1193.0,5.0],[1287.0,5.0],[2355.0,5.0],[595.0,5.0],[2804.0,5.0]]
    
    ```sql
    with topk as (
        select 
            userid, 
            to_ordered_map(rating, movieid, 10) as movies
        from
            ratings
        group by 
            userid
    )
    select 
        count(1), collect_list(array(movieid, rating))
    from 
        topk
    lateral view explode(movies) t as rating, movieid
    where 
        userid = 1
    ;
    ```
    
    > 3       [[2028,5],[1246,4],[745,3]]
    
    ```sql
    with topk as (
        select 
            userid, 
            to_ordered_list(movieid, rating, '-k 10') as movies
        from
            ratings
        group by 
            userid
    )
    select 
        size(movies), movies
    from 
        topk
    where 
        userid = 1
    ;
    ```
    
    > 10      [595,1035,3105,2355,1287,2804,1193,2028,1029,1270]
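
The difference between the map-based and list-based results can be reproduced outside Hive. The following is a minimal sketch with made-up data, assuming a `TreeMap` keyed by rating on one side and a bounded min-heap on the other; the class and method names are hypothetical and do not come from Hivemall.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Hypothetical sketch: why a map keyed by rating loses rows with equal ratings. */
final class DuplicateKeySketch {

    /** Number of entries left when rows are aggregated into a map keyed by rating. */
    static int entriesAfterMapAggregation(int[] ratings, int[] movieIds) {
        TreeMap<Integer, Integer> map =
                new TreeMap<Integer, Integer>(Collections.<Integer>reverseOrder());
        for (int i = 0; i < ratings.length; i++) {
            map.put(ratings[i], movieIds[i]); // an equal rating overwrites the earlier row
        }
        return map.size();
    }

    /** Movie ids of the k highest-rated rows; rows with equal ratings are all kept. */
    static List<Integer> topKMovies(int[] ratings, int[] movieIds, int k) {
        // Min-heap on rating, so the weakest candidate is evicted first.
        PriorityQueue<int[]> heap =
                new PriorityQueue<int[]>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < ratings.length; i++) {
            heap.offer(new int[] {ratings[i], movieIds[i]});
            if (heap.size() > k) {
                heap.poll();
            }
        }
        List<Integer> result = new ArrayList<Integer>();
        while (!heap.isEmpty()) {
            result.add(heap.poll()[1]);
        }
        Collections.reverse(result); // highest rating first; tie order is unspecified
        return result;
    }

    public static void main(String[] args) {
        int[] ratings = {5, 5, 4, 5, 3}; // made-up sample, not MovieLens data
        int[] movieIds = {1, 2, 3, 4, 5};
        System.out.println(entriesAfterMapAggregation(ratings, movieIds));
        System.out.println(topKMovies(ratings, movieIds, 3));
    }
}
```

With only three distinct rating values in the sample, the map can never hold more than three entries regardless of the requested size, which matches the count of 3 reported for `to_ordered_map` above.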


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    
    [![Coverage Status](https://coveralls.io/builds/12719161/badge)](https://coveralls.io/builds/12719161)
    
    Coverage increased (+0.3%) to 41.179% when pulling **3c029f9bd71adb70db8dfc48f6452362dacc164c on takuti:topk-ordered-map** into **7205de1e959f0d9b96ac756e415d8a8ada7e92af on apache:master**.



---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    `to_ordered_queue(NUMBER key, ANY value, const string options)`
    
    options:
    `-k` -- bounded queue
    `-reverse`  -- to reverse order queue
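
As a rough illustration of the proposed option string, the sketch below parses `-k` and `-reverse` by hand. It is an assumption-laden stand-in: the class `QueueOptionsSketch` is hypothetical, and a real implementation would more likely rely on an option-parsing library than on manual tokenizing.

```java
/** Hypothetical hand-rolled parser for an option string such as "-k 10 -reverse". */
final class QueueOptionsSketch {

    final int k; // 0 means "no bound given" in this sketch
    final boolean reverse;

    private QueueOptionsSketch(int k, boolean reverse) {
        this.k = k;
        this.reverse = reverse;
    }

    static QueueOptionsSketch parse(String options) {
        int k = 0;
        boolean reverse = false;
        String trimmed = options.trim();
        if (!trimmed.isEmpty()) {
            String[] tokens = trimmed.split("\\s+");
            for (int i = 0; i < tokens.length; i++) {
                if (tokens[i].equals("-k")) {
                    if (i + 1 >= tokens.length) {
                        throw new IllegalArgumentException("-k requires a value");
                    }
                    k = Integer.parseInt(tokens[++i]); // consume the value token
                } else if (tokens[i].equals("-reverse")) {
                    reverse = true;
                } else {
                    throw new IllegalArgumentException("Unknown option: " + tokens[i]);
                }
            }
        }
        return new QueueOptionsSketch(k, reverse);
    }

    public static void main(String[] args) {
        QueueOptionsSketch opts = parse("-k 10 -reverse");
        System.out.println(opts.k + " " + opts.reverse); // 10 true
    }
}
```

Passing everything through one constant string keeps the function signature stable when further options are added later.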


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    `to_ordered_map` also fails on Hive v2.3.0.
    
    `IllegalArgumentException Size requested for unknown type: java.util.Map`


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    lots of collisions in `to_ordered_map`.


---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    Updated `to_ordered_map` as you suggested.
    
    From here, I will implement `top_k_queue` to work around duplicated keys.


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r138586028
  
    --- Diff: core/src/main/java/hivemall/tools/map/UDAFToOrderedMap.java ---
    @@ -92,4 +122,172 @@ public void reset(@SuppressWarnings("deprecation") AggregationBuffer agg)
     
         }
     
    +    public static class TopKOrderedMapEvaluator extends GenericUDAFEvaluator {
    +
    +        protected PrimitiveObjectInspector inputKeyOI;
    +        protected ObjectInspector inputValueOI;
    +        protected StandardMapObjectInspector partialMapOI;
    +        protected PrimitiveObjectInspector sizeOI;
    +
    +        protected StructObjectInspector internalMergeOI;
    +
    +        protected StructField partialMapField;
    +        protected StructField sizeField;
    +
    +        @Override
    +        public ObjectInspector init(Mode mode, ObjectInspector[] argOIs) throws HiveException {
    +            super.init(mode, argOIs);
    +
    +            // initialize input
    +            if (mode == Mode.PARTIAL1 || mode == Mode.COMPLETE) {// from original data
    +                this.inputKeyOI = HiveUtils.asPrimitiveObjectInspector(argOIs[0]);
    +                this.inputValueOI = argOIs[1];
    +                this.sizeOI = HiveUtils.asIntegerOI(argOIs[2]);
    --- End diff --
    
    The parameter at `argOIs[2]` might be a boolean.


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r138024379
  
    --- Diff: core/src/main/java/hivemall/tools/map/UDAFToOrderedMap.java ---
    @@ -54,19 +68,35 @@ public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
                     "Only primitive type arguments are accepted for the key but "
                             + typeInfo[0].getTypeName() + " was passed as parameter 1.");
             }
    +
             boolean reverseOrder = false;
    +        int size = 0;
             if (typeInfo.length == 3) {
    -            if (HiveUtils.isBooleanTypeInfo(typeInfo[2]) == false) {
    -                throw new UDFArgumentTypeException(2, "The three argument must be boolean type: "
    -                        + typeInfo[2].getTypeName());
    -            }
                 ObjectInspector[] argOIs = info.getParameterObjectInspectors();
    -            reverseOrder = HiveUtils.getConstBoolean(argOIs[2]);
    +            if (HiveUtils.isBooleanTypeInfo(typeInfo[2])) {
    +                reverseOrder = HiveUtils.getConstBoolean(argOIs[2]);
    +            } else if (HiveUtils.isIntegerTypeInfo(typeInfo[2])) {
    +                size = HiveUtils.getConstInt(argOIs[2]);
    +                if (size == 0) {
    +                    throw new UDFArgumentException("Map size must be nonzero: " + size);
    +                }
    +                reverseOrder = (size > 0); // positive size => top-k
    +            } else {
    +                throw new UDFArgumentTypeException(2,
    +                    "The third argument must be boolean or integer type: "
    +                            + typeInfo[2].getTypeName());
    +            }
             }
     
    -        if (reverseOrder) {
    +        if (reverseOrder) { // descending
    --- End diff --
    
    It would be better to implement `BoundedSortedMap` to avoid duplicated code and a memory-inefficient top-k operation.


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    How about having `to_ordered_map` accept `int k` as the third argument and then use `BoundedOrderedMapEvaluator`?


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    what's bad...?


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    that's bad...


---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    
    [![Coverage Status](https://coveralls.io/builds/12760076/badge)](https://coveralls.io/builds/12760076)
    
    Coverage increased (+0.2%) to 41.003% when pulling **c448099c4f35870dd75cf0a5166acf9e6fdf4d17 on takuti:topk-ordered-map** into **7205de1e959f0d9b96ac756e415d8a8ada7e92af on apache:master**.



---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r138024907
  
    --- Diff: core/src/main/java/hivemall/tools/list/UDAFToOrderedList.java ---
    @@ -0,0 +1,535 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package hivemall.tools.list;
    +
    +import hivemall.utils.collections.BoundedPriorityQueue;
    +import hivemall.utils.hadoop.HiveUtils;
    +import hivemall.utils.lang.CommandLineUtils;
    +
    +import org.apache.commons.cli.CommandLine;
    +import org.apache.commons.cli.HelpFormatter;
    +import org.apache.commons.cli.Options;
    +import org.apache.hadoop.hive.ql.exec.Description;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
    +import org.apache.hadoop.hive.ql.metadata.HiveException;
    +import org.apache.hadoop.hive.ql.parse.SemanticException;
    +import org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo;
    +import org.apache.hadoop.hive.serde2.objectinspector.*;
    +import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.io.BooleanWritable;
    +import org.apache.hadoop.io.IntWritable;
    +
    +import javax.annotation.Nonnegative;
    +import javax.annotation.Nonnull;
    +import java.io.PrintWriter;
    +import java.io.StringWriter;
    +import java.util.*;
    +
    +/**
    + * Return list of values sorted by value itself or specific key.
    + */
    +@Description(
    +        name = "to_ordered_list",
    +        value = "_FUNC_(value [, key, const string options]) - Return list of values sorted by value itself or specific key")
    +public class UDAFToOrderedList extends AbstractGenericUDAFResolver {
    +
    +    @Override
    +    public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
    +            throws SemanticException {
    +        @SuppressWarnings("deprecation")
    +        TypeInfo[] typeInfo = info.getParameters();
    +        ObjectInspector[] argOIs = info.getParameterObjectInspectors();
    +        if ((typeInfo.length == 1) || (typeInfo.length == 2 && HiveUtils.isConstString(argOIs[1]))) {
    +            // sort values by value itself w/o key
    +            if (typeInfo[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
    +                throw new UDFArgumentTypeException(0,
    +                    "Only primitive type arguments are accepted for value but "
    +                            + typeInfo[0].getTypeName() + " was passed as the first parameter.");
    +            }
    +        } else if ((typeInfo.length == 2)
    +                || (typeInfo.length == 3 && HiveUtils.isConstString(argOIs[2]))) {
    +            // sort values by key
    +            if (typeInfo[1].getCategory() != ObjectInspector.Category.PRIMITIVE) {
    +                throw new UDFArgumentTypeException(1,
    +                    "Only primitive type arguments are accepted for key but "
    +                            + typeInfo[1].getTypeName() + " was passed as the second parameter.");
    +            }
    +        } else {
    +            throw new UDFArgumentTypeException(typeInfo.length - 1,
    +                "Number of arguments must be in [1, 3] including constant string for options: "
    +                        + typeInfo.length);
    +        }
    +        return new UDAFToOrderedListEvaluator();
    +    }
    +
    +    public static class UDAFToOrderedListEvaluator extends GenericUDAFEvaluator {
    +
    +        private ObjectInspector valueOI;
    +        private PrimitiveObjectInspector keyOI;
    +
    +        private ListObjectInspector valueListOI;
    +        private ListObjectInspector keyListOI;
    +
    +        private StructObjectInspector internalMergeOI;
    +
    +        private StructField valueListField;
    +        private StructField keyListField;
    +        private StructField sizeField;
    +        private StructField reverseOrderField;
    +
    +        @Nonnegative
    +        private int size;
    +        private boolean reverseOrder;
    +        private boolean sortByKey;
    +
    +        protected Options getOptions() {
    +            Options opts = new Options();
    +            opts.addOption("k", true, "To top-k (positive) or tail-k (negative) ordered queue");
    +            opts.addOption("reverse", "reverse_order", false,
    +                "Sort values by key in a reverse (e.g., descending) order [default: false]");
    +            return opts;
    +        }
    +
    +        @Nonnull
    +        protected final CommandLine parseOptions(String optionValue) throws UDFArgumentException {
    +            String[] args = optionValue.split("\\s+");
    +            Options opts = getOptions();
    +            opts.addOption("help", false, "Show function help");
    +            CommandLine cl = CommandLineUtils.parseOptions(args, opts);
    +
    +            if (cl.hasOption("help")) {
    +                Description funcDesc = getClass().getAnnotation(Description.class);
    +                final String cmdLineSyntax;
    +                if (funcDesc == null) {
    +                    cmdLineSyntax = getClass().getSimpleName();
    +                } else {
    +                    String funcName = funcDesc.name();
    +                    cmdLineSyntax = funcName == null ? getClass().getSimpleName()
    +                            : funcDesc.value().replace("_FUNC_", funcDesc.name());
    +                }
    +                StringWriter sw = new StringWriter();
    +                sw.write('\n');
    +                PrintWriter pw = new PrintWriter(sw);
    +                HelpFormatter formatter = new HelpFormatter();
    +                formatter.printHelp(pw, HelpFormatter.DEFAULT_WIDTH, cmdLineSyntax, null, opts,
    +                    HelpFormatter.DEFAULT_LEFT_PAD, HelpFormatter.DEFAULT_DESC_PAD, null, true);
    +                pw.flush();
    +                String helpMsg = sw.toString();
    +                throw new UDFArgumentException(helpMsg);
    +            }
    +
    +            return cl;
    +        }
    +
    +        protected CommandLine processOptions(ObjectInspector[] argOIs) throws UDFArgumentException {
    +            CommandLine cl = null;
    +
    +            int optionIndex = 1;
    +            if (sortByKey) {
    +                optionIndex = 2;
    +            }
    +
    +            int k = 0;
    +            boolean reverseOrder = false;
    +
    +            if (argOIs.length >= optionIndex + 1) {
    +                String rawArgs = HiveUtils.getConstString(argOIs[optionIndex]);
    +                cl = parseOptions(rawArgs);
    +
    +                reverseOrder = cl.hasOption("reverse_order");
    +
    +                if (cl.hasOption("k")) {
    +                    k = Integer.parseInt(cl.getOptionValue("k"));
    +                    if (k == 0) {
    +                        throw new UDFArgumentException("`k` must be nonzero: " + k);
    +                    }
    +                }
    +            }
    +
    +            this.size = Math.abs(k);
    +
    +            if ((k > 0 && reverseOrder) || (k < 0 && !reverseOrder) || (k == 0 && !reverseOrder)) {
    --- End diff --
    
    The condition is too complex; it would be better to simplify it.
    
    ```java
                        k = Integer.parseInt(cl.getOptionValue("k"));
                        if (k == 0) {
                            throw new UDFArgumentException("`k` must be non-zero value: " + k);
                        } else if (k > 0) {
                            if (reverseOrder) {
                                reverseOrder = false;
                            } else {// top-k (descending)
                                reverseOrder = true;
                            }
                        } else {// k < 0
                            if (reverseOrder) {
                                reverseOrder = true;
                            } else {// tail-k (ascending)
                                reverseOrder = false;
                            }
                        }
    ```
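
The branching in the suggestion above can be collapsed further: a positive `k` flips the requested flag and a negative `k` leaves it unchanged. The following is a minimal sketch of that observation, not code from the pull request. The class `OrderResolver` and the method `resolveReverseOrder` are hypothetical names, and it assumes, as in the snippet, that `k == 0` is rejected.

```java
// Hypothetical helper for illustration only; not part of the pull request.
final class OrderResolver {

    private OrderResolver() {}

    /**
     * Returns the effective reverse-order flag for a given k.
     * k > 0 (top-k) flips the requested flag; k < 0 (tail-k) keeps it as is.
     */
    static boolean resolveReverseOrder(int k, boolean reverseOrder) {
        if (k == 0) {
            throw new IllegalArgumentException("`k` must be non-zero value: " + k);
        }
        // (k > 0) XOR reverseOrder reproduces all four branches of the if/else ladder
        return (k > 0) != reverseOrder;
    }
}
```

Under these assumptions, `resolveReverseOrder(5, false)` yields `true` and `resolveReverseOrder(-5, false)` yields `false`, matching the four branches of the longer form.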


---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    @myui Ah, okay. But I imagine that "ordered list" suggests a slightly different function, like `to_ordered_list(ANY value)`... Do you have any other ideas?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` UDAF w...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r131601075
  
    --- Diff: resources/ddl/define-udfs.td.hql ---
    @@ -174,6 +174,7 @@ create temporary function dimsum_mapper as 'hivemall.knn.similarity.DIMSUMMapper
     create temporary function train_classifier as 'hivemall.classifier.GeneralClassifierUDTF';
     create temporary function train_regressor as 'hivemall.regression.GeneralRegressorUDTF';
     create temporary function tree_export as 'hivemall.smile.tools.TreeExportUDF';
    +create temporary function to_bounded_ordered_map as 'hivemall.tools.map.UDAFToBoundedOrderedMap';
    --- End diff --
    
    bounded ... ordered ... is annoying. How about `to_top_k_map`?


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    Merged. Thanks!


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r138585976
  
    --- Diff: core/src/main/java/hivemall/tools/map/UDAFToOrderedMap.java ---
    @@ -92,4 +122,172 @@ public void reset(@SuppressWarnings("deprecation") AggregationBuffer agg)
     
         }
     
    +    public static class TopKOrderedMapEvaluator extends GenericUDAFEvaluator {
    +
    +        protected PrimitiveObjectInspector inputKeyOI;
    +        protected ObjectInspector inputValueOI;
    +        protected StandardMapObjectInspector partialMapOI;
    +        protected PrimitiveObjectInspector sizeOI;
    +
    +        protected StructObjectInspector internalMergeOI;
    +
    +        protected StructField partialMapField;
    +        protected StructField sizeField;
    +
    +        @Override
    +        public ObjectInspector init(Mode mode, ObjectInspector[] argOIs) throws HiveException {
    +            super.init(mode, argOIs);
    +
    +            // initialize input
    +            if (mode == Mode.PARTIAL1 || mode == Mode.COMPLETE) {// from original data
    +                this.inputKeyOI = HiveUtils.asPrimitiveObjectInspector(argOIs[0]);
    +                this.inputValueOI = argOIs[1];
    +                this.sizeOI = HiveUtils.asIntegerOI(argOIs[2]);
    +            } else {// from partial aggregation
    +                StructObjectInspector soi = (StructObjectInspector) argOIs[0];
    +                this.internalMergeOI = soi;
    +
    +                this.partialMapField = soi.getStructFieldRef("partialMap");
    +                // re-extract input key/value OIs
    +                StandardMapObjectInspector partialMapOI = (StandardMapObjectInspector) partialMapField.getFieldObjectInspector();
    +                this.inputKeyOI = HiveUtils.asPrimitiveObjectInspector(partialMapOI.getMapKeyObjectInspector());
    +                this.inputValueOI = partialMapOI.getMapValueObjectInspector();
    +
    +                this.partialMapOI = ObjectInspectorFactory.getStandardMapObjectInspector(
    +                    ObjectInspectorUtils.getStandardObjectInspector(inputKeyOI),
    +                    ObjectInspectorUtils.getStandardObjectInspector(inputValueOI));
    +
    +                this.sizeField = soi.getStructFieldRef("size");
    +                this.sizeOI = (PrimitiveObjectInspector) sizeField.getFieldObjectInspector();
    +            }
    +
    +            // initialize output
    +            final ObjectInspector outputOI;
    +            if (mode == Mode.PARTIAL1 || mode == Mode.PARTIAL2) {// terminatePartial
    +                outputOI = internalMergeOI(inputKeyOI, inputValueOI);
    +            } else {// terminate
    +                outputOI = ObjectInspectorFactory.getStandardMapObjectInspector(
    +                    ObjectInspectorUtils.getStandardObjectInspector(inputKeyOI),
    +                    ObjectInspectorUtils.getStandardObjectInspector(inputValueOI));
    +            }
    +            return outputOI;
    +        }
    +
    +        private static StructObjectInspector internalMergeOI(
    +                @Nonnull PrimitiveObjectInspector keyOI, @Nonnull ObjectInspector valueOI) {
    +            ArrayList<String> fieldNames = new ArrayList<String>();
    +            ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
    +
    +            fieldNames.add("partialMap");
    +            fieldOIs.add(ObjectInspectorFactory.getStandardMapObjectInspector(
    +                ObjectInspectorUtils.getStandardObjectInspector(keyOI),
    +                ObjectInspectorUtils.getStandardObjectInspector(valueOI)));
    +
    +            fieldNames.add("size");
    +            fieldOIs.add(PrimitiveObjectInspectorFactory.writableIntObjectInspector);
    +
    +            return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    +        }
    +
    +        static class MapAggregationBuffer extends AbstractAggregationBuffer {
    +            Map<Object, Object> container;
    +            int size;
    +
    +            MapAggregationBuffer() {
    +                super();
    +            }
    +        }
    +
    +        @Override
    +        public void reset(@SuppressWarnings("deprecation") AggregationBuffer agg)
    +                throws HiveException {
    +            MapAggregationBuffer myagg = (MapAggregationBuffer) agg;
    +            myagg.container = new TreeMap<Object, Object>(Collections.reverseOrder());
    +            myagg.size = Integer.MAX_VALUE;
    +        }
    +
    +        @Override
    +        public MapAggregationBuffer getNewAggregationBuffer() throws HiveException {
    +            MapAggregationBuffer myagg = new MapAggregationBuffer();
    +            reset(myagg);
    +            return myagg;
    +        }
    +
    +        @Override
    +        public void iterate(@SuppressWarnings("deprecation") AggregationBuffer agg,
    +                Object[] parameters) throws HiveException {
    +            assert (parameters.length == 3);
    +
    +            if (parameters[0] == null) {
    +                return;
    +            }
    +
    +            Object key = ObjectInspectorUtils.copyToStandardObject(parameters[0], inputKeyOI);
    +            Object value = ObjectInspectorUtils.copyToStandardObject(parameters[1], inputValueOI);
    +            int size = Math.abs(HiveUtils.getInt(parameters[2], sizeOI)); // size could be negative for tail-k
    --- End diff --
    
    The parameter might be `boolean`, but that case is not considered.


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` & `to_ordered...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    Oh, that's expected behavior. I just showed how `each_top_k`, the map, and the list differ. Using a map is obviously the wrong choice in the MovieLens case.


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    @takuti Yes. That's my intention.


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    Do you mean something like:
    
    - 3rd argument is boolean => current `to_ordered_map`
    - 3rd argument is integer
      - positive => top-k ordered map (i.e., reverse ordered bounded map)
      - negative => tail-k ordered map (i.e., natural ordered bounded map)
    
    ?
    
    Okay, that sounds reasonable to me.
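
The integer case described in this comment can be modelled outside Hive with a plain `TreeMap`. The following is a sketch under stated assumptions, not the evaluator from the pull request: the class `BoundedOrderedMapSketch` and the method `boundedOrderedMap` are hypothetical, the input is an in-memory `Map` rather than aggregated rows, and keys are assumed to be mutually comparable.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone model for illustration; names are not from the pull request.
final class BoundedOrderedMapSketch {

    private BoundedOrderedMapSketch() {}

    /**
     * k > 0: keep the k largest keys, iterated in descending order (top-k).
     * k < 0: keep the |k| smallest keys, iterated in ascending order (tail-k).
     */
    static <K extends Comparable<K>, V> TreeMap<K, V> boundedOrderedMap(Map<K, V> rows, int k) {
        if (k == 0) {
            throw new IllegalArgumentException("k must be non-zero");
        }
        final int size = Math.abs(k);
        final TreeMap<K, V> result = (k > 0)
                ? new TreeMap<K, V>(Collections.<K>reverseOrder())
                : new TreeMap<K, V>();
        for (Map.Entry<K, V> e : rows.entrySet()) {
            result.put(e.getKey(), e.getValue());
            if (result.size() > size) {
                // The last entry in the map's own ordering is the one outside the bound.
                result.pollLastEntry();
            }
        }
        return result;
    }
}
```

With the rows `10:"apple"`, `3:"banana"`, `4:"candy"`, calling the sketch with `k = 2` keeps keys 10 and 4 in that order, while `k = -2` keeps keys 3 and 4. Evicting after every insertion keeps the map at or below the bound throughout, which is what makes a bounded structure attractive for partial aggregation.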


---

[GitHub] incubator-hivemall issue #108: [HIVEMALL-138] `to_ordered_map` UDAF with siz...

Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    
    [![Coverage Status](https://coveralls.io/builds/12719960/badge)](https://coveralls.io/builds/12719960)
    
    Coverage increased (+0.06%) to 40.888% when pulling **d68e5e9493cb78c4321169c15d7d04fedbbca691 on takuti:topk-ordered-map** into **7205de1e959f0d9b96ac756e415d8a8ada7e92af on apache:master**.



---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by takuti <gi...@git.apache.org>.
Github user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    `to_ordered_list(PRIMITIVE key, ANY value)` is also acceptable for me, btw ;)


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` UDAF w...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/108#discussion_r131596381
  
    --- Diff: core/src/main/java/hivemall/tools/map/UDAFToBoundedOrderedMap.java ---
    @@ -0,0 +1,259 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package hivemall.tools.map;
    +
    +import hivemall.utils.hadoop.HiveUtils;
    +
    +import org.apache.hadoop.hive.ql.exec.Description;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    +import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
    +import org.apache.hadoop.hive.ql.metadata.HiveException;
    +import org.apache.hadoop.hive.ql.parse.SemanticException;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
    +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
    +import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructField;
    +import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
    +import org.apache.hadoop.io.IntWritable;
    +
    +import java.util.ArrayList;
    +import java.util.Collections;
    +import java.util.Map;
    +import java.util.SortedMap;
    +import java.util.TreeMap;
    +
    +/**
    + * Convert two aggregated columns into a fixed-size sorted map.
    + */
    +@Description(name = "to_bounded_ordered_map",
    +        value = "_FUNC_(key, value, size [, const boolean reverseOrder=false]) "
    +                + "- Convert two aggregated columns into a fixed-size sorted map")
    +public class UDAFToBoundedOrderedMap extends UDAFToMap {
    +
    +    @Override
    +    public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
    +            throws SemanticException {
    +        @SuppressWarnings("deprecation")
    +        TypeInfo[] typeInfo = info.getParameters();
    +        if (typeInfo.length != 3 && typeInfo.length != 4) {
    +            throw new UDFArgumentTypeException(typeInfo.length - 1,
    +                "Expecting three or four arguments: " + typeInfo.length);
    +        }
    +
    +        if (typeInfo[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
    +            throw new UDFArgumentTypeException(0,
    +                "Only primitive type arguments are accepted for the key but "
    +                        + typeInfo[0].getTypeName() + " was passed as parameter 1.");
    +        }
    +
    +        if (!HiveUtils.isIntegerTypeInfo(typeInfo[2])) {
    +            throw new UDFArgumentTypeException(2, "The third argument must be integer type: "
    +                    + typeInfo[2].getTypeName());
    +        }
    +
    +        boolean reverseOrder = false;
    +        if (typeInfo.length == 4) {
    +            if (!HiveUtils.isBooleanTypeInfo(typeInfo[3])) {
    +                throw new UDFArgumentTypeException(3, "The fourth argument must be boolean type: "
    +                        + typeInfo[3].getTypeName());
    +            }
    +            ObjectInspector[] argOIs = info.getParameterObjectInspectors();
    +            reverseOrder = HiveUtils.getConstBoolean(argOIs[3]);
    +        }
    +
    +        if (reverseOrder) {
    +            return new BoundedReverseOrderedMapEvaluator();
    +        } else {
    +            return new BoundedOrderedMapEvaluator();
    +        }
    +    }
    +
    +    public static class BoundedOrderedMapEvaluator extends GenericUDAFEvaluator {
    +
    +        protected PrimitiveObjectInspector inputKeyOI;
    +        protected ObjectInspector inputValueOI;
    +        protected StandardMapObjectInspector partialMapOI;
    +        protected PrimitiveObjectInspector sizeOI;
    +
    +        protected StructObjectInspector internalMergeOI;
    +
    +        protected StructField partialMapField;
    +        protected StructField sizeField;
    +
    +        @Override
    +        public ObjectInspector init(Mode mode, ObjectInspector[] argOIs) throws HiveException {
    +            assert (argOIs.length == 3) : argOIs.length;
    --- End diff --
    
    argOIs.length might be 4


---

[GitHub] incubator-hivemall pull request #108: [HIVEMALL-138] `to_ordered_map` & `to_...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-hivemall/pull/108


---

[GitHub] incubator-hivemall issue #108: [WIP][HIVEMALL-138] `to_ordered_map` UDAF wit...

Posted by myui <gi...@git.apache.org>.
Github user myui commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/108
  
    @takuti Sorry, I'm considering renaming it to `to_ordered_list`.


---