Posted to user@pig.apache.org by "Sabbidi, Prashanth" <Pr...@VerizonWireless.com.INVALID> on 2015/12/10 21:02:48 UTC

DataFu HyperLogLogPlusPlus not using combiner though it is Algebraic

Hi,

I am trying to use the HyperLogLogPlusPlus UDF from the Apache DataFu library (version 1.3). According to the UDF source code, it implements the Algebraic interface. However, the Pig script does not use the combiner in this case.
Can someone please advise why this is so?


rawData = load '$app_analytics_application' using org.apache.hive.hcatalog.pig.HCatLoader() as (
    mdn:chararray,
    app_version:chararray,
    app_name:chararray
);

A = GROUP rawData BY (app_name, app_version);
B = FOREACH A GENERATE group.app_name, HyperLogLogPlusPlus(rawData.mdn);

EXPLAIN B;




#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-15
Map Plan
A: Local Rearrange[tuple]{tuple}(false) - scope-3
|   |
|   Project[chararray][33] - scope-4
|   |
|   Project[chararray][2] - scope-5
|
|---rawData: Load(mobile_diag_dev_tbls.app_analytics_application:org.apache.hive.hcatalog.pig.HCatLoader) - scope-0--------
Reduce Plan
B: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-14
|
|---B: New For Each(false,false)[bag] - scope-13
    |   |
    |   Project[chararray][0] - scope-7
    |   |
    |   |---Project[tuple][0] - scope-6
    |   |
    |   POUserFunc(datafu.pig.stats.HyperLogLogPlusPlus)[long] - scope-11
    |   |
    |   |---Project[bag][0] - scope-10
    |       |
    |       |---Project[bag][1] - scope-9
    |
    |---A: Package(Packager)[tuple]{tuple} - scope-2--------
Global sort: false
----------------