You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Aihua Xu <ax...@cloudera.com> on 2015/01/05 15:51:01 UTC

Exception with the window function

Hi folks,

I’m working on the HIVE-9228 (https://issues.apache.org/jira/browse/HIVE-9228) with the following window query throwing ArrayOutOfBoundExcption. 

select st_fips_cd, zip_cd_5, hh_surr_key
from
(
select st_fips_cd, zip_cd_5, hh_surr_key,
count( case when advtg_len_rsdnc_cd = '1' then 1 end ) over (partition by st_fips_cd, zip_cd_5) as CNT_ADVTG_LEN_RSDNC_CD_1,
row_number() over (partition by st_fips_cd, zip_cd_5 order by hh_surr_key asc) as analytic_row_number3
from hh_agg
where analytic_row_number2 = 1
) t;


Here is the explain extend output for the query.  At the File Sink Operator of stage 1 below, seems like it should only output 4 columns while the temp table in fact output one additional column (the value of CNT_ADVTG_LEN_RSDNC_CD_1). I’m investigating toward such mismatch, but anyone can confirm and provide additional info that will be helpful.

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: hh_agg
            Statistics: Num rows: 33208 Data size: 10361206 Basic stats: COMPLETE Column stats: NONE
            GatherStats: false
            Filter Operator
              isSamplingPred: false
              predicate: (analytic_row_number2 = 1) (type: boolean)
              Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: st_fips_cd (type: string), zip_cd_5 (type: string), st_fips_cd (type: string), zip_cd_5 (type: string)
                sort order: ++++
                Map-reduce partition columns: st_fips_cd (type: string), zip_cd_5 (type: string)
                Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
                tag: -1
                value expressions: st_fips_cd (type: string), zip_cd_5 (type: string), hh_surr_key (type: bigint), advtg_len_rsdnc_cd (type: string)
      Path -> Alias:
        file:/Users/axu/Documents/localDB/23982_debug [t:hh_agg]
      Path -> Partition:
        file:/Users/axu/Documents/localDB/23982_debug 
          Partition
            base file name: 23982_debug
            input format: org.apache.hadoop.mapred.TextInputFormat
            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
            properties:
              COLUMN_STATS_ACCURATE true
              EXTERNAL TRUE
              bucket_count -1
              columns st_fips_cd,zip_cd_5,hh_surr_key,nbr_hh_in_zip,nbr_nr_adults_in_hh,hh_pop,advtg_len_rsdnc_cd,advtg_home_ownr_cd,dsf_season_cd,advtg_hh_edu_cd,advtg_hh_occupn_cd,advtg_child_presnc_cd,advtg_hh_age_cd,zip_avg_age,zip_mdn_age,mail_rspns_buy_cd,cnt_gend_cd_1,cnt_gend_cd_2,cnt_gend_cd_3,cnt_gend_cd_unk,cnt_advtg_marital_stat_cd_1,cnt_advtg_marital_stat_cd_2,cnt_advtg_marital_stat_cd_unk,cnt_nbr_tradeline_0,cnt_nbr_tradeline_1,cnt_nbr_tradeline_2,cnt_nbr_tradeline_3,cnt_nbr_tradeline_4,cnt_nbr_tradeline_5,cnt_nbr_tradeline_6,cnt_nbr_tradeline_7,cnt_nbr_tradeline_8,cnt_nbr_tradeline_9,cnt_nbr_tradeline_unk,advtg_dwell_type_cd,prprty_mkt_val_cd,zip_avg_prprty_mkt_val,zip_mdn_prprty_mkt_val,zip_avg_home_eqty_amt,zip_mdn_home_eqty_amt,trgt_inc_cd,zip_avg_trgt_inc_narrow_band,zip_mdn_trgt_inc_narrow_band,zip_avg_inc_prodc_asset_cd,zip_mdn_inc_prodc_asset_cd,zip_avg_net_wrth_cd,zip_mdn_net_wrth_cd,rylty_trgt_mktg_val_scr_cd,analytic_row_number2
              columns.comments 
              columns.types string:string:bigint:bigint:bigint:tinyint:string:string:string:string:string:string:string:double:double:int:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:string:string:double:double:double:double:string:double:double:double:double:double:double:string:int
              field.delim ,
              file.inputformat org.apache.hadoop.mapred.TextInputFormat
              file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              location file:/Users/axu/Documents/localDB/23982_debug
              name default.hh_agg
              numFiles 0
              numRows 0
              rawDataSize 0
              serialization.ddl struct hh_agg { string st_fips_cd, string zip_cd_5, i64 hh_surr_key, i64 nbr_hh_in_zip, i64 nbr_nr_adults_in_hh, byte hh_pop, string advtg_len_rsdnc_cd, string advtg_home_ownr_cd, string dsf_season_cd, string advtg_hh_edu_cd, string advtg_hh_occupn_cd, string advtg_child_presnc_cd, string advtg_hh_age_cd, double zip_avg_age, double zip_mdn_age, i32 mail_rspns_buy_cd, i64 cnt_gend_cd_1, i64 cnt_gend_cd_2, i64 cnt_gend_cd_3, i64 cnt_gend_cd_unk, i64 cnt_advtg_marital_stat_cd_1, i64 cnt_advtg_marital_stat_cd_2, i64 cnt_advtg_marital_stat_cd_unk, i64 cnt_nbr_tradeline_0, i64 cnt_nbr_tradeline_1, i64 cnt_nbr_tradeline_2, i64 cnt_nbr_tradeline_3, i64 cnt_nbr_tradeline_4, i64 cnt_nbr_tradeline_5, i64 cnt_nbr_tradeline_6, i64 cnt_nbr_tradeline_7, i64 cnt_nbr_tradeline_8, i64 cnt_nbr_tradeline_9, i64 cnt_nbr_tradeline_unk, string advtg_dwell_type_cd, string prprty_mkt_val_cd, double zip_avg_prprty_mkt_val, double zip_mdn_prprty_mkt_val, double zip_avg_home_eqty_amt, double zip_mdn_home_eqty_amt, string trgt_inc_cd, double zip_avg_trgt_inc_narrow_band, double zip_mdn_trgt_inc_narrow_band, double zip_avg_inc_prodc_asset_cd, double zip_mdn_inc_prodc_asset_cd, double zip_avg_net_wrth_cd, double zip_mdn_net_wrth_cd, string rylty_trgt_mktg_val_scr_cd, i32 analytic_row_number2}
              serialization.format ,
              serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              totalSize 0
              transient_lastDdlTime 1419367415
            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              properties:
                COLUMN_STATS_ACCURATE true
                EXTERNAL TRUE
                bucket_count -1
                columns st_fips_cd,zip_cd_5,hh_surr_key,nbr_hh_in_zip,nbr_nr_adults_in_hh,hh_pop,advtg_len_rsdnc_cd,advtg_home_ownr_cd,dsf_season_cd,advtg_hh_edu_cd,advtg_hh_occupn_cd,advtg_child_presnc_cd,advtg_hh_age_cd,zip_avg_age,zip_mdn_age,mail_rspns_buy_cd,cnt_gend_cd_1,cnt_gend_cd_2,cnt_gend_cd_3,cnt_gend_cd_unk,cnt_advtg_marital_stat_cd_1,cnt_advtg_marital_stat_cd_2,cnt_advtg_marital_stat_cd_unk,cnt_nbr_tradeline_0,cnt_nbr_tradeline_1,cnt_nbr_tradeline_2,cnt_nbr_tradeline_3,cnt_nbr_tradeline_4,cnt_nbr_tradeline_5,cnt_nbr_tradeline_6,cnt_nbr_tradeline_7,cnt_nbr_tradeline_8,cnt_nbr_tradeline_9,cnt_nbr_tradeline_unk,advtg_dwell_type_cd,prprty_mkt_val_cd,zip_avg_prprty_mkt_val,zip_mdn_prprty_mkt_val,zip_avg_home_eqty_amt,zip_mdn_home_eqty_amt,trgt_inc_cd,zip_avg_trgt_inc_narrow_band,zip_mdn_trgt_inc_narrow_band,zip_avg_inc_prodc_asset_cd,zip_mdn_inc_prodc_asset_cd,zip_avg_net_wrth_cd,zip_mdn_net_wrth_cd,rylty_trgt_mktg_val_scr_cd,analytic_row_number2
                columns.comments 
                columns.types string:string:bigint:bigint:bigint:tinyint:string:string:string:string:string:string:string:double:double:int:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:string:string:double:double:double:double:string:double:double:double:double:double:double:string:int
                field.delim ,
                file.inputformat org.apache.hadoop.mapred.TextInputFormat
                file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                location file:/Users/axu/Documents/localDB/23982_debug
                name default.hh_agg
                numFiles 0
                numRows 0
                rawDataSize 0
                serialization.ddl struct hh_agg { string st_fips_cd, string zip_cd_5, i64 hh_surr_key, i64 nbr_hh_in_zip, i64 nbr_nr_adults_in_hh, byte hh_pop, string advtg_len_rsdnc_cd, string advtg_home_ownr_cd, string dsf_season_cd, string advtg_hh_edu_cd, string advtg_hh_occupn_cd, string advtg_child_presnc_cd, string advtg_hh_age_cd, double zip_avg_age, double zip_mdn_age, i32 mail_rspns_buy_cd, i64 cnt_gend_cd_1, i64 cnt_gend_cd_2, i64 cnt_gend_cd_3, i64 cnt_gend_cd_unk, i64 cnt_advtg_marital_stat_cd_1, i64 cnt_advtg_marital_stat_cd_2, i64 cnt_advtg_marital_stat_cd_unk, i64 cnt_nbr_tradeline_0, i64 cnt_nbr_tradeline_1, i64 cnt_nbr_tradeline_2, i64 cnt_nbr_tradeline_3, i64 cnt_nbr_tradeline_4, i64 cnt_nbr_tradeline_5, i64 cnt_nbr_tradeline_6, i64 cnt_nbr_tradeline_7, i64 cnt_nbr_tradeline_8, i64 cnt_nbr_tradeline_9, i64 cnt_nbr_tradeline_unk, string advtg_dwell_type_cd, string prprty_mkt_val_cd, double zip_avg_prprty_mkt_val, double zip_mdn_prprty_mkt_val, double zip_avg_home_eqty_amt, double zip_mdn_home_eqty_amt, string trgt_inc_cd, double zip_avg_trgt_inc_narrow_band, double zip_mdn_trgt_inc_narrow_band, double zip_avg_inc_prodc_asset_cd, double zip_mdn_inc_prodc_asset_cd, double zip_avg_net_wrth_cd, double zip_mdn_net_wrth_cd, string rylty_trgt_mktg_val_scr_cd, i32 analytic_row_number2}
                serialization.format ,
                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                totalSize 0
                transient_lastDdlTime 1419367415
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.hh_agg
            name: default.hh_agg
      Truncated Path -> Alias:
        file:/Users/axu/Documents/localDB/23982_debug [t:hh_agg]
      Needs Tagging: false
      Reduce Operator Tree:
        Extract
          Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
          PTF Operator
            Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: false
              GlobalTableId: 0
              directory: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003
              NumFilesPerFileSink: 1
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                  properties:
                    columns _col0,_col1,_col2,_col6
                    columns.types string,string,bigint,string
                    escape.delim \
                    serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
                  serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
              TotalFiles: 1
              GatherStats: false
              MultiFileSpray: false

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            GatherStats: false
            Reduce Output Operator
              key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: bigint)
              sort order: +++
              Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
              Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
              tag: -1
              value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: bigint)
      Path -> Alias:
        file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 [file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003]
      Path -> Partition:
        file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 
          Partition
            base file name: -mr-10003
            input format: org.apache.hadoop.mapred.SequenceFileInputFormat
            output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
            properties:
              columns _col0,_col1,_col2,_col6
              columns.types string,string,bigint,string
              escape.delim \
              serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
            serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
          
              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
              properties:
                columns _col0,_col1,_col2,_col6
                columns.types string,string,bigint,string
                escape.delim \
                serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
              serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
      Truncated Path -> Alias:
        file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 [file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003]
      Needs Tagging: false
      Reduce Operator Tree:
        Extract
          Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
          PTF Operator
            Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: _col1 (type: string), _col2 (type: string), _col3 (type: bigint)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                GlobalTableId: 0
                directory: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10000/.hive-staging_hive_2015-01-01_13-25-01_841_2717056365737730087-1/-ext-10001
                NumFilesPerFileSink: 1
                Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
                Stats Publishing Key Prefix: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10000/.hive-staging_hive_2015-01-01_13-25-01_841_2717056365737730087-1/-ext-10001/
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    properties:
                      columns _col0,_col1,_col2
                      columns.types string:string:bigint
                      escape.delim \
                      hive.serialization.extend.nesting.levels true
                      serialization.format 1
                      serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                TotalFiles: 1
                GatherStats: false
                MultiFileSpray: false

  Stage: Stage-0
    Fetch Operator
      limit: -1

Fwd: Exception with the window function

Posted by Aihua Xu <ax...@cloudera.com>.
Any one has insights on the following issue? Where would be the best place to make the changes? I have tried several approaches but it breaks some other queries. 

Basically, the inner query produces some additional columns (from window function) not consumed by the outer query. The query plan says we won’t generate those columns since they are not used (which seems correct), while during the execution, we DO output the additional column to the temp file. That causes the issue when we read the temp file in later.

Best regards,
Aihua Xu
Software Engineer, CCE
axu@cloudera.com

Begin forwarded message:

> From: Aihua Xu <ax...@cloudera.com>
> Subject: Exception with the window function
> Date: January 5, 2015 at 6:51:01 AM PST
> To: user@hive.apache.org
> 
> Hi folks,
> 
> I’m working on the HIVE-9228 (https://issues.apache.org/jira/browse/HIVE-9228) with the following window query throwing ArrayOutOfBoundExcption. 
> 
> select st_fips_cd, zip_cd_5, hh_surr_key
> from
> (
> select st_fips_cd, zip_cd_5, hh_surr_key,
> count( case when advtg_len_rsdnc_cd = '1' then 1 end ) over (partition by st_fips_cd, zip_cd_5) as CNT_ADVTG_LEN_RSDNC_CD_1,
> row_number() over (partition by st_fips_cd, zip_cd_5 order by hh_surr_key asc) as analytic_row_number3
> from hh_agg
> where analytic_row_number2 = 1
> ) t;
> 
> 
> Here is the explain extend output for the query.  At the File Sink Operator of stage 1 below, seems like it should only output 4 columns while the temp table in fact output one additional column (the value of CNT_ADVTG_LEN_RSDNC_CD_1). I’m investigating toward such mismatch, but anyone can confirm and provide additional info that will be helpful.
> 
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-2 depends on stages: Stage-1
>   Stage-0 is a root stage
> 
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: hh_agg
>             Statistics: Num rows: 33208 Data size: 10361206 Basic stats: COMPLETE Column stats: NONE
>             GatherStats: false
>             Filter Operator
>               isSamplingPred: false
>               predicate: (analytic_row_number2 = 1) (type: boolean)
>               Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: st_fips_cd (type: string), zip_cd_5 (type: string), st_fips_cd (type: string), zip_cd_5 (type: string)
>                 sort order: ++++
>                 Map-reduce partition columns: st_fips_cd (type: string), zip_cd_5 (type: string)
>                 Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>                 tag: -1
>                 value expressions: st_fips_cd (type: string), zip_cd_5 (type: string), hh_surr_key (type: bigint), advtg_len_rsdnc_cd (type: string)
>       Path -> Alias:
>         file:/Users/axu/Documents/localDB/23982_debug [t:hh_agg]
>       Path -> Partition:
>         file:/Users/axu/Documents/localDB/23982_debug 
>           Partition
>             base file name: 23982_debug
>             input format: org.apache.hadoop.mapred.TextInputFormat
>             output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>             properties:
>               COLUMN_STATS_ACCURATE true
>               EXTERNAL TRUE
>               bucket_count -1
>               columns st_fips_cd,zip_cd_5,hh_surr_key,nbr_hh_in_zip,nbr_nr_adults_in_hh,hh_pop,advtg_len_rsdnc_cd,advtg_home_ownr_cd,dsf_season_cd,advtg_hh_edu_cd,advtg_hh_occupn_cd,advtg_child_presnc_cd,advtg_hh_age_cd,zip_avg_age,zip_mdn_age,mail_rspns_buy_cd,cnt_gend_cd_1,cnt_gend_cd_2,cnt_gend_cd_3,cnt_gend_cd_unk,cnt_advtg_marital_stat_cd_1,cnt_advtg_marital_stat_cd_2,cnt_advtg_marital_stat_cd_unk,cnt_nbr_tradeline_0,cnt_nbr_tradeline_1,cnt_nbr_tradeline_2,cnt_nbr_tradeline_3,cnt_nbr_tradeline_4,cnt_nbr_tradeline_5,cnt_nbr_tradeline_6,cnt_nbr_tradeline_7,cnt_nbr_tradeline_8,cnt_nbr_tradeline_9,cnt_nbr_tradeline_unk,advtg_dwell_type_cd,prprty_mkt_val_cd,zip_avg_prprty_mkt_val,zip_mdn_prprty_mkt_val,zip_avg_home_eqty_amt,zip_mdn_home_eqty_amt,trgt_inc_cd,zip_avg_trgt_inc_narrow_band,zip_mdn_trgt_inc_narrow_band,zip_avg_inc_prodc_asset_cd,zip_mdn_inc_prodc_asset_cd,zip_avg_net_wrth_cd,zip_mdn_net_wrth_cd,rylty_trgt_mktg_val_scr_cd,analytic_row_number2
>               columns.comments 
>               columns.types string:string:bigint:bigint:bigint:tinyint:string:string:string:string:string:string:string:double:double:int:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:string:string:double:double:double:double:string:double:double:double:double:double:double:string:int
>               field.delim ,
>               file.inputformat org.apache.hadoop.mapred.TextInputFormat
>               file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>               location file:/Users/axu/Documents/localDB/23982_debug
>               name default.hh_agg
>               numFiles 0
>               numRows 0
>               rawDataSize 0
>               serialization.ddl struct hh_agg { string st_fips_cd, string zip_cd_5, i64 hh_surr_key, i64 nbr_hh_in_zip, i64 nbr_nr_adults_in_hh, byte hh_pop, string advtg_len_rsdnc_cd, string advtg_home_ownr_cd, string dsf_season_cd, string advtg_hh_edu_cd, string advtg_hh_occupn_cd, string advtg_child_presnc_cd, string advtg_hh_age_cd, double zip_avg_age, double zip_mdn_age, i32 mail_rspns_buy_cd, i64 cnt_gend_cd_1, i64 cnt_gend_cd_2, i64 cnt_gend_cd_3, i64 cnt_gend_cd_unk, i64 cnt_advtg_marital_stat_cd_1, i64 cnt_advtg_marital_stat_cd_2, i64 cnt_advtg_marital_stat_cd_unk, i64 cnt_nbr_tradeline_0, i64 cnt_nbr_tradeline_1, i64 cnt_nbr_tradeline_2, i64 cnt_nbr_tradeline_3, i64 cnt_nbr_tradeline_4, i64 cnt_nbr_tradeline_5, i64 cnt_nbr_tradeline_6, i64 cnt_nbr_tradeline_7, i64 cnt_nbr_tradeline_8, i64 cnt_nbr_tradeline_9, i64 cnt_nbr_tradeline_unk, string advtg_dwell_type_cd, string prprty_mkt_val_cd, double zip_avg_prprty_mkt_val, double zip_mdn_prprty_mkt_val, double zip_avg_home_eqty_amt, double zip_mdn_home_eqty_amt, string trgt_inc_cd, double zip_avg_trgt_inc_narrow_band, double zip_mdn_trgt_inc_narrow_band, double zip_avg_inc_prodc_asset_cd, double zip_mdn_inc_prodc_asset_cd, double zip_avg_net_wrth_cd, double zip_mdn_net_wrth_cd, string rylty_trgt_mktg_val_scr_cd, i32 analytic_row_number2}
>               serialization.format ,
>               serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>               totalSize 0
>               transient_lastDdlTime 1419367415
>             serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>           
>               input format: org.apache.hadoop.mapred.TextInputFormat
>               output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>               properties:
>                 COLUMN_STATS_ACCURATE true
>                 EXTERNAL TRUE
>                 bucket_count -1
>                 columns st_fips_cd,zip_cd_5,hh_surr_key,nbr_hh_in_zip,nbr_nr_adults_in_hh,hh_pop,advtg_len_rsdnc_cd,advtg_home_ownr_cd,dsf_season_cd,advtg_hh_edu_cd,advtg_hh_occupn_cd,advtg_child_presnc_cd,advtg_hh_age_cd,zip_avg_age,zip_mdn_age,mail_rspns_buy_cd,cnt_gend_cd_1,cnt_gend_cd_2,cnt_gend_cd_3,cnt_gend_cd_unk,cnt_advtg_marital_stat_cd_1,cnt_advtg_marital_stat_cd_2,cnt_advtg_marital_stat_cd_unk,cnt_nbr_tradeline_0,cnt_nbr_tradeline_1,cnt_nbr_tradeline_2,cnt_nbr_tradeline_3,cnt_nbr_tradeline_4,cnt_nbr_tradeline_5,cnt_nbr_tradeline_6,cnt_nbr_tradeline_7,cnt_nbr_tradeline_8,cnt_nbr_tradeline_9,cnt_nbr_tradeline_unk,advtg_dwell_type_cd,prprty_mkt_val_cd,zip_avg_prprty_mkt_val,zip_mdn_prprty_mkt_val,zip_avg_home_eqty_amt,zip_mdn_home_eqty_amt,trgt_inc_cd,zip_avg_trgt_inc_narrow_band,zip_mdn_trgt_inc_narrow_band,zip_avg_inc_prodc_asset_cd,zip_mdn_inc_prodc_asset_cd,zip_avg_net_wrth_cd,zip_mdn_net_wrth_cd,rylty_trgt_mktg_val_scr_cd,analytic_row_number2
>                 columns.comments 
>                 columns.types string:string:bigint:bigint:bigint:tinyint:string:string:string:string:string:string:string:double:double:int:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:bigint:string:string:double:double:double:double:string:double:double:double:double:double:double:string:int
>                 field.delim ,
>                 file.inputformat org.apache.hadoop.mapred.TextInputFormat
>                 file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                 location file:/Users/axu/Documents/localDB/23982_debug
>                 name default.hh_agg
>                 numFiles 0
>                 numRows 0
>                 rawDataSize 0
>                 serialization.ddl struct hh_agg { string st_fips_cd, string zip_cd_5, i64 hh_surr_key, i64 nbr_hh_in_zip, i64 nbr_nr_adults_in_hh, byte hh_pop, string advtg_len_rsdnc_cd, string advtg_home_ownr_cd, string dsf_season_cd, string advtg_hh_edu_cd, string advtg_hh_occupn_cd, string advtg_child_presnc_cd, string advtg_hh_age_cd, double zip_avg_age, double zip_mdn_age, i32 mail_rspns_buy_cd, i64 cnt_gend_cd_1, i64 cnt_gend_cd_2, i64 cnt_gend_cd_3, i64 cnt_gend_cd_unk, i64 cnt_advtg_marital_stat_cd_1, i64 cnt_advtg_marital_stat_cd_2, i64 cnt_advtg_marital_stat_cd_unk, i64 cnt_nbr_tradeline_0, i64 cnt_nbr_tradeline_1, i64 cnt_nbr_tradeline_2, i64 cnt_nbr_tradeline_3, i64 cnt_nbr_tradeline_4, i64 cnt_nbr_tradeline_5, i64 cnt_nbr_tradeline_6, i64 cnt_nbr_tradeline_7, i64 cnt_nbr_tradeline_8, i64 cnt_nbr_tradeline_9, i64 cnt_nbr_tradeline_unk, string advtg_dwell_type_cd, string prprty_mkt_val_cd, double zip_avg_prprty_mkt_val, double zip_mdn_prprty_mkt_val, double zip_avg_home_eqty_amt, double zip_mdn_home_eqty_amt, string trgt_inc_cd, double zip_avg_trgt_inc_narrow_band, double zip_mdn_trgt_inc_narrow_band, double zip_avg_inc_prodc_asset_cd, double zip_mdn_inc_prodc_asset_cd, double zip_avg_net_wrth_cd, double zip_mdn_net_wrth_cd, string rylty_trgt_mktg_val_scr_cd, i32 analytic_row_number2}
>                 serialization.format ,
>                 serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                 totalSize 0
>                 transient_lastDdlTime 1419367415
>               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>               name: default.hh_agg
>             name: default.hh_agg
>       Truncated Path -> Alias:
>         file:/Users/axu/Documents/localDB/23982_debug [t:hh_agg]
>       Needs Tagging: false
>       Reduce Operator Tree:
>         Extract
>           Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>           PTF Operator
>             Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               GlobalTableId: 0
>               directory: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003
>               NumFilesPerFileSink: 1
>               table:
>                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                   properties:
>                     columns _col0,_col1,_col2,_col6
>                     columns.types string,string,bigint,string
>                     escape.delim \
>                     serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>                   serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>               TotalFiles: 1
>               GatherStats: false
>               MultiFileSpray: false
> 
>   Stage: Stage-2
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             GatherStats: false
>             Reduce Output Operator
>               key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: bigint)
>               sort order: +++
>               Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
>               Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>               tag: -1
>               value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: bigint)
>       Path -> Alias:
>         file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 [file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003]
>       Path -> Partition:
>         file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 
>           Partition
>             base file name: -mr-10003
>             input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>             output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>             properties:
>               columns _col0,_col1,_col2,_col6
>               columns.types string,string,bigint,string
>               escape.delim \
>               serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>             serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>           
>               input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>               output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>               properties:
>                 columns _col0,_col1,_col2,_col6
>                 columns.types string,string,bigint,string
>                 escape.delim \
>                 serialization.lib org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>               serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
>       Truncated Path -> Alias:
>         file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003 [file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10003]
>       Needs Tagging: false
>       Reduce Operator Tree:
>         Extract
>           Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>           PTF Operator
>             Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: _col1 (type: string), _col2 (type: string), _col3 (type: bigint)
>               outputColumnNames: _col0, _col1, _col2
>               Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>               File Output Operator
>                 compressed: false
>                 GlobalTableId: 0
>                 directory: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10000/.hive-staging_hive_2015-01-01_13-25-01_841_2717056365737730087-1/-ext-10001
>                 NumFilesPerFileSink: 1
>                 Statistics: Num rows: 16604 Data size: 5180603 Basic stats: COMPLETE Column stats: NONE
>                 Stats Publishing Key Prefix: file:/var/folders/00/_7st2p5x5bg1hvpxlgm455180000gp/T/axu/hive_2015-01-01_13-25-01_841_2717056365737730087-1/-mr-10000/.hive-staging_hive_2015-01-01_13-25-01_841_2717056365737730087-1/-ext-10001/
>                 table:
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     properties:
>                       columns _col0,_col1,_col2
>                       columns.types string:string:bigint
>                       escape.delim \
>                       hive.serialization.extend.nesting.levels true
>                       serialization.format 1
>                       serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                 TotalFiles: 1
>                 GatherStats: false
>                 MultiFileSpray: false
> 
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1