Posted to dev@hive.apache.org by "r7raul1984@163.com" <r7...@163.com> on 2015/04/02 03:32:46 UTC

Hive 0.14 on some platforms returns some non-NULL values as NULL

I use the Hive 1.1.0 CLI on computer A (Linux). The result is:
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356   91501191044440048       7326356         NULL
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356   121501191035580028      7326356         NULL
UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG        2362223711289   161501191549050061      14837289      NULL
Y49EY895ACABHS95DRQEE8DVFEB8JSE1        2360853052224   111501191426280023      115883224       NULL
But when I use the Hive 0.14 CLI in my test environment, the result is correct.

I use Hive 0.10 on computer B (Linux). The result is:
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356   91501191044440048       7326356        2015-01-19 10:44:44
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356   121501191035580028      7326356        2015-01-19 10:35:58
UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG        2362223711289   161501191549050061      14837289     2015-01-19 15:49:05
Y49EY895ACABHS95DRQEE8DVFEB8JSE1        2360853052224   111501191426280023      115883224       2015-01-19 14:26:28

Why?
I have attached my log. In the log I also found: 2015-04-01 09:55:38,409 WARN [main] org.apache.hadoop.hive.serde2.lazy.LazyStruct: Extra bytes detected at the end of the row! Ignoring similar problems.
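
One possible reading of this warning, offered as a hedged illustration: in a delimited text table, a row that holds more fields than the reader's schema declares leaves trailing bytes, and a column the reader does not know about comes back as NULL. The sketch below is a toy Python model, not Hive's actual LazyStruct code. The row contents, the parse_row and project helpers, and the Ctrl-A delimiter are all assumptions made for illustration.

```python
# Toy model only: this is NOT Hive's LazyStruct code, just an illustration
# of reading a delimited row against a schema with too few columns.

def parse_row(line, column_names, delimiter="\x01"):
    """Split a text row and map its fields onto column_names.

    Returns (record, extra_fields): the columns the schema knows about,
    plus any trailing fields the schema has no column for.
    """
    fields = line.split(delimiter)
    record = {}
    for i, name in enumerate(column_names):
        # A missing field or the \N marker is treated as NULL (None here).
        value = fields[i] if i < len(fields) else None
        record[name] = None if value == "\\N" else value
    return record, fields[len(column_names):]


def project(record, wanted_columns):
    """Look up the wanted columns; anything the reader never parsed is NULL."""
    return [record.get(name) for name in wanted_columns]


# Hypothetical five-field row; the column names are taken from the query.
row = "\x01".join(
    ["S1", "2357378283356", "91501191044440048", "7326356", "2015-01-19 10:44:44"]
)
full_schema = ["sessn_id", "ordr_code", "cart_tracker_id", "end_user_id", "cart_track_time"]
short_schema = full_schema[:4]  # a reader that does not know cart_track_time

full_record, full_extra = parse_row(row, full_schema)
short_record, short_extra = parse_row(row, short_schema)

print(project(full_record, full_schema))
print(project(short_record, full_schema), "extra fields:", short_extra)
```

With the full schema the last column is populated. With the short schema the same row yields None for cart_track_time and one leftover field, which resembles the NULL column and the "extra bytes" symptom described here.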



r7raul1984@163.com

Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.
Sorry, I checked again: my production JDK is java version "1.7.0_45", not java version "1.6.0_35".



r7raul1984@163.com
 
From: r7raul1984@163.com
Date: 2015-04-02 17:01
To: dev
Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL
I downloaded the full data from HDFS and loaded it into my table in my test environment. Everything is OK there.
My production environment is Hadoop 2.3.0-cdh 5.0.2, Red Hat 5.8, java version "1.6.0_35".





r7raul1984@163.com
 
From: r7raul1984@163.com
Date: 2015-04-02 16:57
To: dev
Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL
In my test environment I use Hive 0.14 and Hive 1.1.0, and the result is OK.
But in the production environment the result is not correct.



r7raul1984@163.com
 
From: Thejas Nair
Date: 2015-04-02 16:41
To: r7raul1984@163.com
CC: dev
Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL
I am unable to reproduce this issue using the sample data.
 
For this query, using 1.1.0, I get the following result:
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356
91501191044440048       7326356 2015-01-19 10:44:442015-01-19
 
On Thu, Apr 2, 2015 at 12:36 AM, r7raul1984@163.com <r7...@163.com> wrote:
>
> DDL is
> CREATE TABLE dw.fct_traffic_navpage_path_detl(
> date_id string,
> chanl_id bigint,
> sessn_id string,
> gu_id string,
> prov_id string,
> city_id string,
> landing_page_type_id string,
> landing_track_time string,
> landing_url string,
> nav_refer_tracker_id string,
> nav_refer_page_type_id string,
> nav_refer_page_value string,
> nav_refer_link_position string,
> nav_tracker_id string,
> nav_page_categ_id string,
> nav_page_type_id string,
> nav_page_value string,
> nav_srce_type string,
> internal_keyword string,
> internal_result_sum string,
> pltfm_id int,
> app_vers string,
> nav_link_position string,
> nav_button_position string,
> nav_track_time string,
> nav_next_tracker_id string,
> sessn_last_time string,
> sessn_pv int,
> detl_tracker_id string,
> detl_page_type_id string,
> detl_page_value string,
> detl_pm_id bigint,
> detl_link_position string,
> detl_position_track_id string,
> cart_tracker_id string,
> cart_page_type_id string,
> cart_page_value string,
> cart_link_postion string,
> cart_button_position string,
> cart_position_track_id string,
> cart_prod_id bigint,
> ordr_tracker_id string,
> ordr_page_type_id string,
> ordr_code string,
> updt_time string,
> cart_pm_id bigint,
> brand_code string,
> categ_type int,
> os string,
> end_user_id string,
> add_cart_flag string,
> navgation_page_flag int,
> nav_page_url string,
> detl_button_position string,
> manul_flag int,
> manul_track_date string,
> nav_refer_tpa string,
> nav_refer_tpa_id string,
> nav_refer_tpc string,
> nav_refer_tpi string,
> nav_refer_tcs string,
> nav_refer_tcsa string,
> nav_refer_tcdt string,
> nav_refer_tcd string,
> nav_refer_tci string,
> nav_refer_postn_type string,
> nav_tpa_id string,
> nav_tpa string,
> nav_tpc string,
> nav_tpi string,
> nav_tcs string,
> nav_tcsa string,
> nav_tcdt string,
> nav_tcd string,
> nav_tci string,
> nav_postn_type string,
> detl_tpa_id string,
> detl_tpa string,
> detl_tpc string,
> detl_tpi string,
> detl_tcs string,
> detl_tcsa string,
> detl_tcdt string,
> detl_tcd string,
> detl_tci string,
> detl_postn_type string,
> cart_tpa_id string,
> cart_tpa string,
> cart_tpc string,
> cart_tpi string,
> cart_tcs string,
> cart_tcsa string,
> cart_tcdt string,
> cart_tcd string,
> cart_tci string,
> cart_postn_type string,
> sessn_chanl_id bigint,
> gu_sec_flg bigint,
> detl_refer_page_type_id string,
> detl_refer_page_value string,
> detl_event_id string,
> nav_refer_intrn_reslt_sum string,
> nav_intrn_reslt_sum string,
> nav_refer_intrn_kw string,
> nav_intrn_kw string,
> detl_track_time string,
> cart_track_time string)
> PARTITIONED BY (
> ds string)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
> '/user/hive/dw/fct_traffic_navpage_path_detl'
> TBLPROPERTIES (
> 'numPartitions'='265',
> 'numFiles'='26677',
> 'last_modified_by'='bi_etl',
> 'last_modified_time'='1423633028',
> 'transient_lastDdlTime'='1427870517',
> 'numRows'='0',
> 'totalSize'='8268127466928',
> 'rawDataSize'='0')
>
> My query is:
>
> SELECT a1.sessn_id,
>        a1.ordr_code,
>        a1.cart_tracker_id,
>        a1.end_user_id,
>        a1.cart_track_time
> FROM   dw.fct_traffic_navpage_path_detl a1
> WHERE  a1.ds = '2015-01-19'
> AND    a1.cart_tracker_id > 0
> AND    (a1.cart_button_position IS NULL OR length(a1.cart_button_position) = 0)
> AND    a1.sessn_id IN ('Y49EY895ACABHS95DRQEE8DVFEB8JSE1',
>                        'UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG',
>                        '87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM')
>
> I have attached my sample data.
>
>
> ________________________________
> r7raul1984@163.com
>
>
> From: Thejas Nair
> Date: 2015-04-02 15:28
> To: dev
> Subject: Re: hive 0.14 on some platform return some not NULL value as NULL
> Can you give more details:
> - the query you are running
> - schema of the table
> - serialization format of the table, sample records if possible.
Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.
I downloaded the full data from HDFS and loaded it into my table in my test environment. Everything is OK there.
My production environment is Hadoop 2.3.0-cdh 5.0.2, Red Hat 5.8, java version "1.6.0_35".





r7raul1984@163.com
 

Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.
In my test environment I use Hive 0.14 and Hive 1.1.0, and the result is OK.
But in the production environment the result is not correct.



r7raul1984@163.com
 

Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by Thejas Nair <th...@gmail.com>.
I am unable to reproduce this issue using the sample data.

For this query, using 1.1.0, I get the following result:
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM        2357378283356
91501191044440048       7326356 2015-01-19 10:44:442015-01-19


Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.
I pointed Hive 0.14 at the Hive 0.10 metastore server, and the problem is fixed. Hive 0.14 now returns the correct result.



r7raul1984@163.com
 
From: r7raul1984@163.com
Date: 2015-04-07 10:34
To: dev
CC: thejas.nair
Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL
 
 
I found a difference in the logs:
In Hive 0.14:
DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, 
string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string] separator=[[B@e50bca4] nullstring=\N lastColumnTakesRest=false 
 
In Hive 0.10:
DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type, sessn_chanl_id, gu_sec_flg, detl_refer_page_type_id, detl_refer_page_value, detl_event_id, nav_refer_intrn_reslt_sum, nav_intrn_reslt_sum, nav_refer_intrn_kw, nav_intrn_kw, detl_track_time, cart_track_time] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, 
string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, bigint, bigint, string, string, string, string, string, string, string, string, string] separator=[[B@116265c3] nullstring=\N lastColumnTakesRest=false 
 
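As you can see, Hive 0.14 lost some column info: its columnNames list ends at cart_postn_type, while the Hive 0.10 list continues through cart_track_time. Why? BTW, my metastore database schema is still at Hive 0.10; it has not been upgraded to Hive 0.14.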
 
 
 
r7raul1984@163.com

Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.

I found a difference in the logs:
In hive 0.14 
DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, 
string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string] separator=[[B@e50bca4] nullstring=\N lastColumnTakesRest=false 

In hive 0.10 
DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type, sessn_chanl_id, gu_sec_flg, detl_refer_page_type_id, detl_refer_page_value, detl_event_id, nav_refer_intrn_reslt_sum, nav_intrn_reslt_sum, nav_refer_intrn_kw, nav_intrn_kw, detl_track_time, cart_track_time] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, 
string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, bigint, bigint, string, string, string, string, string, string, string, string, string] separator=[[B@116265c3] nullstring=\N lastColumnTakesRest=false 

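As you can see, Hive 0.14 lost some column info: its columnNames list ends at cart_postn_type, while the Hive 0.10 list continues through cart_track_time. Why? BTW, my metastore database schema is still at Hive 0.10; it has not been upgraded to Hive 0.14.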
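
For anyone who wants to repeat this comparison without reading the two long log lines by eye, here is a minimal Python sketch. It assumes only the "columnNames=[...]" layout shown in the DEBUG lines above; the helper names (column_names, missing_columns) and the two shortened log lines are made up for illustration and are not part of Hive.

```python
import re


def column_names(log_line):
    """Return the names inside columnNames=[...] of a LazySimpleSerDe DEBUG line."""
    match = re.search(r"columnNames=\[([^\]]*)\]", log_line)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def missing_columns(reference_line, other_line):
    """Columns listed in reference_line but absent from other_line, in original order."""
    present = set(column_names(other_line))
    return [name for name in column_names(reference_line) if name not in present]


# Hypothetical, heavily shortened stand-ins for the two log lines quoted above.
hive_010 = ("DEBUG lazy.LazySimpleSerDe: ... initialized with: "
            "columnNames=[cart_tci, cart_postn_type, sessn_chanl_id, "
            "detl_track_time, cart_track_time] columnTypes=[string, string, "
            "bigint, string, string]")
hive_014 = ("DEBUG lazy.LazySimpleSerDe: ... initialized with: "
            "columnNames=[cart_tci, cart_postn_type] columnTypes=[string, string]")

print(missing_columns(hive_010, hive_014))
```

Pasting the full log lines in place of the shortened strings should list every column that the Hive 0.14 SerDe was not initialized with.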



r7raul1984@163.com
 

Re: Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by "r7raul1984@163.com" <r7...@163.com>.
The DDL is:
CREATE TABLE dw.fct_traffic_navpage_path_detl( 
date_id string, 
chanl_id bigint, 
sessn_id string, 
gu_id string, 
prov_id string, 
city_id string, 
landing_page_type_id string, 
landing_track_time string, 
landing_url string, 
nav_refer_tracker_id string, 
nav_refer_page_type_id string, 
nav_refer_page_value string, 
nav_refer_link_position string, 
nav_tracker_id string, 
nav_page_categ_id string, 
nav_page_type_id string, 
nav_page_value string, 
nav_srce_type string, 
internal_keyword string, 
internal_result_sum string, 
pltfm_id int, 
app_vers string, 
nav_link_position string, 
nav_button_position string, 
nav_track_time string, 
nav_next_tracker_id string, 
sessn_last_time string, 
sessn_pv int, 
detl_tracker_id string, 
detl_page_type_id string, 
detl_page_value string, 
detl_pm_id bigint, 
detl_link_position string, 
detl_position_track_id string, 
cart_tracker_id string, 
cart_page_type_id string, 
cart_page_value string, 
cart_link_postion string, 
cart_button_position string, 
cart_position_track_id string, 
cart_prod_id bigint, 
ordr_tracker_id string, 
ordr_page_type_id string, 
ordr_code string, 
updt_time string, 
cart_pm_id bigint, 
brand_code string, 
categ_type int, 
os string, 
end_user_id string, 
add_cart_flag string, 
navgation_page_flag int, 
nav_page_url string, 
detl_button_position string, 
manul_flag int, 
manul_track_date string, 
nav_refer_tpa string, 
nav_refer_tpa_id string, 
nav_refer_tpc string, 
nav_refer_tpi string, 
nav_refer_tcs string, 
nav_refer_tcsa string, 
nav_refer_tcdt string, 
nav_refer_tcd string, 
nav_refer_tci string, 
nav_refer_postn_type string, 
nav_tpa_id string, 
nav_tpa string, 
nav_tpc string, 
nav_tpi string, 
nav_tcs string, 
nav_tcsa string, 
nav_tcdt string, 
nav_tcd string, 
nav_tci string, 
nav_postn_type string, 
detl_tpa_id string, 
detl_tpa string, 
detl_tpc string, 
detl_tpi string, 
detl_tcs string, 
detl_tcsa string, 
detl_tcdt string, 
detl_tcd string, 
detl_tci string, 
detl_postn_type string, 
cart_tpa_id string, 
cart_tpa string, 
cart_tpc string, 
cart_tpi string, 
cart_tcs string, 
cart_tcsa string, 
cart_tcdt string, 
cart_tcd string, 
cart_tci string, 
cart_postn_type string, 
sessn_chanl_id bigint, 
gu_sec_flg bigint, 
detl_refer_page_type_id string, 
detl_refer_page_value string, 
detl_event_id string, 
nav_refer_intrn_reslt_sum string, 
nav_intrn_reslt_sum string, 
nav_refer_intrn_kw string, 
nav_intrn_kw string, 
detl_track_time string, 
cart_track_time string) 
PARTITIONED BY ( 
ds string) 
ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 
'/user/hive/dw/fct_traffic_navpage_path_detl' 
TBLPROPERTIES ( 
'numPartitions'='265', 
'numFiles'='26677', 
'last_modified_by'='bi_etl', 
'last_modified_time'='1423633028', 
'transient_lastDdlTime'='1427870517', 
'numRows'='0', 
'totalSize'='8268127466928', 
'rawDataSize'='0') 

My query is:
SELECT a1.sessn_id,
       a1.ordr_code,
       a1.cart_tracker_id,
       a1.end_user_id,
       a1.cart_track_time
FROM   dw.fct_traffic_navpage_path_detl a1
WHERE  a1.ds = '2015-01-19'
AND    a1.cart_tracker_id > 0
AND    (a1.cart_button_position IS NULL OR length(a1.cart_button_position) = 0)
AND    a1.sessn_id IN ('Y49EY895ACABHS95DRQEE8DVFEB8JSE1',
                       'UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG',
                       '87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM')


I attach my sample data.
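
Note that the query above selects cart_track_time, which is the last column in the DDL, and that the Hive 0.14 log elsewhere in this thread shows a column list that stops at cart_postn_type. The following Python sketch is a simplified model (not Hive's actual LazyStruct code) of how a delimited-text reader behaves when its schema is shorter than the row: trailing fields are left over, and a column outside the schema cannot be resolved. The function name parse_row, the three-column toy schema, and the Ctrl-A separator (the usual LazySimpleSerDe default when the DDL sets no delimiter) are assumptions made for illustration.

```python
def parse_row(line, schema_columns, separator="\x01", null_string="\\N"):
    """Split one text row and map the fields onto the given schema.

    Returns (row, extra_bytes): row maps each schema column to its value,
    or None when the field is missing or equals the null marker;
    extra_bytes is True when the row has more fields than the schema.
    """
    fields = line.split(separator)
    extra_bytes = len(fields) > len(schema_columns)
    row = {}
    for index, name in enumerate(schema_columns):
        if index < len(fields) and fields[index] != null_string:
            row[name] = fields[index]
        else:
            row[name] = None
    return row, extra_bytes


# Toy schema: three columns stand in for the full table definition.
full_schema = ["sessn_id", "end_user_id", "cart_track_time"]
short_schema = full_schema[:2]  # a reader that lost the trailing column

line = "\x01".join(
    ["Y49EY895ACABHS95DRQEE8DVFEB8JSE1", "115883224", "2015-01-19 14:26:28"]
)

full_row, full_extra = parse_row(line, full_schema)
short_row, short_extra = parse_row(line, short_schema)

print(full_row["cart_track_time"], full_extra)
print(short_row.get("cart_track_time"), short_extra)
```

In this model the short schema leaves unread data at the end of the row and has no value for cart_track_time, which is consistent with both the "Extra bytes detected at the end of the row" warning and the NULL results, but it does not explain why the column list was shortened in the first place.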



r7raul1984@163.com
 

Re: hive 0.14 on some platform return some not NULL value as NULL

Posted by Thejas Nair <th...@gmail.com>.
Can you give more details?
- the query you are running
- schema of the table
- serialization format of the table, sample records if possible.

