You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Guy le Mar (JIRA)" <ji...@apache.org> on 2010/12/23 05:03:01 UTC
[jira] Created: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
--------------------------------------------------------------------------------------------
Key: HIVE-1863
URL: https://issues.apache.org/jira/browse/HIVE-1863
Project: Hive
Issue Type: Bug
Components: Drivers
Affects Versions: 0.5.0
Reporter: Guy le Mar
(1) Using the Hive CLI, create a table using...
create table dt4_boolean
(
dt4_id int,
dt4_testbool boolean,
dt4_string string
)
row format delimited
fields terminated by ','
lines terminated by '\n';
(2) Create a file containing the following text...
1,true,Value is True
2,null,Data says null and must be null
3,,No value that means null
4,NoIdea,Data says NoIdea that's gonna be null
5,false,Value is FALSE
(3) Load the data in the file into the Hive table...
load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
(4) Check the table works as expected using the Hive CLI...
hive> select * from dt4_boolean;
OK
1 true Value is True
2 NULL Data says null and must be null
3 NULL No value that means null
4 NULL Data says NoIdea that's gonna be null
5 false Value is FALSE
Time taken: 0.049 seconds
(5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
(5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
1 true Value is True
2 NULL Data says null and must be null
3 NULL No value that means null
4 NULL Data says NoIdea that's gonna be null
5 false Value is FALSE
(5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
SQL> select dt4_id, dt4_testbool from dt4_boolean;
DT4_ID DT4_TESTBOOL
---------- ------------
1 true
2 false
3 false
4 false
5 false
...instead of producing this (correct) output...
SQL> select dt4_id, dt4_testbool from dt4_boolean;
DT4_ID DT4_TESTBOOL
---------- ------------
1 true
2 NULL
3 NULL
4 NULL
5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-1863:
---------------------------------
Component/s: (was: Drivers)
JDBC
Fix Version/s: 0.7.0
> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-1863
> URL: https://issues.apache.org/jira/browse/HIVE-1863
> Project: Hive
> Issue Type: Bug
> Components: JDBC
> Affects Versions: 0.5.0
> Reporter: Guy le Mar
> Assignee: Edward Capriolo
> Fix For: 0.7.0
>
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
> dt4_id int,
> dt4_testbool boolean,
> dt4_string string
> )
> row format delimited
> fields terminated by ','
> lines terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 false
> 3 false
> 4 false
> 5 false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 NULL
> 3 NULL
> 4 NULL
> 5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edward Capriolo reassigned HIVE-1863:
-------------------------------------
Assignee: Edward Capriolo
> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-1863
> URL: https://issues.apache.org/jira/browse/HIVE-1863
> Project: Hive
> Issue Type: Bug
> Components: Drivers
> Affects Versions: 0.5.0
> Reporter: Guy le Mar
> Assignee: Edward Capriolo
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
> dt4_id int,
> dt4_testbool boolean,
> dt4_string string
> )
> row format delimited
> fields terminated by ','
> lines terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 false
> 3 false
> 4 false
> 5 false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 NULL
> 3 NULL
> 4 NULL
> 5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach resolved HIVE-1863.
----------------------------------
Resolution: Duplicate
> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-1863
> URL: https://issues.apache.org/jira/browse/HIVE-1863
> Project: Hive
> Issue Type: Bug
> Components: JDBC
> Affects Versions: 0.5.0
> Reporter: Guy le Mar
> Assignee: Edward Capriolo
> Fix For: 0.7.0
>
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
> dt4_id int,
> dt4_testbool boolean,
> dt4_string string
> )
> row format delimited
> fields terminated by ','
> lines terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 false
> 3 false
> 4 false
> 5 false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 NULL
> 3 NULL
> 4 NULL
> 5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974701#action_12974701 ]
Edward Capriolo commented on HIVE-1863:
---------------------------------------
Right. Something gets "lost in translation" here because Hive Service is using the ArrayList<String> that comes from driver.getResults() which is designed for sending output to the final console not another serde. We should be able to re-write string and place the nulls back in carefully, however a better solution might exists.
> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-1863
> URL: https://issues.apache.org/jira/browse/HIVE-1863
> Project: Hive
> Issue Type: Bug
> Components: Drivers
> Affects Versions: 0.5.0
> Reporter: Guy le Mar
> Assignee: Edward Capriolo
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
> dt4_id int,
> dt4_testbool boolean,
> dt4_string string
> )
> row format delimited
> fields terminated by ','
> lines terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 false
> 3 false
> 4 false
> 5 false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 NULL
> 3 NULL
> 4 NULL
> 5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1863) Boolean columns in Hive tables
containing NULL are treated as FALSE by the Hive JDBC driver.
Posted by "Steven Wong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974728#action_12974728 ]
Steven Wong commented on HIVE-1863:
-----------------------------------
This may have been fixed by HIVE-1378 in trunk (0.7). Please give that a try.
> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-1863
> URL: https://issues.apache.org/jira/browse/HIVE-1863
> Project: Hive
> Issue Type: Bug
> Components: Drivers
> Affects Versions: 0.5.0
> Reporter: Guy le Mar
> Assignee: Edward Capriolo
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
> dt4_id int,
> dt4_testbool boolean,
> dt4_string string
> )
> row format delimited
> fields terminated by ','
> lines terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1 true Value is True
> 2 NULL Data says null and must be null
> 3 NULL No value that means null
> 4 NULL Data says NoIdea that's gonna be null
> 5 false Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 false
> 3 false
> 4 false
> 5 false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
> DT4_ID DT4_TESTBOOL
> ---------- ------------
> 1 true
> 2 NULL
> 3 NULL
> 4 NULL
> 5 false
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.