Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/06/01 17:16:00 UTC

[jira] [Work logged] (HIVE-25150) Tab characters are not removed before decimal conversion similar to space character which is fixed as part of HIVE-24378

     [ https://issues.apache.org/jira/browse/HIVE-25150?focusedWorklogId=604635&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604635 ]

ASF GitHub Bot logged work on HIVE-25150:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jun/21 17:15
            Start Date: 01/Jun/21 17:15
    Worklog Time Spent: 10m 
      Work Description: tarak271 commented on a change in pull request #2308:
URL: https://github.com/apache/hive/pull/2308#discussion_r643326657



##########
File path: storage-api/src/java/org/apache/hadoop/hive/common/type/FastHiveDecimalImpl.java
##########
@@ -273,7 +269,8 @@ public static boolean fastSetFromBytes(byte[] bytes, int offset, int length, boo
     int index = offset;
 
     if (trimBlanks) {
-      while (bytes[index] == BYTE_BLANK) {
+      //Character.isWhitespace handles both space and tab character

Review comment:
       @maheshk114 
   Added a new function that checks for the additional whitespace characters accepted by MySQL and PostgreSQL, such as HORIZONTAL_TABULATION, VERTICAL_TABULATION, FORM_FEED and SPACE_SEPARATOR.
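
   For illustration only, a minimal sketch of what such a check could look like. This is not the code from the pull request: the class name LeadingWhitespace and the methods isSkippable and skipLeading are hypothetical, and the set of skipped bytes (space, horizontal tab, vertical tab, form feed) is an assumption based on the characters named in this comment. The bounds check in the loop is also an addition of the sketch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the code from the pull request.
final class LeadingWhitespace {

  // Assumed set of skippable bytes: space, horizontal tab,
  // vertical tab (0x0B) and form feed.
  static boolean isSkippable(byte b) {
    return b == ' ' || b == '\t' || b == 0x0B || b == '\f';
  }

  // Returns the index of the first non-skippable byte in
  // [offset, offset + length), or offset + length if every byte
  // in the range is skippable.
  static int skipLeading(byte[] bytes, int offset, int length) {
    int end = offset + length;
    int index = offset;
    while (index < end && isSkippable(bytes[index])) {
      index++;
    }
    return index;
  }

  public static void main(String[] args) {
    // A field with a leading tab, as in the third row of the test file.
    byte[] field = "\t2".getBytes(StandardCharsets.US_ASCII);
    int start = skipLeading(field, 0, field.length);
    System.out.println(
        new String(field, start, field.length - start, StandardCharsets.US_ASCII));
  }
}
```

   Checking an explicit list of bytes keeps the set of accepted characters visible at the call site; Character.isWhitespace, mentioned in the diff comment, accepts a wider set of characters.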




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 604635)
    Time Spent: 50m  (was: 40m)

> Tab characters are not removed before decimal conversion similar to space character which is fixed as part of HIVE-24378
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25150
>                 URL: https://issues.apache.org/jira/browse/HIVE-25150
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: Taraka Rama Rao Lethavadla
>            Assignee: Taraka Rama Rao Lethavadla
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Test case:
>  column values with a leading space and a leading tab character
> {noformat}
> bash-4.2$ cat data/files/test_dec_space.csv
> 1,0
> 2, 1
> 3,	2{noformat}
> {noformat}
> create external table test_dec_space (id int, value decimal) ROW FORMAT DELIMITED
>  FIELDS TERMINATED BY ',' location '/tmp/test_dec_space';
> {noformat}
> The output of select * from test_dec_space is:
> {noformat}
> 1	0
> 2	1
> 3	NULL{noformat}
> The behaviour in MySQL when decimal values contain tab and space characters:
> {noformat}
> bash-4.2$ cat /tmp/insert.csv 
> "1","aa",11.88
> "2","bb", 99.88
> "4","dd",	209.88{noformat}
>  
> {noformat}
> MariaDB [test]> load data local infile '/tmp/insert.csv' into table t2 fields terminated by ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
>  Query OK, 3 rows affected, 3 warnings (0.00 sec) 
>  Records: 3 Deleted: 0 Skipped: 0 Warnings: 3
> MariaDB [test]> select * from t2;
> +------+------+-------+
> | id   | name | score |
> +------+------+-------+
> | 1    | aa   |    12 |
> | 2    | bb   |   100 |
> | 4    | dd   |   210 |
> +------+------+-------+
>  3 rows in set (0.00 sec)
> {noformat}
> So Hive can be made to behave the same way by skipping tab characters before the decimal conversion.
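
The effect described in the test case can be reproduced outside Hive with a small standalone sketch. It does not use Hive's decimal parser: the class DecimalFieldDemo and its parse method are hypothetical, and java.math.BigDecimal stands in for the conversion, with a null result playing the role of NULL. The second argument lists the leading characters to skip, so " " models skipping spaces only and " \t" models skipping tabs as well.

```java
import java.math.BigDecimal;

// Hypothetical demo for illustration; it does not call any Hive code.
final class DecimalFieldDemo {

  // Skips leading characters that occur in `skippable`, then parses the
  // remainder as a decimal. Returns null when parsing fails.
  static BigDecimal parse(String field, String skippable) {
    int i = 0;
    while (i < field.length() && skippable.indexOf(field.charAt(i)) >= 0) {
      i++;
    }
    try {
      return new BigDecimal(field.substring(i));
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // Value fields of the three rows in test_dec_space.csv.
    String[] values = {"0", " 1", "\t2"};
    for (String v : values) {
      System.out.println(
          "spaces only: " + parse(v, " ") + ", spaces and tabs: " + parse(v, " \t"));
    }
  }
}
```

When only spaces are skipped, the third value fails to parse because the remainder still starts with a tab, which corresponds to the NULL in the output above.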



--
This message was sent by Atlassian Jira
(v8.3.4#803005)