You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/01/09 08:49:00 UTC

[jira] [Commented] (HIVE-22711) yearValue of UDF mask() should not start from 1900

    [ https://issues.apache.org/jira/browse/HIVE-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17011559#comment-17011559 ] 

Quanlong Huang commented on HIVE-22711:
---------------------------------------

cc [~madhan]

> yearValue of UDF mask() should not start from 1900
> --------------------------------------------------
>
>                 Key: HIVE-22711
>                 URL: https://issues.apache.org/jira/browse/HIVE-22711
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Priority: Major
>
> Here's the description of the UDF mask():
> {code:java}
> masks the given value  
> Examples:                  
>    mask(ccn)               
>    mask(ccn, 'X', 'x', '0')
>    mask(ccn, 'x', 'x', 'x')
>  Arguments:                
>    mask(value, upperChar, lowerChar, digitChar, otherChar, numberChar, dayValue, monthValue, yearValue) 
>      value      - value to mask. Supported types: TINYINT, SMALLINT, INT, BIGINT, STRING, VARCHAR, CHAR, DATE 
>      upperChar  - character to replace upper-case characters with. Specify -1 to retain original character. Default value: 'X' 
>      lowerChar  - character to replace lower-case characters with. Specify -1 to retain original character. Default value: 'x' 
>      digitChar  - character to replace digit characters with. Specify -1 to retain original character. Default value: 'n' 
>      otherChar  - character to replace all other characters with. Specify -1 to retain original character. Default value: -1 
>      numberChar - character to replace digits in a number with. Valid values: 0-9. Default value: '1' 
>      dayValue   - value to replace day field in a date with.  Specify -1 to retain original value. Valid values: 1-31. Default value: 1 
>      monthValue - value to replace month field in a date with. Specify -1 to retain original value. Valid values: 0-11. Default value: 0 
>      yearValue  - value to replace year field in a date with. Specify -1 to retain original value. Default value: 0 {code}
> Although it says 'yearValue' is the value to replace year field in a DATE with, it actually counts start at 1900. E.g. yearValue = 0 means masking the year field to 1900, yearValue=2000 means masking the year field to 3900, yearValue=-2 means masking the year field to 1988.
> Here are some query examples:
> {code:sql}
> beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, 0);
> 1900-02-03
> beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, 2000);
> 3900-02-03
> beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, -2);
> 1898-02-03
> beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, -100);
> 1800-02-03
> {code}
> The drawback of this behavior is that we can't mask year field to be 1899, since -1 already means retaining the original value.
> It'd be better to change the behavior to be intuitive that simply masking year filed to yearValue. And only accept yearValue from 0 to 9999. Still use -1 to retain original value. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)