Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2021/09/04 11:46:13 UTC

[Impala-ASF-CR] IMPALA-9662,IMPALA-2019(part-3): Support UTF-8 mode in mask functions

Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17780 )

Change subject: IMPALA-9662,IMPALA-2019(part-3): Support UTF-8 mode in mask functions
......................................................................


Patch Set 4:

> Patch Set 4:
> 
> (6 comments)
> 
> Looks great. My only concern is whether we want to mask only ASCII characters, or Unicode characters as well. For the former, a simpler algorithm exists for the method MastSubStrUtf8().
> 
> We may need to decide the scope of the change (ASCII characters only, or Unicode characters in general) first.

Sorry, I may be misunderstanding this. Masking only ASCII characters is already supported in non-UTF-8 mode. However, Hive always masks Unicode characters (as described in IMPALA-9662), so this patch aims to add support for masking Unicode characters.

Or do you mean whether we should support using Unicode characters as the masking (replacement) characters?
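
To illustrate the distinction between the two modes, here is a rough Python sketch (not the actual Impala code). The function names mask_bytes and mask_code_points are hypothetical, and the replacement characters 'X', 'x' and 'n' are assumed from the defaults of Hive's mask() function. The first function handles the input byte by byte and only masks ASCII letters and digits; the second handles it code point by code point, so non-ASCII letters get masked too.

```python
def mask_bytes(data: bytes) -> bytes:
    """Byte-wise masking (hypothetical): only ASCII letters/digits are replaced."""
    out = bytearray()
    for b in data:
        ch = chr(b)
        if b < 0x80 and ch.isupper():
            out += b"X"
        elif b < 0x80 and ch.islower():
            out += b"x"
        elif b < 0x80 and ch.isdigit():
            out += b"n"
        else:
            # Bytes of multi-byte UTF-8 sequences pass through unmasked.
            out.append(b)
    return bytes(out)


def mask_code_points(text: str) -> str:
    """Code-point-wise masking (hypothetical): Unicode letters/digits are replaced."""
    out = []
    for ch in text:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("n")
        else:
            out.append(ch)
    return "".join(out)


sample = "Émile42"
print(mask_bytes(sample.encode("utf-8")).decode("utf-8"))  # Éxxxxnn
print(mask_code_points(sample))                            # Xxxxxnn
```

In the byte-wise version the leading "É" survives unmasked because its two UTF-8 bytes are both outside the ASCII range; in the code-point version it is treated as one uppercase letter and masked.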


-- 
To view, visit http://gerrit.cloudera.org:8080/17780

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1276eccc94c9528507349b155a51e76f338367d5
Gerrit-Change-Number: 17780
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sat, 04 Sep 2021 11:46:13 +0000
Gerrit-HasComments: No