Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/01/04 11:23:55 UTC

[GitHub] [hive] kgyrtkirk commented on a change in pull request #1810: [WIP] HIVE-24565: Implement standard trim function

kgyrtkirk commented on a change in pull request #1810:
URL: https://github.com/apache/hive/pull/1810#discussion_r551257267



##########
File path: ql/src/test/queries/clientpositive/udf_trim.q
##########
@@ -1,2 +1,20 @@
 DESCRIBE FUNCTION trim;
 DESCRIBE FUNCTION EXTENDED trim;
+
+SELECT '"' || trim('   tech   ') || '"';
+
+SELECT '"' || TRIM(' '  FROM  '   tech   ') || '"';
+
+SELECT '"' || TRIM(LEADING '0' FROM '000123') || '"';
+
+SELECT '"' || TRIM(TRAILING '1' FROM 'Tech1') || '"';
+
+SELECT '"' || TRIM(BOTH '1' FROM '123Tech111') || '"';
+
+SELECT '"' || ltrim('   tech   ') || '"', '"' || rtrim('   tech   ') || '"';
+
+SELECT '"' || lTRIM('0'  FROM  '000123') || '"', '"' || rTRIM('0'  FROM  '000123') || '"';
+
+SELECT trim('000123', '0');

Review comment:
       Could you also add some `null` cases, such as `trim(null,'x')` and `trim('x',null)`?
   
   I know they will probably work okay, but it's better to cover them.
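
To make the expected outcome of those cases concrete, here is a minimal Java sketch. It assumes SQL-style NULL propagation (a NULL in either argument yields NULL) and assumes the second argument is treated as a set of characters to strip; neither is confirmed by this patch. `TrimNullSketch` and its `trim` method are hypothetical names, not part of Hive.

```java
// Hypothetical sketch, not Hive's GenericUDFBaseTrim.
// Assumption: a NULL in either argument propagates to a NULL result.
final class TrimNullSketch {
    private TrimNullSketch() {}

    /**
     * Strips leading and trailing characters of value that occur in trimChars.
     * Works on Java chars, so supplementary code points are not handled specially.
     */
    static String trim(String value, String trimChars) {
        if (value == null || trimChars == null) {
            return null; // assumed NULL propagation
        }
        int start = 0;
        int end = value.length();
        // Advance past leading characters that belong to the trim set.
        while (start < end && trimChars.indexOf(value.charAt(start)) >= 0) {
            start++;
        }
        // Back up over trailing characters that belong to the trim set.
        while (end > start && trimChars.indexOf(value.charAt(end - 1)) >= 0) {
            end--;
        }
        return value.substring(start, end);
    }
}
```

If the patch handles a `null` trim-character argument differently, the `.q` output for the two suggested queries would document that.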

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseTrim.java
##########
@@ -68,11 +82,24 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     if (valObject == null) {
       return null;
     }
-    String val = ((Text) converter.convert(valObject)).toString();
+    String val = stringToTrimConverter.convert(valObject).toString();
     if (val == null) {
       return null;
     }
-    result.set(performOp(val.toString()));
+
+    String trimChars = " ";

Review comment:
       There are also some vectorized implementations (see `StringRTrim`, for example).
   
   The functionality is enhanced a bit here, so those other implementations should be updated as well (and possibly also covered with tests).
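
As a rough illustration of what a custom trim character means for a byte-oriented, batch-at-a-time implementation, here is a minimal Java sketch. It is not Hive's `StringRTrim` and does not use Hive's vector classes; `VectorizedRTrimSketch`, `rtrimBatch`, and the parallel-array layout are assumptions made for the example. It also assumes the trim character is a single ASCII byte, which is safe to compare bytewise because ASCII bytes never occur inside a multi-byte UTF-8 sequence.

```java
// Hypothetical sketch of a batch right-trim over UTF-8 byte ranges.
// Not Hive's StringRTrim; the array layout below is an assumption.
final class VectorizedRTrimSketch {
    private VectorizedRTrimSketch() {}

    /**
     * Right-trims each row in place by shrinking length[i]; no bytes are copied.
     * Rows whose values[i] is null are treated as NULL and left untouched.
     * trimByte must be an ASCII byte (0x00..0x7F).
     */
    static void rtrimBatch(byte[][] values, int[] start, int[] length, int size, byte trimByte) {
        for (int i = 0; i < size; i++) {
            byte[] bytes = values[i];
            if (bytes == null) {
                continue; // NULL row: nothing to trim
            }
            int end = start[i] + length[i];
            // Walk backwards while the last byte equals the trim byte.
            while (end > start[i] && bytes[end - 1] == trimByte) {
                end--;
            }
            length[i] = end - start[i];
        }
    }
}
```

A multi-character trim set, or a non-ASCII trim character, would need code-point-aware matching rather than this single-byte comparison.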
   

##########
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerParent.g
##########
@@ -373,6 +373,8 @@ KW_COST: 'COST';
 KW_JOINCOST: 'JOINCOST';
 KW_WITHIN: 'WITHIN';
 KW_PKFK_JOIN: 'PKFK_JOIN';
+KW_LEADING: 'LEADING';

Review comment:
       Do we need to fully reserve these keywords? If not, they could be added to the `nonReserved` rule in `IdentifiersParser.g`.

##########
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
##########
@@ -464,6 +473,7 @@ atomExpression
     | whenExpression
     | (subQueryExpression)=> (subQueryExpression)
         -> ^(TOK_SUBQUERY_EXPR TOK_SUBQUERY_OP subQueryExpression)
+    | (functionName LPAREN (leading=KW_LEADING | trailing=KW_TRAILING | KW_BOTH)? (trim_characters=selectExpression)? KW_FROM (str=selectExpression) RPAREN) => trimFunction

Review comment:
       This is a syntactic predicate expression; narrowing it down further will help for sure. But since `trimFunction` also matches the `functionName LPAREN` prefix, I think it should be placed in the `function` rule.



