You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/04/23 11:57:41 UTC

[GitHub] [hive] kgyrtkirk commented on a change in pull request #992: HIVE-23269: Unsafe comparing bigints and (var)chars

kgyrtkirk commented on a change in pull request #992:
URL: https://github.com/apache/hive/pull/992#discussion_r413748151



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/parse/type/TestTypeCheckProcFactory.java
##########
@@ -140,4 +147,37 @@ public void testWithNonZeroFraction() throws Exception {
     }
   }
 
+  @Test
+  public void testValidateUDFOnTypeCheck() throws Exception {

Review comment:
       * this is a "@Parameterized" testclass; please don't add testcases which are not use the parameterized nature... move these to a new testclass
   * it might make sense to split the testcases into separate methods

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java
##########
@@ -789,12 +791,25 @@ protected void validateUDF(ASTNode expr, boolean isFunction, TypeCheckCtx ctx, F
 
         LogHelper console = new LogHelper(LOG);
 
+        Set<PrimitiveObjectInspector.PrimitiveCategory> unsafeConventionTyps = Sets.newHashSet(
+            PrimitiveObjectInspector.PrimitiveCategory.STRING,
+            PrimitiveObjectInspector.PrimitiveCategory.VARCHAR,
+            PrimitiveObjectInspector.PrimitiveCategory.CHAR);
         // For now, if a bigint is going to be cast to a double throw an error or warning
-        if ((oiTypeInfo0.equals(TypeInfoFactory.stringTypeInfo) && oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) ||
-            (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo) && oiTypeInfo1.equals(TypeInfoFactory.stringTypeInfo))) {
+        if ((oiTypeInfo0 instanceof PrimitiveTypeInfo &&

Review comment:
       you could move the oiTypeInfo0 conditions into a method (along with the Set) and then reuse the method 2 lines down
   might make this more readable

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java
##########
@@ -789,12 +791,25 @@ protected void validateUDF(ASTNode expr, boolean isFunction, TypeCheckCtx ctx, F
 
         LogHelper console = new LogHelper(LOG);
 
+        Set<PrimitiveObjectInspector.PrimitiveCategory> unsafeConventionTyps = Sets.newHashSet(
+            PrimitiveObjectInspector.PrimitiveCategory.STRING,
+            PrimitiveObjectInspector.PrimitiveCategory.VARCHAR,
+            PrimitiveObjectInspector.PrimitiveCategory.CHAR);
         // For now, if a bigint is going to be cast to a double throw an error or warning
-        if ((oiTypeInfo0.equals(TypeInfoFactory.stringTypeInfo) && oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) ||
-            (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo) && oiTypeInfo1.equals(TypeInfoFactory.stringTypeInfo))) {
+        if ((oiTypeInfo0 instanceof PrimitiveTypeInfo &&
+            unsafeConventionTyps.contains(((PrimitiveTypeInfo)oiTypeInfo0).getPrimitiveCategory()) &&
+            oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) || (oiTypeInfo1 instanceof PrimitiveTypeInfo &&
+            unsafeConventionTyps.contains(((PrimitiveTypeInfo)oiTypeInfo1).getPrimitiveCategory()) &&
+            oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo))) {
           String error = StrictChecks.checkTypeSafety(conf);
-          if (error != null) throw new UDFArgumentException(error);
-          console.printError("WARNING: Comparing a bigint and a string may result in a loss of precision.");
+          if (error != null) {
+            throw new UDFArgumentException(error);
+          }
+          String type = oiTypeInfo0.getTypeName();
+          if (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo)) {
+            type = oiTypeInfo1.getTypeName();
+          }
+          console.printError("WARNING: Comparing a bigint and a " + type + " may result in a loss of precision.");

Review comment:
       I don't think the `type` variable is needed - you could just get both types when you are generating the warning messages.

##########
File path: ql/src/test/results/clientpositive/llap/unsafe_compare.q.out
##########
@@ -0,0 +1,40 @@
+PREHOOK: query: CREATE TABLE test_a (appid1 varchar(256),  appid2 char(20))
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@test_a
+POSTHOOK: query: CREATE TABLE test_a (appid1 varchar(256),  appid2 char(20))
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@test_a
+PREHOOK: query: INSERT INTO  test_a VALUES ('2882303761517473127', '2882303761517473127'), ('2882303761517473276','2882303761517473276')
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+PREHOOK: Output: default@test_a
+POSTHOOK: query: INSERT INTO  test_a VALUES ('2882303761517473127', '2882303761517473127'), ('2882303761517473276','2882303761517473276')
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+POSTHOOK: Output: default@test_a
+POSTHOOK: Lineage: test_a.appid1 SCRIPT []
+POSTHOOK: Lineage: test_a.appid2 SCRIPT []
+WARNING: Comparing a bigint and a varchar(256) may result in a loss of precision.
+PREHOOK: query: SELECT appid1 FROM test_a WHERE appid1 = 2882303761517473127
+PREHOOK: type: QUERY
+PREHOOK: Input: default@test_a
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT appid1 FROM test_a WHERE appid1 = 2882303761517473127
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@test_a
+#### A masked pattern was here ####
+2882303761517473127
+2882303761517473276

Review comment:
       I think these both '2882303761517473127' and '2882303761517473276' should fit into char(20) / varchar(20)
   and because of that this result leaves me a bit puzzled; is this row expected with the `appid1 = 2882303761517473127` condition?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org