You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/09/27 23:01:29 UTC

[GitHub] [hive] jfsii opened a new pull request, #3628: [WIP] HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

jfsii opened a new pull request, #3628:
URL: https://github.com/apache/hive/pull/3628

   
   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://cwiki.apache.org/confluence/display/Hive/HowToContribute
     2. Ensure that you have created an issue on the Hive project JIRA: https://issues.apache.org/jira/projects/HIVE/summary
     3. Ensure you have added or run the appropriate tests for your PR: 
     4. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]HIVE-XXXXX:  Your PR title ...'.
     5. Be sure to keep the PR description updated to reflect all changes.
     6. Please write your PR title to summarize what this PR proposes.
     7. If possible, provide a concise example to reproduce the issue for a faster review.
   
   -->
   
   ### What changes were proposed in this pull request? 
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description, screenshot and/or a reproducable example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Hive versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] amansinha100 commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
amansinha100 commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984120820


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {

Review Comment:
   I like the revised approach which does not do the extra allocation. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1262747235

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kasakrisz commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
kasakrisz commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r982958033


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {
+    if (type.getTypeName().toLowerCase().startsWith(serdeConstants.CHAR_TYPE_NAME) ||
+        type.getTypeName().toLowerCase().startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+      return true;
+    }
+
+    if (type.getCategory().equals(Category.STRUCT)) {
+      StructTypeInfo sti = (StructTypeInfo) type;
+      for (TypeInfo t : sti.getAllStructFieldTypeInfos()) {
+        if (needsConversion(t)) {
+          return true;
+        }
+      }
+    }
+
+    if (type.getCategory().equals(Category.MAP)) {
+      TypeInfo keyTypeInfo = ((MapTypeInfo) type).getMapKeyTypeInfo();
+      if (needsConversion(keyTypeInfo)) {
+        return true;
+      }
+      TypeInfo valueTypeInfo = ((MapTypeInfo) type).getMapKeyTypeInfo();

Review Comment:
   Both calls are `getMapKeyTypeInfo`. Should one of these be `getMapValueTypeInfo`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1262427770

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984714917


##########
ql/src/test/queries/clientpositive/pointlookup.q:
##########
@@ -127,3 +127,39 @@ or inOutputOpt.key = null;
 drop table orOutput;
 drop table inOutput;
 drop table inOutputOpt;
+
+-- test case(s) for HIVE-26320 for ORC
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+CREATE EXTERNAL TABLE hive2623_orc(kob varchar(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_orc VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_orc;
+DROP TABLE hive2623_orc;
+
+CREATE EXTERNAL TABLE hive2623_char_orc(kob char(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_char_orc VALUES('B',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='B ' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_char_orc;
+DROP TABLE hive2623_char_orc;
+
+-- test case(s) for HIVE-26320 for PARQUET
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+
+CREATE EXTERNAL TABLE hive2623_parq(kob varchar(2), enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_parq VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_parq;
+DROP TABLE hive2623_parq;
+
+CREATE EXTERNAL TABLE hive2623_char_parq(kob char(2), enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_char_parq VALUES('B ',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='B ' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_char_parq;
+DROP TABLE hive2623_char_parq;
+
+CREATE EXTERNAL TABLE hive2623_int_parq(kob int, enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_int_parq VALUES(2,18),(23,18),(12,18);
+SELECT CASE WHEN ((kob=2 AND enhanced_type_code=18) OR (kob=23  AND enhanced_type_code=18)) THEN 1 ELSE 0 END AS logic_check FROM hive2623_int_parq;
+DROP TABLE hive2623_int_parq;

Review Comment:
   Added the suggested case in udf_in.q



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: [WIP] HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1260187363

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [10 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984714606


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java:
##########
@@ -481,12 +485,36 @@ protected BytesWritable convert(Binary binary) {
   },
   ESTRING_CONVERTER(String.class) {
     @Override
-    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent, TypeInfo hiveTypeInfo) {
+    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent,
+        TypeInfo hiveTypeInfo) {
+      // If we have type information, we should return properly typed strings. However, there are a variety
+      // of code paths that do not provide the typeInfo in those cases we default to Text. This idiom is also
+      // followed by for example the BigDecimal converter in which if there is no type information,
+      // it defaults to the widest representation
+      if (hiveTypeInfo != null) {
+        String typeName = hiveTypeInfo.getTypeName().toLowerCase();
+        if (typeName.startsWith(serdeConstants.CHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveCharWritable>(type, parent, index) {
+            @Override
+              protected HiveCharWritable convert(Binary binary) {
+                return new HiveCharWritable(binary.getBytes(), ((CharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        } else if (typeName.startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveVarcharWritable>(type, parent, index) {
+            @Override
+              protected HiveVarcharWritable convert(Binary binary) {
+                return new HiveVarcharWritable(binary.getBytes(), ((VarcharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        }
+      }

Review Comment:
   Yeah, I did follow the file on its convention. However, I changed it to your suggestion because it looks cleaner and might as well encourage this still in case someone copies me



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1264413029

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [7 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1261417739

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1261640444

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
zabetak commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984763146


##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,61 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
-        .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable = getWritableFromBinaryConverter(new VarcharTypeInfo(), primitiveType,
-        Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    Text textWritable = (Text) getWritableFromBinaryConverter(stringTypeInfo, primitiveType,
+            Binary.fromString(value));
+    assertEquals(value, textWritable.toString());
   }
 
   @Test
-  public void testGetTextConverterNoHiveTypeInfo() throws Exception {
+  public void testGetTextConverterForCharPadsValueWithSpacesTillLen() {
+    PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    HiveCharWritable textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(value.length() + 2), primitiveType, Binary.fromString(value));
+    assertEquals(value + "  ", textWritable.toString());
+  }
+
+  public void testGetTextConverterForCharTruncatesValueExceedingLen() {

Review Comment:
   Probably the `@Test` annotations needs to be added here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1264368217

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [7 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1262981564

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1264024651

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [7 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1264233942

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [7 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
zabetak commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984418472


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java:
##########
@@ -481,12 +485,36 @@ protected BytesWritable convert(Binary binary) {
   },
   ESTRING_CONVERTER(String.class) {
     @Override
-    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent, TypeInfo hiveTypeInfo) {
+    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent,
+        TypeInfo hiveTypeInfo) {
+      // If we have type information, we should return properly typed strings. However, there are a variety
+      // of code paths that do not provide the typeInfo in those cases we default to Text. This idiom is also
+      // followed by for example the BigDecimal converter in which if there is no type information,
+      // it defaults to the widest representation
+      if (hiveTypeInfo != null) {
+        String typeName = hiveTypeInfo.getTypeName().toLowerCase();
+        if (typeName.startsWith(serdeConstants.CHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveCharWritable>(type, parent, index) {
+            @Override
+              protected HiveCharWritable convert(Binary binary) {
+                return new HiveCharWritable(binary.getBytes(), ((CharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        } else if (typeName.startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveVarcharWritable>(type, parent, index) {
+            @Override
+              protected HiveVarcharWritable convert(Binary binary) {
+                return new HiveVarcharWritable(binary.getBytes(), ((VarcharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        }
+      }

Review Comment:
   I think the most common way to do something based on a primitive type is to use its category. It seems robust and more efficient.
   ```java
   if (hiveTypeInfo instanceof PrimitiveTypeInfo) {
           PrimitiveTypeInfo t = (PrimitiveTypeInfo) hiveTypeInfo;
           switch (t.getPrimitiveCategory()) {
           case VARCHAR:
             break;
           case CHAR:
             break;
           }
         }
   ```
   I see that in this class they rely much on the name and serdeConstants so I am also OK with the current code as it is.



##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,59 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
-        .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable = getWritableFromBinaryConverter(new VarcharTypeInfo(), primitiveType,
-        Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    Text textWritable = (Text) getWritableFromBinaryConverter(stringTypeInfo, primitiveType,
+            Binary.fromString(value));
+    // we should get what we put in
+    assertEquals(value, textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForChar() throws Exception {
+    PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+
+    HiveCharWritable textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(value.length() + 2), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should be padded)
+    assertEquals(value + "  ", textWritable.toString());
+
+    textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(6), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should truncate)
+    assertEquals(value.substring(0, 6), textWritable.toString());

Review Comment:
   nit. The test could be easily split in two and a proper name could avoid the need for extra comments
   * testGetTextConverterForCharPadsValueWithSpacesTillLen
   * testGetTextConverterForCharTruncatesValueExceedingLen



##########
ql/src/test/queries/clientpositive/pointlookup.q:
##########
@@ -127,3 +127,39 @@ or inOutputOpt.key = null;
 drop table orOutput;
 drop table inOutput;
 drop table inOutputOpt;
+
+-- test case(s) for HIVE-26320 for ORC
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+CREATE EXTERNAL TABLE hive2623_orc(kob varchar(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_orc VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_orc;
+DROP TABLE hive2623_orc;
+
+CREATE EXTERNAL TABLE hive2623_char_orc(kob char(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_char_orc VALUES('B',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='B ' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_char_orc;
+DROP TABLE hive2623_char_orc;
+
+-- test case(s) for HIVE-26320 for PARQUET
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;

Review Comment:
   This is already set above no need to set it twice.



##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java:
##########
@@ -481,12 +485,36 @@ protected BytesWritable convert(Binary binary) {
   },
   ESTRING_CONVERTER(String.class) {
     @Override
-    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent, TypeInfo hiveTypeInfo) {
+    PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent,
+        TypeInfo hiveTypeInfo) {
+      // If we have type information, we should return properly typed strings. However, there are a variety
+      // of code paths that do not provide the typeInfo in those cases we default to Text. This idiom is also
+      // followed by for example the BigDecimal converter in which if there is no type information,
+      // it defaults to the widest representation
+      if (hiveTypeInfo != null) {
+        String typeName = hiveTypeInfo.getTypeName().toLowerCase();
+        if (typeName.startsWith(serdeConstants.CHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveCharWritable>(type, parent, index) {
+            @Override
+              protected HiveCharWritable convert(Binary binary) {
+                return new HiveCharWritable(binary.getBytes(), ((CharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        } else if (typeName.startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+          return new BinaryConverter<HiveVarcharWritable>(type, parent, index) {
+            @Override
+              protected HiveVarcharWritable convert(Binary binary) {
+                return new HiveVarcharWritable(binary.getBytes(), ((VarcharTypeInfo) hiveTypeInfo).getLength());
+              }
+          };
+        }
+      }
+      // STRING type
       return new BinaryConverter<Text>(type, parent, index) {
         @Override
-        protected Text convert(Binary binary) {
-          return new Text(binary.getBytes());
-        }
+          protected Text convert(Binary binary) {
+            return new Text(binary.getBytes());
+          }

Review Comment:
   nit: broken indentation



##########
ql/src/test/queries/clientpositive/pointlookup.q:
##########
@@ -127,3 +127,39 @@ or inOutputOpt.key = null;
 drop table orOutput;
 drop table inOutput;
 drop table inOutputOpt;
+
+-- test case(s) for HIVE-26320 for ORC
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+CREATE EXTERNAL TABLE hive2623_orc(kob varchar(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_orc VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_orc;
+DROP TABLE hive2623_orc;
+
+CREATE EXTERNAL TABLE hive2623_char_orc(kob char(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_char_orc VALUES('B',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='B ' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_char_orc;
+DROP TABLE hive2623_char_orc;
+
+-- test case(s) for HIVE-26320 for PARQUET
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+
+CREATE EXTERNAL TABLE hive2623_parq(kob varchar(2), enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_parq VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_parq;
+DROP TABLE hive2623_parq;
+
+CREATE EXTERNAL TABLE hive2623_char_parq(kob char(2), enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_char_parq VALUES('B ',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='B ' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_char_parq;
+DROP TABLE hive2623_char_parq;
+
+CREATE EXTERNAL TABLE hive2623_int_parq(kob int, enhanced_type_code int) STORED AS PARQUET;
+INSERT INTO hive2623_int_parq VALUES(2,18),(23,18),(12,18);
+SELECT CASE WHEN ((kob=2 AND enhanced_type_code=18) OR (kob=23  AND enhanced_type_code=18)) THEN 1 ELSE 0 END AS logic_check FROM hive2623_int_parq;
+DROP TABLE hive2623_int_parq;

Review Comment:
   The bug was not really in pointlookup optimization but in the IN function combined with structs and Parquet format. We can certainly leave a test here with `CASE WHEN` since it was reported like this but it may be a good idea to add explicitly a test case using the IN function in udf_in.q  (or elsewhere).
   
   Something like:
   ```sql
   SELECT (kob, code) IN (('BB', 18), ('BC', 18)) FROM parq;
   ```



##########
ql/src/test/queries/clientpositive/pointlookup.q:
##########
@@ -127,3 +127,39 @@ or inOutputOpt.key = null;
 drop table orOutput;
 drop table inOutput;
 drop table inOutputOpt;
+
+-- test case(s) for HIVE-26320 for ORC
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+CREATE EXTERNAL TABLE hive2623_orc(kob varchar(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_orc VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check
+FROM hive2623_orc;

Review Comment:
   `hive2623` vs `hive26230` intentional or typo? same goes for all table names.



##########
serde/src/java/org/apache/hadoop/hive/serde2/io/HiveBaseCharWritable.java:
##########
@@ -26,10 +26,19 @@
 import org.apache.hive.common.util.HiveStringUtils;
 
 public abstract class HiveBaseCharWritable {
-  protected Text value = new Text();
+  protected Text value;
   protected int charLength = -1;
 
   public HiveBaseCharWritable() {
+    value = new Text();
+  }
+
+  public HiveBaseCharWritable(Text text) {
+    value = text;
+  }

Review Comment:
   Is this constructor used by anyone?



##########
ql/src/test/queries/clientpositive/pointlookup.q:
##########
@@ -127,3 +127,39 @@ or inOutputOpt.key = null;
 drop table orOutput;
 drop table inOutput;
 drop table inOutputOpt;
+
+-- test case(s) for HIVE-26320 for ORC
+SET hive.optimize.point.lookup=true;
+SET hive.optimize.point.lookup.min=2;
+CREATE EXTERNAL TABLE hive2623_orc(kob varchar(2), enhanced_type_code int) STORED AS ORC;
+INSERT INTO hive2623_orc VALUES('BB',18),('BC',18),('AB',18);
+SELECT CASE WHEN ((kob='BB' AND enhanced_type_code='18') OR (kob='BC' AND enhanced_type_code='18')) THEN 1 ELSE 0 END AS logic_check

Review Comment:
   I noticed that in this and some of the queries below we are using a character literal (`'18'`) instead of numeric literal (`18`). I am not sure what's the intent here but better avoid making the test more complicated than necessary.



##########
serde/src/java/org/apache/hadoop/hive/serde2/io/HiveBaseCharWritable.java:
##########
@@ -26,10 +26,19 @@
 import org.apache.hive.common.util.HiveStringUtils;
 
 public abstract class HiveBaseCharWritable {
-  protected Text value = new Text();
+  protected Text value;
   protected int charLength = -1;
 
   public HiveBaseCharWritable() {
+    value = new Text();
+  }
+
+  public HiveBaseCharWritable(Text text) {
+    value = text;
+  }
+
+  public HiveBaseCharWritable(byte[] bytes) {
+    value = new Text(bytes);

Review Comment:
   We could add an additional `set` method in sublcasses and no touch at all this class (fewer changes in total). Is there a specific reason for using the constructor approach?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] amansinha100 commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
amansinha100 commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r983060737


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {

Review Comment:
   Thinking about the number of times needsConversion() and the subsequent convert() would be called for a table scan that is reading say N parquet files each with m row groups.  Such operations on a per file or per row group would be ok but if done on per row basis, it would be a perf hit. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r983635069


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {
+    if (type.getTypeName().toLowerCase().startsWith(serdeConstants.CHAR_TYPE_NAME) ||
+        type.getTypeName().toLowerCase().startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {

Review Comment:
   Thanks for the comment @mdayakar ! I've change my approach



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak closed pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
zabetak closed pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately
URL: https://github.com/apache/hive/pull/3628


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: [WIP] HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1261090118

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [6 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r983634554


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {

Review Comment:
   Yes - I agree it could be a perf hit here. Though it wouldn't surprise me if the perf difference is small since these paths already do many allocations per row.
   
   BUT - I did change my approach with what I think is a much more appropriate solution that hopefully avoids the extra allocation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r982970284


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {
+    if (type.getTypeName().toLowerCase().startsWith(serdeConstants.CHAR_TYPE_NAME) ||
+        type.getTypeName().toLowerCase().startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+      return true;
+    }
+
+    if (type.getCategory().equals(Category.STRUCT)) {
+      StructTypeInfo sti = (StructTypeInfo) type;
+      for (TypeInfo t : sti.getAllStructFieldTypeInfos()) {
+        if (needsConversion(t)) {
+          return true;
+        }
+      }
+    }
+
+    if (type.getCategory().equals(Category.MAP)) {
+      TypeInfo keyTypeInfo = ((MapTypeInfo) type).getMapKeyTypeInfo();
+      if (needsConversion(keyTypeInfo)) {
+        return true;
+      }
+      TypeInfo valueTypeInfo = ((MapTypeInfo) type).getMapKeyTypeInfo();

Review Comment:
   That is correct. Thanks for catching this @kasakrisz 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] mdayakar commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
mdayakar commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r983086619


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -91,11 +93,61 @@ public class ParquetHiveSerDe extends AbstractSerDe implements SchemaInference {
 
   private ObjectInspector objInspector;
   private ParquetHiveRecord parquetRow;
+  private ObjectInspectorConverters.Converter converter;
 
   public ParquetHiveSerDe() {
     parquetRow = new ParquetHiveRecord();
   }
 
+  // Recursively check if CHAR or VARCHAR types are used
+  private boolean needsConversion(TypeInfo type) {
+    if (type.getTypeName().toLowerCase().startsWith(serdeConstants.CHAR_TYPE_NAME) ||
+        type.getTypeName().toLowerCase().startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {

Review Comment:
   `type.getTypeName().toLowerCase()` value can be stored to a local variable and use the local variable instead of calculating the value again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] asolimando commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
asolimando commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984400218


##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,59 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
-        .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable = getWritableFromBinaryConverter(new VarcharTypeInfo(), primitiveType,
-        Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    Text textWritable = (Text) getWritableFromBinaryConverter(stringTypeInfo, primitiveType,
+            Binary.fromString(value));
+    // we should get what we put in
+    assertEquals(value, textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForChar() throws Exception {

Review Comment:
   Nit: same here, `throws Exception` is redundant
   
   ```suggestion
     public void testGetTextConverterForChar() {
   ```



##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,59 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
-        .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable = getWritableFromBinaryConverter(new VarcharTypeInfo(), primitiveType,
-        Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    Text textWritable = (Text) getWritableFromBinaryConverter(stringTypeInfo, primitiveType,
+            Binary.fromString(value));
+    // we should get what we put in
+    assertEquals(value, textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForChar() throws Exception {
+    PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+
+    HiveCharWritable textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(value.length() + 2), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should be padded)
+    assertEquals(value + "  ", textWritable.toString());
+
+    textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(6), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should truncate)
+    assertEquals(value.substring(0, 6), textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForVarchar() throws Exception {

Review Comment:
   ```suggestion
     public void testGetTextConverterForVarchar() {
   ```



##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,59 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {

Review Comment:
   Nit: the "throws Exception" bit is redundant and can be safely removed
   
   ```suggestion
     public void testGetTextConverterForString() {
   ```



##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java:
##########
@@ -212,23 +216,59 @@ public void testGetInt64NanosAdjustedToUTCTimestampConverter() throws Exception
   }
 
   @Test
-  public void testGetTextConverter() throws Exception {
+  public void testGetTextConverterForString() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
-        .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable = getWritableFromBinaryConverter(new VarcharTypeInfo(), primitiveType,
-        Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+    Text textWritable = (Text) getWritableFromBinaryConverter(stringTypeInfo, primitiveType,
+            Binary.fromString(value));
+    // we should get what we put in
+    assertEquals(value, textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForChar() throws Exception {
+    PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+
+    HiveCharWritable textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(value.length() + 2), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should be padded)
+    assertEquals(value + "  ", textWritable.toString());
+
+    textWritable = (HiveCharWritable) getWritableFromBinaryConverter(
+            new CharTypeInfo(6), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should truncate)
+    assertEquals(value.substring(0, 6), textWritable.toString());
+  }
+
+  @Test
+  public void testGetTextConverterForVarchar() throws Exception {
+    PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
+            .as(LogicalTypeAnnotation.stringType()).named("value");
+    String value = "this_is_a_value";
+
+    HiveVarcharWritable textWritable = (HiveVarcharWritable) getWritableFromBinaryConverter(
+            new VarcharTypeInfo(6), primitiveType, Binary.fromString(value));
+    // check that it enforces the length (it should truncate)
+    assertEquals(value.substring(0, 6), textWritable.toString());
+
+    textWritable = (HiveVarcharWritable) getWritableFromBinaryConverter(
+            new VarcharTypeInfo(34), primitiveType, Binary.fromString(value));
+    // check that it does not pad
+    assertEquals(value, textWritable.toString());
   }
 
   @Test
   public void testGetTextConverterNoHiveTypeInfo() throws Exception {
     PrimitiveType primitiveType = Types.optional(PrimitiveTypeName.BINARY)
         .as(LogicalTypeAnnotation.stringType()).named("value");
-    Writable writable =
-        getWritableFromBinaryConverter(null, primitiveType, Binary.fromString("this_is_a_value"));
-    Text textWritable = (Text) writable;
-    assertEquals("this_is_a_value", textWritable.toString());
+    String value = "this_is_a_value";

Review Comment:
   Nit: since you are modifying the method, you could remove the extra `throws Exception` here too :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3628:
URL: https://github.com/apache/hive/pull/3628#issuecomment-1264127867

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3628)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3628&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL) [7 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3628&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3628&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] jfsii commented on a diff in pull request #3628: HIVE-26320: Deserialize Parquet VARCHAR and CHAR types appropriately

Posted by GitBox <gi...@apache.org>.
jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984716423


##########
serde/src/java/org/apache/hadoop/hive/serde2/io/HiveBaseCharWritable.java:
##########
@@ -26,10 +26,19 @@
 import org.apache.hive.common.util.HiveStringUtils;
 
 public abstract class HiveBaseCharWritable {
-  protected Text value = new Text();
+  protected Text value;
   protected int charLength = -1;
 
   public HiveBaseCharWritable() {
+    value = new Text();
+  }
+
+  public HiveBaseCharWritable(Text text) {
+    value = text;
+  }
+
+  public HiveBaseCharWritable(byte[] bytes) {
+    value = new Text(bytes);

Review Comment:
   Likely no specific reason, it just happened to end up that way. I suppose that I didn't notice the set method on the Text object and wanted to avoid an extra allocation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org