You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/09/23 22:24:38 UTC

[GitHub] [hive] ayushtkn commented on a diff in pull request #3606: HIVE-26496: Populating bucketId in the constructor instead of parse m…

ayushtkn commented on code in PR #3606:
URL: https://github.com/apache/hive/pull/3606#discussion_r979098870


##########
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java:
##########
@@ -384,7 +384,6 @@ public void parse(Configuration conf, Path rootPath) throws IOException {
         OrcRawRecordMerger.TransactionMetaData.findWriteIDForSynthetcRowIDs(getPath(), rootPath, conf);
     writeId = tmd.syntheticWriteId;
     stmtId = tmd.statementId;
-    bucketId = AcidUtils.parseBucketId(getPath());

Review Comment:
   I am not very sure about this part of code. But just going through the code. the ``bucketId= AcidUtils.parseBucketId(getPath());``, the Path can change also if someone calls ``readFields``. It does this in the super class:
   ```
     public void readFields(DataInput in) throws IOException {
       file = new Path(Text.readString(in));
   ```
   the `file` is what returns from ``getPath()``, So, I am not sure if it is a good idea to just have that set in just the constructor.
   
   Moreover if any other change in Hadoop comes, which bothers the path in some other method post the construction invocation, again we will be prone to may be some silent errors.
   
   Second, there is a legacy protected constructor which doesn't have path and expect ``readFields`` to do the magic for it, if some thirdparty application/project is still using that by any means, that will get screwed, So, we have to remove that constructor also I feel.
   
   Would require someone more experienced to tell, how safe this is



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org