Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/10/31 22:00:00 UTC
[jira] [Commented] (HIVE-16952)
AcidUtils.parseBaseOrDeltaBucketFilename() end clause
[ https://issues.apache.org/jira/browse/HIVE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227674#comment-16227674 ]
Eugene Koifman commented on HIVE-16952:
---------------------------------------
Note that Load Data can simply move files with arbitrary names into the table namespace.
So a non-acid to acid conversion (unbucketed) may see files with non-standard names.
So the "-1" may be needed to send all such files to a single logical bucket, so that rows are numbered correctly when reading "original" files.
We could also hash the filename (for any file that maps to -1) and take it mod N to send such files to different logical buckets, so that the 1st compaction doesn't have a lopsided split.
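A minimal sketch of the hash-and-mod idea, to make the suggestion concrete. This is not existing Hive code: the class name, the method logicalBucketFor, and the numBuckets parameter (standing in for N) are hypothetical, and the choice of String.hashCode() as the hash is only an assumption for illustration.

```java
public class LogicalBucketSketch {

    /**
     * Hypothetical helper (not part of AcidUtils): maps a file whose name
     * does not parse to a bucket id (i.e. would otherwise get -1) to a
     * logical bucket in the range [0, numBuckets).
     */
    static int logicalBucketFor(String fileName, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive: " + numBuckets);
        }
        // String.hashCode() is deterministic, so the same file name always
        // lands in the same logical bucket. floorMod keeps the result
        // non-negative even when the hash code is negative.
        return Math.floorMod(fileName.hashCode(), numBuckets);
    }

    public static void main(String[] args) {
        // Example names such as Load Data might leave behind (made up here).
        String[] names = {"data.csv", "export_2017_10.txt", "foo", "bar"};
        int numBuckets = 4; // assumed value of N
        for (String name : names) {
            System.out.println(name + " -> logical bucket " + logicalBucketFor(name, numBuckets));
        }
    }
}
```

The mapping depends only on the file name, so a reader and a later compaction would agree on the logical bucket as long as the file is not renamed. Whether that is sufficient for row numbering of "original" files would still need to be checked against the reader.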
> AcidUtils.parseBaseOrDeltaBucketFilename() end clause
> -----------------------------------------------------
>
> Key: HIVE-16952
> URL: https://issues.apache.org/jira/browse/HIVE-16952
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Minor
>
> The end of this method
> {noformat}
> } else {
> result.setOldStyle(true).bucket(-1).minimumTransactionId(0)
> .maximumTransactionId(0);
> }
> {noformat}
> Should this throw instead? bucket == -1 can't be handled by anything in OrcRawRecordMerger or anywhere else.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)