Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/10/31 22:00:00 UTC

[jira] [Commented] (HIVE-16952) AcidUtils.parseBaseOrDeltaBucketFilename() end clause

    [ https://issues.apache.org/jira/browse/HIVE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227674#comment-16227674 ] 

Eugene Koifman commented on HIVE-16952:
---------------------------------------

Note that Load Data can simply move files with arbitrary names into the table namespace,
so a non-acid to acid conversion (unbucketed) may see files with non-standard names.
The "-1" may therefore be needed to send all such files to a single logical bucket, so that rows are numbered correctly when reading "original" files.

Alternatively, we could hash the filename (for files that map to -1) and take it mod N to send such files to different logical buckets, so that the 1st compaction doesn't have a lopsided split.
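
A rough sketch of the hash-and-mod idea, not the actual AcidUtils code. The class name, the method names and the filename pattern are all hypothetical, and N is assumed here to be the number of logical buckets. Files whose names parse as a standard bucket file keep their bucket; anything else (which would otherwise get -1) is spread by hashing the name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical helper, not part of Hive. */
final class LogicalBucketSketch {
    // Assumed shape of a standard "original" bucket file name, e.g. 000001_0
    // or 000001_0_copy_1. The real pattern used by Hive may differ.
    private static final Pattern STANDARD_NAME =
        Pattern.compile("^([0-9]{1,9})_[0-9]+(_copy_[0-9]+)?$");

    /** Returns the bucket encoded in the name, or -1 for a non-standard name. */
    static int parseBucket(String filename) {
        Matcher m = STANDARD_NAME.matcher(filename);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    /**
     * Standard names keep their parsed bucket; non-standard names are hashed
     * into [0, numBuckets) instead of all landing in one bucket.
     */
    static int logicalBucket(String filename, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        int parsed = parseBucket(filename);
        if (parsed >= 0) {
            return parsed;
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(filename.hashCode(), numBuckets);
    }
}
```

String.hashCode() is deterministic, so a given file would always map to the same logical bucket, which the row numbering of "original" files would presumably rely on.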

> AcidUtils.parseBaseOrDeltaBucketFilename() end clause
> -----------------------------------------------------
>
>                 Key: HIVE-16952
>                 URL: https://issues.apache.org/jira/browse/HIVE-16952
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Minor
>
> The end of this method reads:
> {noformat}
>     } else {
>       result.setOldStyle(true).bucket(-1).minimumTransactionId(0)
>           .maximumTransactionId(0);
>     }
> {noformat}
> Should this throw instead? bucket == -1 can't be handled by anything in OrcRawRecordMerger or anywhere else.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)