Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2019/03/28 20:07:00 UTC
[jira] [Commented] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
[ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804265#comment-16804265 ]
Gopal V commented on TEZ-4057:
------------------------------
LGTM - +1
> Fix Unsorted broadcast shuffle umasks
> -------------------------------------
>
> Key: TEZ-4057
> URL: https://issues.apache.org/jira/browse/TEZ-4057
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.9.2
> Reporter: Gopal V
> Assignee: Eric Wohlstadter
> Priority: Major
> Attachments: TEZ-4057.1.patch
>
>
> {code}
> if (numPartitions == 1 && !pipelinedShuffle) {
> //special case, where in only one partition is available.
> finalOutPath = outputFileHandler.getOutputFileForWrite();
> finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate);
> skipBuffers = true;
> writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass,
> codec, outputRecordsCounter, outputRecordBytesCounter);
> } else {
> skipBuffers = false;
> writer = null;
> }
> {code}
> Broadcast outputs have a single partition, so they take this special-case path, and the permissions (umask) on the output files are never updated.
> {code}
> total 8.0K
> -rw------- 1 hive hadoop 15 Mar 27 20:30 file.out
> -rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index
> {code}
> The result is an index file that the group can read and a .out file that it cannot.
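
The mismatch in the listing above can be reproduced and corrected outside Tez with plain java.nio. What follows is only a minimal sketch, not the contents of TEZ-4057.1.patch: the class name, the method name alignPermissions, and the target mode rw-r----- (taken from the index file in the listing) are assumptions made for illustration, and it needs a POSIX file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Tez: gives the data file and its index
// the same explicit mode instead of relying on the process umask.
class SpillPermissionSketch {

    // Assumed target mode, copied from file.out.index in the listing (640).
    static final Set<PosixFilePermission> TARGET_PERMS =
            PosixFilePermissions.fromString("rw-r-----");

    // Sets both files to TARGET_PERMS. An explicit chmod is not filtered
    // by the umask, so both files end up with the same mode.
    static void alignPermissions(Path dataFile, Path indexFile) throws IOException {
        Files.setPosixFilePermissions(dataFile, TARGET_PERMS);
        Files.setPosixFilePermissions(indexFile, TARGET_PERMS);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tez-4057-sketch");

        // Recreate the situation from the report: file.out is owner-only.
        Path out = Files.createFile(dir.resolve("file.out"),
                PosixFilePermissions.asFileAttribute(
                        PosixFilePermissions.fromString("rw-------")));
        Path index = Files.createFile(dir.resolve("file.out.index"),
                PosixFilePermissions.asFileAttribute(
                        PosixFilePermissions.fromString("rw-r-----")));

        alignPermissions(out, index);

        // Print the resulting modes of both files for inspection.
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(out)));
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(index)));
    }
}
```

Setting the mode explicitly after the file is created, rather than depending on the umask at creation time, is what keeps the two files consistent regardless of which code path wrote them.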
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)