Posted to issues@tez.apache.org by "Eric Wohlstadter (JIRA)" <ji...@apache.org> on 2019/03/28 18:49:00 UTC

[jira] [Assigned] (TEZ-4057) Fix Unsorted broadcast shuffle umasks

     [ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Wohlstadter reassigned TEZ-4057:
-------------------------------------

    Assignee: Eric Wohlstadter

> Fix Unsorted broadcast shuffle umasks
> -------------------------------------
>
>                 Key: TEZ-4057
>                 URL: https://issues.apache.org/jira/browse/TEZ-4057
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.9.2
>            Reporter: Gopal V
>            Assignee: Eric Wohlstadter
>            Priority: Major
>
> {code}
>     if (numPartitions == 1 && !pipelinedShuffle) {
>       //special case, where in only one partition is available.
>       finalOutPath = outputFileHandler.getOutputFileForWrite();
>       finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate);
>       skipBuffers = true;
>       writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass,
>           codec, outputRecordsCounter, outputRecordBytesCounter);
>     } else {
>       skipBuffers = false;
>       writer = null;
>     }
> {code}
> Broadcast outputs have a single partition, so they take the special-case branch above, and that branch does not update the file permissions (umask).
> {code}
> total 8.0K
> -rw------- 1 hive hadoop 15 Mar 27 20:30 file.out
> -rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index
> {code}
> The result is a group-readable index file (file.out.index) but a .out file that only its owner can read.
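
To illustrate the mismatch in the listing, here is a minimal sketch of giving a data file the same mode as the index file (rw-r-----, i.e. 0640). It is not the Tez fix: the class and method names are hypothetical, and it uses plain java.nio on a POSIX filesystem rather than the Hadoop file system handle (rfs) from the snippet. In Hadoop code the analogous step would presumably go through FileSystem.setPermission(Path, FsPermission), but where the actual patch applies it is not stated in this message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Tez: gives a data file the same
// owner-read/write, group-read mode (0640) that the index file has.
class ShufflePermissionSketch {

    // Assumed target mode, taken from file.out.index in the listing.
    static final Set<PosixFilePermission> GROUP_READABLE =
        PosixFilePermissions.fromString("rw-r-----");

    // Sets the mode explicitly, so the result does not depend on the
    // umask of the process that created the file.
    static void makeGroupReadable(Path file) throws IOException {
        Files.setPosixFilePermissions(file, GROUP_READABLE);
    }

    // Returns the mode as an "rwxrwxrwx"-style string.
    static String describe(Path file) throws IOException {
        return PosixFilePermissions.toString(Files.getPosixFilePermissions(file));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for file.out; requires a POSIX filesystem.
        Path out = Files.createTempFile("file", ".out");
        try {
            System.out.println("before: " + describe(out));
            makeGroupReadable(out);
            System.out.println("after:  " + describe(out));
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

Setting the mode explicitly after the file is created, instead of relying on the creating process's umask, is what keeps the .out and .index files consistent in this sketch.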



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)