Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2019/03/27 21:12:00 UTC

[jira] [Created] (TEZ-4057) Fix Unsorted broadcast shuffle umasks

Gopal V created TEZ-4057:
----------------------------

             Summary: Fix Unsorted broadcast shuffle umasks
                 Key: TEZ-4057
                 URL: https://issues.apache.org/jira/browse/TEZ-4057
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.9.2
            Reporter: Gopal V


{code}

    if (numPartitions == 1 && !pipelinedShuffle) {
      //special case, where in only one partition is available.
      finalOutPath = outputFileHandler.getOutputFileForWrite();
      finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate);
      skipBuffers = true;
      writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass,
          codec, outputRecordsCounter, outputRecordBytesCounter);
    } else {
      skipBuffers = false;
      writer = null;
    }
{code}

Broadcast outputs have only 1 partition, so they take the special-case branch above, and the permissions (umask) on the files they write are never updated.

{code}
total 8.0K
-rw------- 1 hive hadoop 15 Mar 27 20:30 file.out
-rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index
{code}

The result is a group-readable index file (file.out.index) next to a .out file that only its owner can read.
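
For illustration only, here is a minimal sketch of putting the same mode on both files once they are written. It is not the Tez patch: it uses plain java.nio rather than the Hadoop FileSystem API, the class SpillFilePerms and its methods are hypothetical names, and the 0640 target is an assumption taken from the index file in the listing above. It also assumes a POSIX file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical helper, not part of Tez: gives an output file and its index the same mode. */
public class SpillFilePerms {

    // Assumed target mode (0640), copied from file.out.index in the listing above.
    static final String SPILL_PERMS = "rw-r-----";

    /** Creates a temp file with owner-only access (0600), like file.out in the listing. */
    static Path createOwnerOnly(String name) {
        try {
            return Files.createTempFile("tez-4057-", "-" + name,
                PosixFilePermissions.asFileAttribute(
                    PosixFilePermissions.fromString("rw-------")));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sets the given POSIX mode on a file and returns the mode read back from disk. */
    static String applyPerms(Path file, String perms) {
        try {
            Set<PosixFilePermission> wanted = PosixFilePermissions.fromString(perms);
            // An explicit chmod is not filtered by the process umask.
            Files.setPosixFilePermissions(file, wanted);
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path out = createOwnerOnly("file.out");
        Path index = createOwnerOnly("file.out.index");
        // Apply the same mode to the data file and the index file.
        System.out.println(out.getFileName() + " " + applyPerms(out, SPILL_PERMS));
        System.out.println(index.getFileName() + " " + applyPerms(index, SPILL_PERMS));
    }
}
```

The point of the sketch is only that both files go through the same permission step; in Tez the equivalent step would have to run on the single-partition path as well as on the regular one.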



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)