Posted to dev@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2019/05/30 06:02:00 UTC

[jira] [Created] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size

Gopal V created TEZ-4073:
----------------------------

             Summary: Configuration: Reduce Vertex and DAG Payload Size
                 Key: TEZ-4073
                 URL: https://issues.apache.org/jira/browse/TEZ-4073
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Gopal V


As the total number of vertices goes up, the Tez protobuf transport starts to show up as a potential scalability problem for task submission and the AM.

{code}
public TezTaskRunner2(Configuration tezConf, UserGroupInformation ugi, String[] localDirs,
 ...
    this.taskConf = new Configuration(tezConf);
    if (taskSpec.getTaskConf() != null) {
      Iterator<Entry<String, String>> iter = taskSpec.getTaskConf().iterator();
      while (iter.hasNext()) {
        Entry<String, String> entry = iter.next();
        taskConf.set(entry.getKey(), entry.getValue());
      }
    }
{code}

The TaskSpec's getTaskConf() need not include any of the default configs, since its keys are set on top of a task conf that is already a copy of tezConf.
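
A rough sketch of the idea, not a patch: ship only the entries that differ from the base conf, and let the receiving side overlay them the way the TezTaskRunner2 loop above does. ConfDelta, minus and overlay are hypothetical names, and plain maps stand in for Hadoop's Configuration so the example has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Tez. Plain maps stand in for Configuration.
final class ConfDelta {
  private ConfDelta() {}

  /** Returns only the entries of full whose value is absent from, or differs in, base. */
  static Map<String, String> minus(Map<String, String> full, Map<String, String> base) {
    Map<String, String> delta = new HashMap<>();
    for (Map.Entry<String, String> e : full.entrySet()) {
      if (!Objects.equals(base.get(e.getKey()), e.getValue())) {
        delta.put(e.getKey(), e.getValue());
      }
    }
    return delta;
  }

  /** Same shape as the task-side loop: copy the base, then set each shipped entry. */
  static Map<String, String> overlay(Map<String, String> base, Map<String, String> delta) {
    Map<String, String> merged = new HashMap<>(base);
    merged.putAll(delta);
    return merged;
  }
}
```

With this shape, overlay(base, minus(full, base)) reproduces full as long as full does not drop a key that base defines; an overlay can only set keys, so unsetting a default would need separate handling.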

{code}
    // Security framework already loaded the tokens into current ugi
    DAGProtos.ConfigurationProto confProto =
        TezUtilsInternal.readUserSpecifiedTezConfiguration(System.getenv(Environment.PWD.name()));
    TezUtilsInternal.addUserSpecifiedTezConfiguration(defaultConf, confProto.getConfKeyValuesList());
    UserGroupInformation.setConfiguration(defaultConf);
    Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
{code}

At the very least, the DAG and Vertex do not both need to have the same configs repeated in them.
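
One possible shape for that, sketched under the assumption that a vertex can fall back to the DAG-level conf at lookup time: keep a key in the vertex payload only when the DAG conf does not already carry the same value. VertexConfDedup, stripDagEntries and resolve are hypothetical names, and plain maps stand in for the protobuf payloads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not Tez code. Plain maps stand in for DAG and vertex payloads.
final class VertexConfDedup {
  private VertexConfDedup() {}

  /** Keeps only the vertex entries that the DAG-level conf does not already carry. */
  static Map<String, Map<String, String>> stripDagEntries(
      Map<String, String> dagConf, Map<String, Map<String, String>> vertexConfs) {
    Map<String, Map<String, String>> slim = new HashMap<>();
    for (Map.Entry<String, Map<String, String>> vertex : vertexConfs.entrySet()) {
      Map<String, String> overrides = new HashMap<>();
      for (Map.Entry<String, String> e : vertex.getValue().entrySet()) {
        boolean sameAsDag = dagConf.containsKey(e.getKey())
            && Objects.equals(dagConf.get(e.getKey()), e.getValue());
        if (!sameAsDag) {
          overrides.put(e.getKey(), e.getValue());
        }
      }
      slim.put(vertex.getKey(), overrides);
    }
    return slim;
  }

  /** Resolves a key for one vertex: vertex override first, then the shared DAG value. */
  static String resolve(Map<String, String> dagConf, Map<String, String> overrides, String key) {
    return overrides.containsKey(key) ? overrides.get(key) : dagConf.get(key);
  }
}
```

The payload then grows with the number of per-vertex overrides rather than with (number of vertices) x (size of the full conf).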



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)