Posted to yarn-issues@hadoop.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2017/05/02 00:53:04 UTC

[jira] [Commented] (YARN-6522) Make SLS JSON input file format simple and scalable

    [ https://issues.apache.org/jira/browse/YARN-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991911#comment-15991911 ] 

Robert Kanter commented on YARN-6522:
-------------------------------------

Looks good overall.  A few comments:
# For {{int priority = 19;}}, let's use whatever constant this should refer to if possible.  If not, let's at least put a comment saying what {{19}} means.
# It looks like {{job.finish.ms}} is useful, at least for some metrics.  I'm fine if we make this optional.
# There are some issues with {{nodes_per_rack}} and {{generateNodes}}:
## I think it would be good to do some input validation (e.g. what if the user specifies a negative number of nodes?).  I'm sure there are other fields where we should do some validation (e.g. invalid characters in a hostname).  Can you file a followup JIRA for this?
## The default value of {{nodesPerRack}} is {{numNodes}}, and the rack names are generated using {{nodeNum % nodesPerRack}}.  So if {{numNodes}} is 100, you're going to end up with {{/rack0/node0}}, {{/rack1/node1}}, {{/rack2/node2}}, etc.  That is, you get 100 racks, each with 1 node, instead of 1 rack with 100 nodes.
## Can you add a unit test for {{generateNodes}} to {{TestSLSUtils}}?
## The names {{num_nodes}} and {{nodes_per_rack}} are inconsistent with the other properties that all have {{.}} as the delimiter.  Should we make these {{num.nodes}} and {{nodes.per.rack}}?
## Instead of {{nodesPerRack}}, what if we did {{numRacks}}?  I think that's easier to reason about.  e.g. if you said 20 nodes and 3 racks, it would divide the 20 nodes (as evenly as possible) among the 3 racks.  As it is now, to do the equivalent, you'd have to set {{nodes_per_rack}} to 6 or 7 (I'm not sure which).
# Can you add documentation?
# Why did you change
{code:java}
String oldAppId = jsonJob.get("job.id").toString();
{code}
to
{code:java}
String oldAppId = (String)jsonJob.get("job.id");
{code}
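
For reference, the two forms behave differently when the value is missing or is not a {{String}}. The sketch below only illustrates that general Java behavior; the {{Map<String, Object>}} type and the class and method names are assumptions made for the example, not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class JobIdLookupSketch {
    // Old form: calls toString() on whatever value is stored under "job.id".
    public static String viaToString(Map<String, Object> jsonJob) {
        return jsonJob.get("job.id").toString();
    }

    // New form: casts the stored value to String.
    public static String viaCast(Map<String, Object> jsonJob) {
        return (String) jsonJob.get("job.id");
    }

    public static void main(String[] args) {
        // Assumed stand-in for a parsed job entry; the real type may differ.
        Map<String, Object> job = new HashMap<>();

        // Key absent: the cast yields null, toString() throws.
        System.out.println("cast, missing key: " + viaCast(job));
        try {
            viaToString(job);
        } catch (NullPointerException e) {
            System.out.println("toString(), missing key: NullPointerException");
        }

        // Non-String value: toString() converts it, the cast throws.
        job.put("job.id", 42);
        System.out.println("toString(), numeric id: " + viaToString(job));
        try {
            viaCast(job);
        } catch (ClassCastException e) {
            System.out.println("cast, numeric id: ClassCastException");
        }
    }
}
```

If the intent is to tolerate a missing {{job.id}} now that it is optional, the cast would explain the change, but that is only a guess; it also makes a non-String id fail where it used to be accepted.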

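To make the rack-assignment points above concrete, here is a rough sketch. {{byModulo}} follows the formula as described in the comment ({{nodeNum % nodesPerRack}}); it is not copied from the patch. {{byNumRacks}} is a hypothetical alternative with basic input validation. All class and method names here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RackAssignmentSketch {
    // Formula as described above: rack index = nodeNum % nodesPerRack.
    public static List<String> byModulo(int numNodes, int nodesPerRack) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < numNodes; i++) {
            nodes.add("/rack" + (i % nodesPerRack) + "/node" + i);
        }
        return nodes;
    }

    // Hypothetical alternative: the caller gives the number of racks and
    // nodes are spread round-robin, so rack sizes differ by at most one.
    public static List<String> byNumRacks(int numNodes, int numRacks) {
        if (numNodes < 0 || numRacks <= 0) {
            throw new IllegalArgumentException(
                "numNodes must be >= 0 and numRacks must be > 0");
        }
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < numNodes; i++) {
            nodes.add("/rack" + (i % numRacks) + "/node" + i);
        }
        return nodes;
    }

    // Counts distinct rack prefixes such as "/rack12".
    public static long distinctRacks(List<String> nodes) {
        return nodes.stream()
            .map(n -> n.substring(0, n.indexOf('/', 1)))
            .distinct()
            .count();
    }

    public static void main(String[] args) {
        // Default case from the comment: nodesPerRack == numNodes == 100.
        System.out.println(distinctRacks(byModulo(100, 100)));   // 100
        // 20 nodes over 3 racks: rack sizes 7, 7 and 6.
        System.out.println(distinctRacks(byNumRacks(20, 3)));    // 3
    }
}
```

Note that in this sketch the modulus is what determines how many racks exist, which is why passing a rack count directly may be the easier contract to reason about.
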
> Make SLS JSON input file format simple and scalable
> ---------------------------------------------------
>
>                 Key: YARN-6522
>                 URL: https://issues.apache.org/jira/browse/YARN-6522
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler-load-simulator
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-6522.001.patch
>
>
> SLS input format is verbose, and it doesn't scale out. We can improve it in these ways:
> # We need to configure tasks one by one if there is more than one task in a job, which means the job configuration usually includes lots of redundant items. Specifying the number of tasks in the task configuration would solve this issue.
> # Container host is useful for locality testing. It is obnoxious to specify container host for each task for tests unrelated to locality. We would like to make it optional.
> # For most tests, we don't care about job.id. Make it optional and generated automatically by default.
> # job.finish.ms doesn't make sense, just remove it.
> # container type and container priority should be optional as well. 


