Posted to dev@tez.apache.org by Solal Pirelli <t-...@microsoft.com.INVALID> on 2017/09/25 23:40:37 UTC

Proposal: Simulator mode

Hi,

I opened a JIRA issue and was redirected to the mailing list, so here I am. :)

https://issues.apache.org/jira/browse/TEZ-3841 is early work on a new feature proposal: a Tez "simulator" in which vertices are not actually executed, but instead use a simplified "fake" processor (which by default does nothing) to let a developer see how Tez will handle certain workloads.

The goal is to be relatively close to an actual Tez run (including support for e.g. blacklisting nodes, to see what happens when simulating an operation with a high failure rate), without requiring an actual Hadoop cluster; the whole thing runs inside a single JVM.

The JIRA issue describes the current implementation, and some possible questions a simulator could help answer.

What do you think about this proposal?
I'd appreciate any pointers regarding the implementation.


Cheers,

Solal Pirelli

RE: Proposal: Simulator mode

Posted by Solal Pirelli <t-...@microsoft.com.INVALID>.
Hi,

Gentle ping. :)

It seems my current implementation is buggy when sending events from the fake processor (e.g. to test how Tez handles load). Is there documentation on Tez heartbeat requests/responses anywhere?
I don't know what the `preRoutedStartIndex` and `startIndex` values in `TezHeartbeatRequest` are for, nor whether I should be doing anything more than one heartbeat with a "progress is 100%" and a "task is finished" event.


Cheers,

Solal Pirelli
