Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2020/04/07 12:12:00 UTC

[jira] [Comment Edited] (TEZ-4105) Tez job-analyzer tool to support proto logging history

    [ https://issues.apache.org/jira/browse/TEZ-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063277#comment-17063277 ] 

László Bodor edited comment on TEZ-4105 at 4/7/20, 12:11 PM:
-------------------------------------------------------------

{code}
wget -qO- "https://issues.apache.org/jira/secure/attachment/12998769/TEZ-4105.09.patch" | git apply -p0 -3

mvn clean install -DskipTests -Ptools

mvn dependency:copy-dependencies -Ptools -pl ./tez-plugins/tez-history-parser -pl ./tez-tools/analyzers/job-analyzer

java -cp "./tez-tools/analyzers/job-analyzer/target/*:./tez-tools/analyzers/job-analyzer/target/dependency/*" org.apache.tez.analyzer.plugins.AnalyzerDriver TaskAssignmentAnalyzer --dagId=dag_1583980529217_0000_18 --fromProtoHistory --eventFileName=/Users/lbodor/Downloads/ --outputDir /tmp --saveResults
{code}

Output under /tmp:
{code}
vertex,node,numTasks,load
Map 2,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,2,100.00
Map 3,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,832,240.29
Map 3,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,173,49.96
Map 3,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,402,116.10
Map 3,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,132,38.12
Map 3,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,147,42.45
Map 3,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,28,8.09
Map 3,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,184,53.14
Map 3,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,810,233.94
Map 3,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,151,43.61
Map 3,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,118,34.08
Map 3,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,427,123.32
Map 3,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,25,7.22
Map 3,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,170,49.10
Map 3,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,579,167.22
Map 3,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,168,48.52
Map 3,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,1788,516.39
Map 3,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,175,50.54
Map 3,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,48,13.86
Map 3,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,492,142.09
Map 3,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,76,21.95
Map 1,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,89,100.00
Reducer 4,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,65,88.98
Reducer 4,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,63,86.24
Reducer 4,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,80,109.51
Reducer 4,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,114,156.06
Reducer 4,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,35,47.91
Reducer 4,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,4,5.48
Reducer 4,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,68,93.09
Reducer 4,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,109,149.21
Reducer 4,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,120,164.27
Reducer 4,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,60,82.14
Reducer 4,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,12,16.43
Reducer 4,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,95,130.05
Reducer 4,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,102,139.63
Reducer 4,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,5,6.84
Reducer 4,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,81,110.88
Reducer 4,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,128,175.22
{code}
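
The load column in the sample above appears to be each node's task count expressed as a percentage of the mean task count per node for that vertex. For Map 3, 6925 tasks over 20 nodes gives a mean of 346.25, and 832 / 346.25 is 240.29%. This is inferred from the numbers, not taken from the TaskAssignmentAnalyzer source, so the sketch below rests on that assumption; `load_per_node` is a hypothetical helper name and the node names are shortened placeholders.

```python
from collections import defaultdict


def load_per_node(rows):
    """Recompute the load column from (vertex, node, num_tasks) tuples.

    Assumption (inferred from the sample output, not from the Tez source):
    load = num_tasks / (mean tasks per node within the vertex) * 100.
    """
    per_vertex = defaultdict(list)
    for vertex, _node, num_tasks in rows:
        per_vertex[vertex].append(num_tasks)

    result = []
    for vertex, node, num_tasks in rows:
        counts = per_vertex[vertex]
        mean = sum(counts) / len(counts)
        result.append((vertex, node, num_tasks, round(num_tasks / mean * 100, 2)))
    return result


# Task counts for "Map 3" in the order they appear in the output above.
MAP3_COUNTS = [832, 173, 402, 132, 147, 28, 184, 810, 151, 118,
               427, 25, 170, 579, 168, 1788, 175, 48, 492, 76]

rows = [("Map 3", f"node-{i}", n) for i, n in enumerate(MAP3_COUNTS)]
rows.append(("Map 2", "node-x", 2))  # a vertex that ran on a single node

for vertex, node, num_tasks, load in load_per_node(rows):
    print(f"{vertex},{node},{num_tasks},{load:.2f}")
```

Under this reading, a value far above 100 (such as 516.39 for query-executor-0-5 in Map 3) marks a node that ran several times its even share of the vertex's tasks.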

[~rajesh.balamohan], [~gopalv], could you please take a look?
The patch does the following:

1. HistoryEventProtoJsonConversion: converts a HistoryEventProto into a JSONObject for the analyzers. I would not recommend reviewing it in detail, as it is long; it was basically copied from HistoryEventJsonConversion and adapted (see the javadoc comment on the class).

2. ProtoHistoryParser extends SimpleHistoryParser: the new parser works almost exactly like the simple history logging parser, because HistoryEventProtoJsonConversion was written to output the same JSON as HistoryEventJsonConversion.

3. Added some missing serialized values to HistoryEventProtoConverter and HistoryEventJsonConversion.

4. Extended the parsers to accept multiple files (in my use case, the history events were split into 2 files) and implemented this for ProtoHistoryParser. The parser also accepts a directory and finds the files inside it by pattern (contains the dagId).

The patch can be tested with the attached protobuf history files, as shown above.
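
The directory handling in item 4 can be illustrated as follows: given a path and a dagId, collect every file whose name contains the dagId. This is a Python sketch of the idea only; the actual matching logic lives in the patch's Java code and may differ, and `find_history_files` is a hypothetical name.

```python
import tempfile
from pathlib import Path


def find_history_files(path, dag_id):
    """Return the history files to parse for dag_id, sorted by name.

    If path is a regular file it is returned as-is; if it is a directory,
    every regular file in it whose name contains dag_id is returned.
    """
    p = Path(path)
    if p.is_file():
        return [p]
    return sorted(f for f in p.iterdir() if f.is_file() and dag_id in f.name)


# Demo with empty files named like the attachments on this issue,
# plus one file from a different DAG that should be skipped.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("dag_1583980529217_0000_18_1",
                 "dag_1583980529217_0000_18_1_1",
                 "dag_1583980529217_0000_17_1"):
        (Path(tmp) / name).touch()
    for f in find_history_files(tmp, "dag_1583980529217_0000_18"):
        print(f.name)
```

A plain substring match like this would also pick up a DAG whose index merely starts with the same digits (for example `..._180_1`), so a stricter pattern may be needed in practice.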


was (Author: abstractdog):
{code}
wget -qO- "https://issues.apache.org/jira/secure/attachment/12997364/TEZ-4105.05.patch" | git apply -p0 -3

mvn clean install -DskipTests -Ptools

mvn dependency:copy-dependencies -Ptools -pl ./tez-plugins/tez-history-parser -pl ./tez-tools/analyzers/job-analyzer

java -cp "./tez-tools/analyzers/job-analyzer/target/*:./tez-tools/analyzers/job-analyzer/target/dependency/*" org.apache.tez.analyzer.plugins.AnalyzerDriver TaskAssignmentAnalyzer --dagId=dag_1583980529217_0000_18 --fromProtoHistory --eventFileName=/Users/lbodor/Downloads/ --outputDir /tmp --saveResults
{code}

> Tez job-analyzer tool to support proto logging history
> ------------------------------------------------------
>
>                 Key: TEZ-4105
>                 URL: https://issues.apache.org/jira/browse/TEZ-4105
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: TEZ-4105.01.patch, TEZ-4105.02.patch, TEZ-4105.03.patch, TEZ-4105.04.patch, TEZ-4105.05.patch, TEZ-4105.06.patch, TEZ-4105.07.patch, TEZ-4105.08.patch, TEZ-4105.09.patch, dag_1583980529217_0000_18_1, dag_1583980529217_0000_18_1_1
>
>
> Currently, the analyzers in tez-tools can only work with the output of ATS (zipped JSON files) and of simple history logging (plain-text JSON files). It would be nice to have a parser that can create the info the analyzers need from a DAG protobuf file. To achieve this, we need at least a converter that turns HistoryEventProto instances into a format that TezAnalyzerBase can read, plus a new parser similar to SimpleHistoryParser.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)