You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Riza Suminto (Code Review)" <ge...@cloudera.org> on 2023/04/24 17:33:48 UTC

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19796


Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................

IMPALA-12090: Split runtime profiles made by single_node_perf_run.py

single_node_perf_run.py produce a single text file containing all
runtime profiles from perf run from one git hash. This is handy, but the
resulting text file can be very long and makes it difficult to analyze
individual profile.

This patch add --split_profiles option into single_node_perf_run.py so
it will extract runtime profiles into individual file instead of single
long text file. Files in profile directory will look like this:

$ ls -1 perf_results/latest/2267d9d104cc3fb0740cba09acb369b4d7ae4f52_profiles/
2023-04-24T09:47:37.804348_TPCDS-Q14-1.txt
2023-04-24T09:47:46.551682_TPCDS-Q14-1.txt
2023-04-24T09:47:53.162474_TPCDS-Q14-1.txt
2023-04-24T09:47:58.734194_TPCDS-Q14-2.txt
2023-04-24T09:48:02.793736_TPCDS-Q14-2.txt
2023-04-24T09:48:06.580700_TPCDS-Q14-2.txt
2023-04-24T09:48:12.968678_TPCDS-Q23-1.txt
2023-04-24T09:48:18.060801_TPCDS-Q23-1.txt
2023-04-24T09:48:22.307676_TPCDS-Q23-1.txt
2023-04-24T09:48:26.602270_TPCDS-Q23-2.txt
2023-04-24T09:48:31.342496_TPCDS-Q23-2.txt
2023-04-24T09:48:35.832333_TPCDS-Q23-2.txt

Testing:
- Manually test run the script with selected queries from tpcds
  workload.

Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
---
M bin/single_node_perf_run.py
1 file changed, 43 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/19796/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Zoltan Borok-Nagy, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19796

to look at the new patch set (#3).

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................

IMPALA-12090: Split runtime profiles made by single_node_perf_run.py

single_node_perf_run.py produce a single text file containing all
runtime profiles from perf run from one git hash. This is handy, but the
resulting text file can be very long and makes it difficult to analyze
individual profile.

This patch add --split_profiles option into single_node_perf_run.py so
it will extract runtime profiles into individual file instead of single
long text file. Files in profile directory will look like this:

$ ls -1 perf_results/latest/2267d9d104cc3fb0740cba09acb369b4d7ae4f52_profiles/
TPCDS-Q14-1_iter001.txt
TPCDS-Q14-1_iter002.txt
TPCDS-Q14-1_iter003.txt
TPCDS-Q14-2_iter001.txt
TPCDS-Q14-2_iter002.txt
TPCDS-Q14-2_iter003.txt
TPCDS-Q23-1_iter001.txt
TPCDS-Q23-1_iter002.txt
TPCDS-Q23-1_iter003.txt
TPCDS-Q23-2_iter001.txt
TPCDS-Q23-2_iter002.txt
TPCDS-Q23-2_iter003.txt

Testing:
- Manually test run the script with selected queries from tpcds
  workload.

Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
---
M bin/single_node_perf_run.py
1 file changed, 49 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/19796/3
-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19796/3/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/3/bin/single_node_perf_run.py@332
PS3, Line 332:   parser.add_option("--split_profiles", action="store_true", dest="split_profiles",
             :                     default=True, help=("If specified, query profiles will be generated
> I'm thinking that we might want the split profiles to be the default at som
I'm fine with splitting profiles as default.
ps4 adds --no_split_profiles as well.



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Apr 2023 15:48:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py@187
PS2, Line 187:             _out=os.path.join(IMPALA_HOME, "performance_result_profile_diff.txt"),
             :             _ok_code=[0, 1])
             :   else:
             :     generate_profile_file(file_a, hash_a, base_dir)
             :     generate_profile_file(fi
> This diff might have some issues once we split up the files.
Good point! Done.


http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py@243
PS2, Line 243: 
> Small nit: Let's pad it with zeros (i.e. 001, 002, ..., 010)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 20:10:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9262/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 28 Apr 2023 04:37:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Zoltan Borok-Nagy, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19796

to look at the new patch set (#4).

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................

IMPALA-12090: Split runtime profiles made by single_node_perf_run.py

single_node_perf_run.py produce a single text file containing all
runtime profiles from perf run from one git hash. This is handy, but the
resulting text file can be very long and makes it difficult to analyze
individual profile.

This patch add --split_profiles and --no_split_profiles option into
single_node_perf_run.py. If --split_profiles is specified, it it will
extract runtime profiles into individual file instead of single long
text file. Specifying --no_split_profiles will retain the old behavior
of putting runtime profiles into a single-combined text file. Default to
split profiles if neither is specified. Files in profile directory will
look like this with --split_profiles:

$ ls -1 perf_results/latest/2267d9d104cc3fb0740cba09acb369b4d7ae4f52_profiles/
TPCDS-Q14-1_iter001.txt
TPCDS-Q14-1_iter002.txt
TPCDS-Q14-1_iter003.txt
TPCDS-Q14-2_iter001.txt
TPCDS-Q14-2_iter002.txt
TPCDS-Q14-2_iter003.txt
TPCDS-Q23-1_iter001.txt
TPCDS-Q23-1_iter002.txt
TPCDS-Q23-1_iter003.txt
TPCDS-Q23-2_iter001.txt
TPCDS-Q23-2_iter002.txt
TPCDS-Q23-2_iter003.txt

Testing:
- Manually test run the script with selected queries from tpcds
  workload with either --split_profiles or --no_split_profiles.

Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
---
M bin/single_node_perf_run.py
1 file changed, 53 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/19796/4
-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................

IMPALA-12090: Split runtime profiles made by single_node_perf_run.py

single_node_perf_run.py produce a single text file containing all
runtime profiles from perf run from one git hash. This is handy, but the
resulting text file can be very long and makes it difficult to analyze
individual profile.

This patch add --split_profiles and --no_split_profiles option into
single_node_perf_run.py. If --split_profiles is specified, it it will
extract runtime profiles into individual file instead of single long
text file. Specifying --no_split_profiles will retain the old behavior
of putting runtime profiles into a single-combined text file. Default to
split profiles if neither is specified. Files in profile directory will
look like this with --split_profiles:

$ ls -1 perf_results/latest/2267d9d104cc3fb0740cba09acb369b4d7ae4f52_profiles/
TPCDS-Q14-1_iter001.txt
TPCDS-Q14-1_iter002.txt
TPCDS-Q14-1_iter003.txt
TPCDS-Q14-2_iter001.txt
TPCDS-Q14-2_iter002.txt
TPCDS-Q14-2_iter003.txt
TPCDS-Q23-1_iter001.txt
TPCDS-Q23-1_iter002.txt
TPCDS-Q23-1_iter003.txt
TPCDS-Q23-2_iter001.txt
TPCDS-Q23-2_iter002.txt
TPCDS-Q23-2_iter003.txt

Testing:
- Manually test run the script with selected queries from tpcds
  workload with either --split_profiles or --no_split_profiles.

Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Reviewed-on: http://gerrit.cloudera.org:8080/19796
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M bin/single_node_perf_run.py
1 file changed, 53 insertions(+), 11 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12852/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 17:54:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 28 Apr 2023 09:51:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Zoltan Borok-Nagy, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19796

to look at the new patch set (#2).

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................

IMPALA-12090: Split runtime profiles made by single_node_perf_run.py

single_node_perf_run.py produce a single text file containing all
runtime profiles from perf run from one git hash. This is handy, but the
resulting text file can be very long and makes it difficult to analyze
individual profile.

This patch add --split_profiles option into single_node_perf_run.py so
it will extract runtime profiles into individual file instead of single
long text file. Files in profile directory will look like this:

$ ls -1 perf_results/latest/2267d9d104cc3fb0740cba09acb369b4d7ae4f52_profiles/
TPCDS-Q14-1_iter1_e84f5c509c6b4a69:73bf2a8800000000.txt
TPCDS-Q14-1_iter2_c7428c658a448007:8bb5d67000000000.txt
TPCDS-Q14-1_iter3_7c496a8931e323f2:86bb3acf00000000.txt
TPCDS-Q14-2_iter1_b14f1f2b12c3cd6c:b218165f00000000.txt
TPCDS-Q14-2_iter2_9d494ebe3ee9b3e6:a8b4cbdc00000000.txt
TPCDS-Q14-2_iter3_0545de0d237f900b:e00341e400000000.txt
TPCDS-Q23-1_iter1_d2471e69d9c66a96:257453b200000000.txt
TPCDS-Q23-1_iter2_6e436b5bc3d02b27:f337342200000000.txt
TPCDS-Q23-1_iter3_af468e3223c663f1:3f67c12a00000000.txt
TPCDS-Q23-2_iter1_8744451bdfe508d4:20cd15e100000000.txt
TPCDS-Q23-2_iter2_994c718311701bfb:42e9d57b00000000.txt
TPCDS-Q23-2_iter3_3f41275e6f327017:c7fc46f800000000.txt

Testing:
- Manually test run the script with selected queries from tpcds
  workload.

Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
---
M bin/single_node_perf_run.py
1 file changed, 56 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/19796/2
-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 28 Apr 2023 04:37:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12864/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Apr 2023 16:08:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12856/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 20:30:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 2:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12855/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 19:30:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 1:

(3 comments)

This is super-useful. Thanks for looking into this

http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@181
PS1, Line 181:   if (options.split_profiles):
Flip this condition? (See other note)


http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@227
PS1, Line 227:       for iteration in data[key]:
             :         query_name = iteration["query"]["name"]
             :         start_time = iteration["start_time"]
             :         file_name = "{}_{}.txt".format(start_time, query_name)
There are several potential file name formats. We definitely need the query name to group them, and then to identify the specific query run we have options. Start time is unique and would work. Another option would use the iteration count and query id:

{query_name}_iter{iter}_{query_id}.txt

I think I like putting the query name first.


http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@326
PS1, Line 326:   parser.add_option("--split_profiles", action="store_false",
             :                     help="If true, query profiles will be generated as separate files")
I'm not sure this works. I think split_profiles = True should turn on the new behavior, so the action should be "store_true" and then the condition up in compare() should be flipped?



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 18:03:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 3: Code-Review+1

(1 comment)

This is looking good to me

http://gerrit.cloudera.org:8080/#/c/19796/3/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/3/bin/single_node_perf_run.py@332
PS3, Line 332:   parser.add_option("--split_profiles", action="store_true",
             :                     help="If true, query profiles will be generated as separate files")
I'm thinking that we might want the split profiles to be the default at some point. When we do, we could introduce a "--no_split_profiles" option that stores false into split_profiles. We can decide whether we should switch the default now or later.

This example over at the optparse documentation can help (https://docs.python.org/2.7/library/optparse.html#default-values )

Something like this might work:
parser.add_option("--split_profiles", dest="split_profiles", action="store_true", default=True, ...)
parser.add_option("--no_split_profiles, dest="split_profiles", action="store_false", ...)



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Apr 2023 00:21:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@181
PS1, Line 181:   report_benchmark_results(fil
> Flip this condition? (See other note)
Done. Thanks for catching this!


http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@227
PS1, Line 227:     data = json.loads(fid.read().decode("utf-8", "ignore"))
             :     iter_num = {}
             :     # For each query
             :     for key in data:
> There are several potential file name formats. We definitely need the query
Done


http://gerrit.cloudera.org:8080/#/c/19796/1/bin/single_node_perf_run.py@326
PS1, Line 326:   # Less commonly-used options:
             :   parser.add_option("--query_names",
> I'm not sure this works. I think split_profiles = True should turn on the n
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 19:10:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py
File bin/single_node_perf_run.py:

http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py@187
PS2, Line 187:     sh.diff("-u",
             :             os.path.join(base_dir, hash_a + "_profiles"),
             :             os.path.join(base_dir, hash_b + "_profiles"),
             :             _out=os.path.join(IMPALA_HOME, "performance_result_profile_diff.txt"),
             :             _ok_code=[0, 1])
This diff might have some issues once we split up the files.

If I have two directories:
a/
 /one_filename.txt
b/
 /other_filename.txt

Then diff can't line up the file names and I get messages like this:
$ diff -u a b
Only in a: one_filename.txt
Only in b: other_filename.txt

To make this work, maybe we need to drop the query id from the file name. We'll have to check the output.


http://gerrit.cloudera.org:8080/#/c/19796/2/bin/single_node_perf_run.py@243
PS2, Line 243: {}
Small nit: Let's pad it with zeros (i.e. 001, 002, ..., 010)

To do that, we can use: '{:03d}'



-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Apr 2023 19:54:11 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12090: Split runtime profiles made by single node perf run.py

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19796 )

Change subject: IMPALA-12090: Split runtime profiles made by single_node_perf_run.py
......................................................................


Patch Set 4: Code-Review+2

Ran with this locally, and it worked fine and the profile diff looked good to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/19796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc2d3cefd7ad61b76cbef74c734543ef9ca51795
Gerrit-Change-Number: 19796
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Thu, 27 Apr 2023 22:21:47 +0000
Gerrit-HasComments: No