Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/03/27 14:54:30 UTC
[GitHub] [beam] echauchot edited a comment on issue #11055: [BEAM-9436]
Improve GBK in spark structured streaming runner
URL: https://github.com/apache/beam/pull/11055#issuecomment-605020089
> Yes, I agree that materialisation and out of memory should be addressed in different Jira/PR
> Could you post the Nexmark results before and after your fix to compare? Thanks
Sure, here are the Nexmark results. They are not very informative because the results are nearly the same before and after the change. Note that a 0.2s difference is not a real difference: I usually get a 0.2s difference between two consecutive runs on the same code base.
Because these Nexmark results are not very informative, I ran the load tests above. Indeed, Nexmark does a lot more than just a GBK, so I used GroupByKeyLoadTest as a pure GBK test.
after the change:
```
Conf Runtime(sec) (Baseline) Events(/sec) (Baseline) Results (Baseline)
0000 1,6 61349,7 100000
0001 0,9 107758,6 92000
0002 0,5 201612,9 351
0003 *** not run ***
0004 1,8 11415,5 40
0005 1,6 60864,3 12
0006 0,8 25641,0 103
0007 2,2 91116,2 1
0008 0,9 219298,2 6000
0009 0,6 32894,7 298
0010 1,1 88028,2 1
0011 0,9 110375,3 1919
0012 0,6 160771,7 1919
0013 0,6 180505,4 92000
0014 1,0 99304,9 92000
============================================================================
```
before the change:
```
Conf Runtime(sec) (Baseline) Events(/sec) (Baseline) Results (Baseline)
0000 1,8 56243,0 100000
0001 0,8 120481,9 92000
0002 0,4 223713,6 351
0003 *** not run ***
0004 1,6 12232,4 40
0005 1,4 71275,8 12
0006 0,9 23148,1 103
0007 2,1 96711,8 1
0008 1,1 183486,2 6000
0009 0,6 34843,2 298
0010 1,2 85543,2 1
0011 0,9 114547,5 1919
0012 0,6 156006,2 1919
0013 0,5 203666,0 92000
0014 1,0 102145,0 92000
===========================================================================
```
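To make the comparison of the two tables easier to check, here is a minimal sketch that computes the per-query runtime difference and flags any query whose difference exceeds the 0.2s run-to-run variation mentioned above. The runtimes are copied from the two tables; the names `to_decimal`, `runtime_deltas` and `NOISE` are hypothetical helpers introduced for this illustration and are not part of Nexmark or Beam.

```python
from decimal import Decimal

# Runtime(sec) per Nexmark query, copied from the two tables above.
# Query 0003 is omitted because it was not run.
AFTER = {
    "0000": "1,6", "0001": "0,9", "0002": "0,5", "0004": "1,8",
    "0005": "1,6", "0006": "0,8", "0007": "2,2", "0008": "0,9",
    "0009": "0,6", "0010": "1,1", "0011": "0,9", "0012": "0,6",
    "0013": "0,6", "0014": "1,0",
}
BEFORE = {
    "0000": "1,8", "0001": "0,8", "0002": "0,4", "0004": "1,6",
    "0005": "1,4", "0006": "0,9", "0007": "2,1", "0008": "1,1",
    "0009": "0,6", "0010": "1,2", "0011": "0,9", "0012": "0,6",
    "0013": "0,5", "0014": "1,0",
}

# Run-to-run variation reported in the comment (an assumption taken
# from the prose, not a measured statistic).
NOISE = Decimal("0.2")


def to_decimal(text):
    """Parse a number that uses a comma as the decimal separator."""
    return Decimal(text.replace(",", "."))


def runtime_deltas(before, after):
    """Return {query: after - before} in seconds for queries present in both runs."""
    return {
        query: to_decimal(after[query]) - to_decimal(before[query])
        for query in sorted(before)
        if query in after
    }


deltas = runtime_deltas(BEFORE, AFTER)
# Queries whose runtime changed by more than the assumed noise level.
outside_noise = [q for q, d in deltas.items() if abs(d) > NOISE]
total_delta = sum(deltas.values())

if __name__ == "__main__":
    for query, delta in deltas.items():
        print(f"{query}: {delta:+}s")
    print("queries outside noise:", outside_noise)
    print("sum of differences:", total_delta)
```

`Decimal` is used instead of `float` so that a difference such as 1,6 - 1,4 compares exactly equal to 0.2 rather than landing slightly above it. With the numbers above, no query differs by more than 0.2s between the two runs.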