Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2020/07/14 09:39:35 UTC
[GitHub] [systemds] Baunsgaard commented on pull request #931: [SYSTEMDS-371-372][WIP] ColGroup Quantization
Baunsgaard commented on pull request #931:
URL: https://github.com/apache/systemds/pull/931#issuecomment-658081701
@mboehm7
As requested, here is a comparison of performance before and after the changes. With this, I will finish committing to this branch to enable reviews.
I have disabled two key features that will hopefully improve performance once re-implemented; I intend to slightly change the way they are done.
- Dictionary sharing. I intend to enable sharing across different col-group types, since we now have a shared representation for dictionaries. I also intend to move this step to before the construction of ColGroups. This will allow storing pointers to all the dictionaries in the CompressedMatrixBlock object to speed up value-only computations, and the ColGroups will then be oblivious to whether their dictionaries are shared.
- CoCoding. This is currently disabled because (1) it increases compression time, and (2) it does not improve the compression ratio on the covtype dataset.
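To make the dictionary-sharing idea concrete, here is a minimal sketch of how it could look. It is not the SystemDS implementation: the classes `Dictionary`, `ColGroup`, and `CompressedBlock` and their methods are hypothetical stand-ins, and matching dictionaries by exact content is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedDictionarySketch {

    /** Hypothetical dictionary: the distinct values referenced by a column group. */
    static final class Dictionary {
        final double[] values;
        Dictionary(double[] values) { this.values = values; }
    }

    /** Hypothetical column group: it only holds a reference to a dictionary. */
    static final class ColGroup {
        final int[] colIndexes;
        final Dictionary dict;
        ColGroup(int[] colIndexes, Dictionary dict) {
            this.colIndexes = colIndexes;
            this.dict = dict;
        }
    }

    /** Hypothetical stand-in for the compressed block that owns all dictionaries. */
    static final class CompressedBlock {
        final List<Dictionary> dictionaries = new ArrayList<>();
        final List<ColGroup> groups = new ArrayList<>();
        // Used only while building: maps dictionary content to the shared instance.
        private final Map<List<Double>, Dictionary> pool = new HashMap<>();

        /** Sharing is decided here, before the ColGroup is constructed. */
        void addGroup(int[] colIndexes, double[] distinctValues) {
            List<Double> key = new ArrayList<>();
            for (double v : distinctValues)
                key.add(v);
            Dictionary d = pool.get(key);
            if (d == null) {
                d = new Dictionary(distinctValues.clone());
                pool.put(key, d);
                dictionaries.add(d);
            }
            groups.add(new ColGroup(colIndexes, d));
        }

        /** Value-only operation: touches each unique dictionary exactly once. */
        void scalarMultiply(double s) {
            for (Dictionary d : dictionaries)
                for (int i = 0; i < d.values.length; i++)
                    d.values[i] *= s;
        }
    }

    public static void main(String[] args) {
        CompressedBlock block = new CompressedBlock();
        block.addGroup(new int[] {0}, new double[] {1.0, 2.0});
        block.addGroup(new int[] {1}, new double[] {1.0, 2.0}); // same content, shared
        block.addGroup(new int[] {2}, new double[] {5.0});
        block.scalarMultiply(3.0);
        System.out.println("groups: " + block.groups.size()
            + ", unique dictionaries: " + block.dictionaries.size());
    }
}
```

In this sketch the column groups never learn whether their dictionary is shared, and a value-only operation iterates over the block-level list rather than over the groups. Note that after an in-place update the construction-time pool keys no longer match the dictionary contents, so a real implementation would need to discard or rebuild that lookup.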
Before (on master branch)
```code
DATA , RUN , TYPE , TIME ms , REP
covtype , MatrixVector mv , cla , 1.980 , 100
covtype , MatrixVector vm , cla , 3.310 , 100
covtype , scalar mult , cla , 3.900 , 100
covtype , scalar plus , cla , 13.180 , 100
covtype , unaryAggregate sum , cla , 1.992 , 500
covtype , unaryAggregate rowsum , cla , 23.740 , 500
covtype , unaryAggregate colsum , cla , 24.556 , 500
covtype , unaryAggregate colmax , cla , 0.122 , 500
covtype , unaryAggregate max , cla , nan , 0
covtype , unaryAggregate min , cla , 0.100 , 500
covtype , unaryAggregate rowmax , cla , 44.208 , 500
```
After:
```code
DATA , RUN , TYPE , TIME ms , REP
covtype , MatrixVector mv , cla , 1.916 , 1000
covtype , MatrixVector mv , lcla , 1.752 , 1000
covtype , MatrixVector vm , cla , 4.138 , 1000
covtype , MatrixVector vm , lcla , 3.764 , 1000
covtype , scalar mult , cla , 0.157 , 1000
covtype , scalar mult , lcla , 0.129 , 1000
covtype , scalar plus , cla , 0.249 , 1000
covtype , scalar plus , lcla , 0.212 , 1000
covtype , unaryAggregate sum , cla , 0.828 , 500
covtype , unaryAggregate sum , lcla , 2.790 , 500
covtype , unaryAggregate rowsum , cla , 12.075 , 3000
covtype , unaryAggregate rowsum , lcla , 33.120 , 3000
covtype , unaryAggregate colsum , cla , 0.834 , 500
covtype , unaryAggregate colsum , lcla , 2.886 , 500
covtype , unaryAggregate colmax , cla , 0.259 , 3000
covtype , unaryAggregate colmax , lcla , 0.039 , 3000
covtype , unaryAggregate max , cla , 0.142 , 500
covtype , unaryAggregate max , lcla , 0.064 , 500
covtype , unaryAggregate min , cla , 0.170 , 500
covtype , unaryAggregate min , lcla , 0.118 , 500
covtype , unaryAggregate rowmax , cla , 31.253 , 3000
covtype , unaryAggregate rowmax , lcla , 69.297 , 3000
```
Uncompressed Performance:
```code
DATA , RUN , TYPE , TIME ms , REP
covtype , MatrixVector mv , ula , 6.230 , 1000
covtype , MatrixVector vm , ula , 8.895 , 1000
covtype , scalar mult , ula , 34.050 , 300
covtype , scalar plus , ula , 63.683 , 300
covtype , unaryAggregate sum , ula , 7.146 , 500
covtype , unaryAggregate rowsum , ula , 10.895 , 3000
covtype , unaryAggregate colsum , ula , 8.268 , 500
covtype , unaryAggregate colmax , ula , 7.886 , 3000
covtype , unaryAggregate max , ula , 7.116 , 500
covtype , unaryAggregate min , ula , 7.508 , 500
covtype , unaryAggregate rowmax , ula , 8.403 , 3000
```
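For reading the tables side by side, the following small sketch computes the before/after ratio for a few of the `cla` rows. The class and method names are made up for illustration, the timings are copied from the tables above, and the repetition counts differ between the runs, so the ratios are only indicative.

```java
public class SpeedupSketch {

    /** Ratio of the old time to the new time (both in ms); above 1 means faster now. */
    static double speedup(double beforeMs, double afterMs) {
        return beforeMs / afterMs;
    }

    public static void main(String[] args) {
        // cla timings in ms, taken from the "Before" and "After" tables.
        String[] ops = {"MatrixVector vm", "scalar mult", "scalar plus", "colsum", "rowmax"};
        double[] before = {3.310, 3.900, 13.180, 24.556, 44.208};
        double[] after = {4.138, 0.157, 0.249, 0.834, 31.253};

        for (int i = 0; i < ops.length; i++)
            System.out.printf("%-16s %6.2fx%n", ops[i], speedup(before[i], after[i]));
    }
}
```

A ratio below 1, as for the `MatrixVector vm` row, indicates that the operation was slower in the new measurement.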