Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2021/03/05 15:18:05 UTC

[GitHub] [systemds] Baunsgaard opened a new pull request #1196: [SYSTEMDS-2884] CLA Shared Mapping abstract

Baunsgaard opened a new pull request #1196:
URL: https://github.com/apache/systemds/pull/1196


   This commit adds a mapping abstraction and several implementations of it
   to allow DDC to use an arbitrary number of bits per entry.
   The widths currently supported are 1, 8, 16, and 32 bits, and a student
   task has been added to implement others.
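
   To illustrate the idea, here is a minimal sketch of a shared mapping with
   one implementation per supported width. All names here (MappingSketch,
   Mapping, BitMapping, ByteMapping, CharMapping, IntMapping, create) are
   hypothetical and chosen for this example; they are not taken from the
   SystemDS code base, and the actual classes in this PR may be structured
   differently.

```java
import java.util.BitSet;

// Hypothetical sketch: names are illustrative, not the SystemDS API.
final class MappingSketch {

    /** Maps each row to the index of its dictionary entry. */
    interface Mapping {
        int get(int row);
        void set(int row, int dictIndex);
        /** Approximate payload size in bits, ignoring object overhead. */
        long payloadBits();
    }

    /** 1 bit per entry: enough for at most 2 dictionary entries. */
    static final class BitMapping implements Mapping {
        private final BitSet data;
        private final int size;
        BitMapping(int size) { this.size = size; this.data = new BitSet(size); }
        public int get(int row) { return data.get(row) ? 1 : 0; }
        public void set(int row, int v) { data.set(row, v == 1); }
        public long payloadBits() { return size; }
    }

    /** 8 bits per entry: up to 256 dictionary entries. */
    static final class ByteMapping implements Mapping {
        private final byte[] data;
        ByteMapping(int size) { data = new byte[size]; }
        public int get(int row) { return data[row] & 0xFF; } // read as unsigned
        public void set(int row, int v) { data[row] = (byte) v; }
        public long payloadBits() { return 8L * data.length; }
    }

    /** 16 bits per entry: up to 65536 dictionary entries. */
    static final class CharMapping implements Mapping {
        private final char[] data;
        CharMapping(int size) { data = new char[size]; }
        public int get(int row) { return data[row]; } // char is unsigned 16 bit
        public void set(int row, int v) { data[row] = (char) v; }
        public long payloadBits() { return 16L * data.length; }
    }

    /** 32 bits per entry: fallback for larger dictionaries. */
    static final class IntMapping implements Mapping {
        private final int[] data;
        IntMapping(int size) { data = new int[size]; }
        public int get(int row) { return data[row]; }
        public void set(int row, int v) { data[row] = v; }
        public long payloadBits() { return 32L * data.length; }
    }

    /** Picks the narrowest mapping that can address numUnique entries. */
    static Mapping create(int size, int numUnique) {
        if (numUnique <= 2) return new BitMapping(size);
        if (numUnique <= 256) return new ByteMapping(size);
        if (numUnique <= 65536) return new CharMapping(size);
        return new IntMapping(size);
    }

    public static void main(String[] args) {
        Mapping m = create(1000, 200); // 200 unique values fit in 8 bits
        m.set(3, 199);
        System.out.println(m.getClass().getSimpleName() + " " + m.get(3));
    }
}
```

   Because callers only see the interface, a column group can switch to a
   narrower mapping without changing its lookup code, and additional widths
   can be added as further implementations behind the same factory.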
   
   Furthermore, this abstraction is also used in SDC, with similar
   compression benefits.
   
   This commit also fixes various bugs and improves the compression ratio
   on census from 15x to 23x. This makes PCA execution 1.4x faster,
   including IO and compression, for both census and classic MNIST.
   
   This commit also contains the beginning of an insertion sorter for
   efficient construction of SDC column groups. Currently only a naive
   implementation is added; it works well for few unique values but does
   not scale well to larger numbers of unique values.
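
   As a rough illustration of why a naive construction can be fine for few
   unique values yet scale poorly, the sketch below merges per-value row
   offsets into one sorted sequence by scanning every value's list for the
   smallest remaining row. This is a plain linear-scan k-way merge chosen
   for the example, not the sorter added in this PR; the class and method
   names (NaiveSdcSorterSketch, Sorted, merge) and the input layout are
   assumptions.

```java
// Hypothetical sketch: not the SystemDS implementation.
final class NaiveSdcSorterSketch {

    /** Sorted row indexes plus the dictionary index stored at each row. */
    static final class Sorted {
        final int[] rows;
        final int[] dictIndexes;
        Sorted(int[] rows, int[] dictIndexes) {
            this.rows = rows;
            this.dictIndexes = dictIndexes;
        }
    }

    /**
     * offsets[v] holds the ascending row indexes at which value v occurs.
     * Each output element costs one scan over all values, so the total
     * work is O(totalRows * numUnique).
     */
    static Sorted merge(int[][] offsets) {
        int total = 0;
        for (int[] o : offsets) total += o.length;

        int[] rows = new int[total];
        int[] dict = new int[total];
        int[] heads = new int[offsets.length]; // next unread position per value

        for (int out = 0; out < total; out++) {
            int best = -1;
            // Linear scan for the value whose next row index is smallest.
            for (int v = 0; v < offsets.length; v++) {
                if (heads[v] < offsets[v].length
                    && (best < 0 || offsets[v][heads[v]] < offsets[best][heads[best]])) {
                    best = v;
                }
            }
            rows[out] = offsets[best][heads[best]++];
            dict[out] = best;
        }
        return new Sorted(rows, dict);
    }

    public static void main(String[] args) {
        Sorted s = merge(new int[][] {{1, 5, 9}, {0, 7}, {3}});
        System.out.println(java.util.Arrays.toString(s.rows));
        System.out.println(java.util.Arrays.toString(s.dictIndexes));
    }
}
```

   With a handful of unique values the inner scan is cheap, but its cost
   grows linearly with the number of unique values, which is the kind of
   behavior a heap-based or insertion-based sorter would aim to avoid.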
   
   Normal:
   ![image](https://user-images.githubusercontent.com/9947148/110134970-489e4900-7dce-11eb-887b-3cee4d76c419.png)
   
   CLA:
   ![image](https://user-images.githubusercontent.com/9947148/110135016-53f17480-7dce-11eb-8321-da6ef8606530.png)
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] Baunsgaard merged pull request #1196: [SYSTEMDS-2884] CLA Shared Mapping abstract

Posted by GitBox <gi...@apache.org>.
Baunsgaard merged pull request #1196:
URL: https://github.com/apache/systemds/pull/1196


   

