Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2022/09/17 21:31:39 UTC

[GitHub] [systemds] Baunsgaard commented on pull request #1697: [SYSTEMDS-2699] CLA IO Compressed Matrix

Baunsgaard commented on PR #1697:
URL: https://github.com/apache/systemds/pull/1697#issuecomment-1250143469

   Since the compression format tends to change, files written by one version will not always be readable by other versions. One way to detect format changes or incompatible versions is to write an identifier at the beginning of each file, for example:
   
   - GitHash 
   - SystemDS version Number 
   
   Since the Git hash is not available at all times, we could use the SystemDS version number as a fallback. I do not personally like either solution; maybe someone else has some suggestions?
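
To make the identifier idea concrete, here is a minimal sketch of a file header, not the actual SystemDS format. The class name, the magic number, and the helper methods are all hypothetical; the identifier is stored as a plain string so that either a Git hash or a version number fits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of SystemDS.
class CompressedFileHeader {
    // Assumed magic number marking a compressed matrix file (ASCII "CLA1").
    static final int MAGIC = 0x434C4131;

    /** Prefers the Git hash and falls back to the version number if no hash is available. */
    static String identifier(String gitHash, String version) {
        return (gitHash != null && !gitHash.isEmpty()) ? gitHash : version;
    }

    /** Writes the magic number followed by the identifier at the start of the stream. */
    static void writeHeader(DataOutputStream out, String identifier) throws IOException {
        out.writeInt(MAGIC);
        out.writeUTF(identifier);
    }

    /** Reads the header and fails fast if the magic number or identifier does not match. */
    static void checkHeader(DataInputStream in, String expected) throws IOException {
        if (in.readInt() != MAGIC)
            throw new IOException("Not a compressed matrix file");
        String found = in.readUTF();
        if (!found.equals(expected))
            throw new IOException("Incompatible format: written by " + found
                + ", reader expects " + expected);
    }

    /** Convenience wrapper: serializes only the header to a byte array. */
    static byte[] headerBytes(String identifier) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            writeHeader(out, identifier);
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience wrapper: true if the bytes start with a matching header. */
    static boolean isCompatible(byte[] data, String expected) {
        try {
            checkHeader(new DataInputStream(new ByteArrayInputStream(data)), expected);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

An exact string match is the strictest possible check: it rejects files from any other build, even when the format did not actually change. A separate format version counter that is bumped only on incompatible changes would be a possible alternative, but that is a different design from the two options listed above.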
   
   Other design decisions:
   
   1. For the distributed case, I intend to simply write each compressed block to a separate file, as we already do.
   2. Parallel reading and writing could be done with many files; for instance, I could split each column group into a separate file instead of into multiple blocks. Perhaps someone has some experience or ideas?
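
As a rough illustration of the one-file-per-column-group idea, the sketch below writes and reads already-serialized groups in parallel. It assumes each column group has been serialized to a `byte[]` beforehand; the class name, the file naming scheme, and the use of parallel streams are assumptions for the example, not the existing SystemDS IO code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of SystemDS.
class ColumnGroupFiles {
    /** Assumed naming scheme: one file per column group index. */
    static Path groupFile(Path dir, int index) {
        return dir.resolve("colgroup-" + index + ".bin");
    }

    /** Writes each serialized column group to its own file, in parallel. */
    static void writeGroups(Path dir, List<byte[]> groups) {
        try {
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        IntStream.range(0, groups.size()).parallel().forEach(i -> {
            try {
                Files.write(groupFile(dir, i), groups.get(i));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Reads the column group files back in parallel, preserving group order. */
    static List<byte[]> readGroups(Path dir, int numGroups) {
        return IntStream.range(0, numGroups).parallel().mapToObj(i -> {
            try {
                return Files.readAllBytes(groupFile(dir, i));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }).collect(Collectors.toList());
    }
}
```

The reader needs to know the number of groups up front, so a layout like this would probably also need a small metadata file (or the directory listing) to record it. With many small column groups the per-file overhead could outweigh the gain from parallelism.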
   
   Help / Comments appreciated
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@systemds.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org