Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2022/09/17 21:31:39 UTC
[GitHub] [systemds] Baunsgaard commented on pull request #1697: [SYSTEMDS-2699] CLA IO Compressed Matrix
Baunsgaard commented on PR #1697:
URL: https://github.com/apache/systemds/pull/1697#issuecomment-1250143469
Since the compression format tends to change, files written by one version will not always be readable by another. One way to detect such changes or incompatible versions is to write an identifier at the beginning of each file, for example:
- GitHash
- SystemDS version Number
Since the GitHash is not always available, we could use the SystemDS version number as a fallback. I do not personally like either solution; maybe someone else has some suggestions?
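To make the identifier idea concrete, here is a rough sketch, written in Python for brevity rather than in SystemDS's Java. The magic bytes, the header layout, and all function names are assumptions for illustration only and are not the format used in this PR.

```python
import io
import struct

MAGIC = b"CLA0"  # hypothetical magic bytes, not part of SystemDS


def write_header(out, version, git_hash=None):
    """Write magic bytes, then the length-prefixed version and optional git hash."""
    out.write(MAGIC)
    for field in (version, git_hash or ""):
        data = field.encode("utf-8")
        out.write(struct.pack(">H", len(data)))  # 2-byte big-endian length
        out.write(data)


def read_header(inp):
    """Return (version, git_hash or None); raise ValueError on a foreign file."""
    if inp.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a compressed matrix file")
    fields = []
    for _ in range(2):
        (n,) = struct.unpack(">H", inp.read(2))
        fields.append(inp.read(n).decode("utf-8"))
    version, git_hash = fields
    return version, git_hash or None


def is_compatible(file_id, running_id):
    """Compare git hashes when both sides have one, else fall back to the version."""
    file_version, file_hash = file_id
    run_version, run_hash = running_id
    if file_hash and run_hash:
        return file_hash == run_hash
    return file_version == run_version
```

A reader would call `read_header` before touching the payload and refuse, or fall back to a conversion path, when `is_compatible` returns false.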
Other design decisions:
1. For the distributed case, I intend to simply write each compressed block to a separate file, as we already do.
2. Parallel reading and writing could be done with many files. For instance, I could split each column group into a separate file instead of multiple blocks. Perhaps someone has some experience or ideas?
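As a rough illustration of the second point, the sketch below writes each serialized column group to its own file using a thread pool and reads them back the same way. It is in Python for brevity, not SystemDS's Java; the file naming scheme and the function names are assumptions, and the column groups are treated as opaque byte strings.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def write_groups(directory, groups):
    """Write each serialized column group to its own file, in parallel."""
    os.makedirs(directory, exist_ok=True)

    def write_one(item):
        index, payload = item
        # Zero-padded index so that sorting by name preserves group order
        # (holds for up to 100000 groups with this hypothetical naming).
        path = os.path.join(directory, f"group-{index:05d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
        return path

    with ThreadPoolExecutor() as pool:
        return list(pool.map(write_one, enumerate(groups)))


def read_groups(directory):
    """Read all group files back in parallel, ordered by group index."""
    names = sorted(n for n in os.listdir(directory) if n.startswith("group-"))

    def read_one(name):
        with open(os.path.join(directory, name), "rb") as f:
            return f.read()

    with ThreadPoolExecutor() as pool:
        return list(pool.map(read_one, names))
```

One trade-off of this layout is that a matrix with many small column groups produces many small files, which may be a poor fit for some file systems.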
Help / Comments appreciated
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@systemds.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org