Posted to dev@tinkerpop.apache.org by "Matt Frantz (JIRA)" <ji...@apache.org> on 2015/04/03 17:55:52 UTC
[jira] [Commented] (TINKERPOP3-609) Reduce the memory footprint of Gryo
[ https://issues.apache.org/jira/browse/TINKERPOP3-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394608#comment-14394608 ]
Matt Frantz commented on TINKERPOP3-609:
----------------------------------------
Does the 1-byte element label enum size imply a limit on the number of unique labels (256 with a single byte)? Or would you use variable-width encoding for that one?
If IDs are sequential, then variable-width encoding would provide even better compression. If they are hashes or UUIDs, then fixed size would be better.
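To make the trade-off concrete, here is a minimal sketch of a variable-width integer encoding (7 data bits per byte, LEB128 style) compared against a fixed 8-byte layout. This is an illustration only, not Gryo's actual wire format; the function name `encode_varint` and the sample hash-like id are assumptions made for the example.

```python
import struct


def encode_varint(value):
    """Encode a non-negative integer with 7 data bits per byte.

    Hypothetical helper for illustration; not part of Gryo or Kryo.
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative ids")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


FIXED_WIDTH = struct.calcsize(">q")  # 8 bytes for a 64-bit integer

# Sequential ids 0..999,999: small values need few bytes each.
sequential_ids = range(1_000_000)
varint_total = sum(len(encode_varint(i)) for i in sequential_ids)
fixed_total = len(sequential_ids) * FIXED_WIDTH

# A hash-like id that uses all 64 bits (arbitrary sample value).
hash_like_id = 0xDEADBEEFCAFEBABE
hash_varint_len = len(encode_varint(hash_like_id))

print("sequential ids, varint bytes:", varint_total)
print("sequential ids, fixed bytes: ", fixed_total)
print("hash-like id, varint bytes:  ", hash_varint_len, "vs fixed", FIXED_WIDTH)
```

With this scheme a value that occupies all 64 bits takes 10 bytes, which is larger than the 8-byte fixed form. That is why fixed width tends to win for hashes and UUIDs, while sequential ids compress well.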
What about adding a header that designates the integer encoding? Presumably, the header is where you would define the element label enums.
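One way such a header could look is sketched below. The layout is hypothetical: one byte naming the integer encoding, a two-byte label count, then each label as a length-prefixed UTF-8 string. None of these names (`FIXED64`, `VARINT`, `write_header`, `read_header`) come from Gryo. Records would then refer to a label by its index in the table, and a 1-byte index caps the table at 256 entries.

```python
import struct

# Hypothetical flags naming the integer encoding used for ids in the file body.
FIXED64, VARINT = 0, 1

MAX_LABELS = 256  # the most a 1-byte label index can address


def write_header(id_encoding, labels):
    """Serialize the encoding flag and the label table (illustrative layout)."""
    if len(labels) > MAX_LABELS:
        raise ValueError("a 1-byte label index can address at most 256 labels")
    out = bytearray([id_encoding])
    out += struct.pack(">H", len(labels))  # label count needs 2 bytes to hold 256
    for label in labels:
        raw = label.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw
    return bytes(out)


def read_header(data):
    """Return (id_encoding, labels, offset of the first byte after the header)."""
    id_encoding = data[0]
    (count,) = struct.unpack_from(">H", data, 1)
    offset = 3
    labels = []
    for _ in range(count):
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        labels.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    return id_encoding, labels, offset


header = write_header(VARINT, ["person", "knows"])
encoding, labels, body_start = read_header(header)
print(encoding, labels, body_start)
```

A reader would parse the header once, then decode every id according to the flag and map each 1-byte label index back through `labels`. Lifting the 256-label cap would mean widening the index or making it variable-width as well.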
> Reduce the memory footprint of Gryo
> -----------------------------------
>
> Key: TINKERPOP3-609
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-609
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: io
> Affects Versions: 3.0.0.GA
> Reporter: Marko A. Rodriguez
> Assignee: stephen mallette
> Priority: Critical
>
> A 1 million vertex graph with 1 edge each is a 150meg file. That is 150 bytes per vertex/edge.
> If the vertex id is a long that is 4 bytes.
> If the edge id is a long that is 4 bytes.
> The edge should only have ONE id for the otherV of 4 bytes.
> The edge label should be somehow "enum'd" and 1 byte.
> The vertex label should be somehow "enum'd" and 1 byte.
> Add 2-3 bytes for terminators.
> Thus, we should be able to get away with a 17 byte representation (assuming no variable width encodings) and thus, a 17 meg file. That is a near 10x file size reduction.
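The arithmetic behind the 17-byte estimate can be checked with a short sketch. The names below are made up for the illustration. The 4-byte id width is the figure used in the issue; a Java long is 8 bytes, so the second call shows what the same layout would cost if ids were written as full fixed-width longs.

```python
LABEL_BYTES = 1        # one "enum'd" byte per label, as proposed in the issue
TERMINATOR_BYTES = 3   # upper end of the "2-3 bytes for terminators"
RECORDS = 1_000_000    # 1 million vertices with 1 edge each
CURRENT_FILE_BYTES = 150 * 1_000_000  # the 150 MB file reported in the issue


def record_bytes(id_bytes):
    """Bytes per vertex/edge record: vertex id + edge id + otherV id,
    plus an edge label, a vertex label, and terminators."""
    return 3 * id_bytes + 2 * LABEL_BYTES + TERMINATOR_BYTES


for id_bytes in (4, 8):
    per_record = record_bytes(id_bytes)
    file_bytes = per_record * RECORDS
    reduction = CURRENT_FILE_BYTES / file_bytes
    print(f"{id_bytes}-byte ids: {per_record} bytes/record, "
          f"{file_bytes / 1_000_000:.0f} MB, {reduction:.1f}x smaller")
```

With 4-byte ids this gives the issue's 17 bytes per record and a 17 MB file, roughly an 8.8x reduction from 150 MB. With 8-byte ids the same layout comes to 29 bytes per record, or about 5.2x.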
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)