Posted to dev@tinkerpop.apache.org by "Matt Frantz (JIRA)" <ji...@apache.org> on 2015/04/03 17:55:52 UTC

[jira] [Commented] (TINKERPOP3-609) Reduce the memory footprint of Gryo

    [ https://issues.apache.org/jira/browse/TINKERPOP3-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394608#comment-14394608 ] 

Matt Frantz commented on TINKERPOP3-609:
----------------------------------------

Does the 1-byte element label enum size imply a limit on the number of unique labels?  Or would you use variable-width for that one?

If IDs are sequential, then variable-width encoding would provide even better compression.  If they are hashes or UUIDs, then fixed size would be better.
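
To make that trade-off concrete, here is a minimal sketch of one common variable-width scheme (7 payload bits per byte, high bit meaning "more bytes follow"). It is an illustration only, not Gryo's actual encoding; the class and method names (VarLongSketch, encode, decode) are made up for this example. For comparison, a fixed-width Java long always occupies 8 bytes.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration; not part of TinkerPop or Gryo.
class VarLongSketch {

    // Encode a long using 7 bits per byte, least significant group first.
    // The high bit of each byte is set when more bytes follow.
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7; // unsigned shift, so the loop also terminates for negative values
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Decode a byte array produced by encode().
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        // Small sequential ids fit in far fewer than 8 bytes.
        System.out.println(encode(1L).length);          // 1
        System.out.println(encode(1_000_000L).length);  // 3
        // A hash-like value that uses all 64 bits costs more than fixed width.
        System.out.println(encode(-1L).length);         // 10
    }
}
```

With sequential ids up to one million, every id fits in at most 3 bytes. A value that uses all 64 bits, as a hash or half of a UUID typically does, takes 10 bytes here versus 8 for fixed width, which is why fixed size would be better in that case.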

What about adding a header that designates the integer encoding?  Presumably, the header is where you would define the element label enums.
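
As a rough sketch of what such a header could look like: one byte naming the integer encoding, followed by the label table whose positions serve as the enum codes. This layout is purely an assumption for discussion, not an existing Gryo format, and every name in it (HeaderSketch, FIXED, VARIABLE, writeHeader, readLabels) is hypothetical. It also shows where the 1-byte limit bites: a single-byte code can address at most 256 distinct labels.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical header layout; not an existing Gryo structure.
class HeaderSketch {
    static final int FIXED = 0;     // ids written at fixed width
    static final int VARIABLE = 1;  // ids written with a variable-width encoding

    // Layout: [1 byte encoding flag][2 byte label count][labels as UTF strings].
    // A label's position in the list is its 1-byte enum code.
    static byte[] writeHeader(int idEncoding, List<String> labels) {
        if (labels.size() > 256) {
            throw new IllegalArgumentException("a 1-byte code can address at most 256 labels");
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(idEncoding);
            out.writeShort(labels.size());
            for (String label : labels) {
                out.writeUTF(label); // 2-byte length prefix plus the encoded characters
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the label table back; index i in the result is the label for code i.
    static List<String> readLabels(byte[] header) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(header));
            in.readByte(); // skip the encoding flag
            int count = in.readUnsignedShort();
            List<String> labels = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                labels.add(in.readUTF());
            }
            return labels;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] header = writeHeader(VARIABLE, List.of("person", "knows"));
        System.out.println(header.length);
        System.out.println(readLabels(header));
    }
}
```

Each label string is paid for once in the header, and every element afterwards carries only its 1-byte code. A graph with more than 256 labels would need a wider or variable-width code.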

> Reduce the memory footprint of Gryo
> -----------------------------------
>
>                 Key: TINKERPOP3-609
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-609
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 3.0.0.GA
>            Reporter: Marko A. Rodriguez
>            Assignee: stephen mallette
>            Priority: Critical
>
> A 1 million vertex graph with 1 edge each is a 150meg file. That is 150 bytes per vertex/edge.
> If the vertex id is a long that is 4 bytes.
> If the edge id is a long that is 4 bytes.
> The edge should only have ONE id for the otherV of 4 bytes.
> The edge label should be somehow "enum'd" and 1 byte.
> The vertex label should be somehow "enum'd" and 1 byte.
> Add 2-3 bytes for terminators.
> Thus, we should be able to get away with a 17 byte representation (assuming no variable width encodings) and thus, a 17 meg file. That is a near 10x file size reduction. 


