Posted to issues@ignite.apache.org by "Mikhail Lipkovich (JIRA)" <ji...@apache.org> on 2017/09/18 11:07:02 UTC

[jira] [Commented] (IGNITE-6418) Binary: optionally write integer datatypes with varint encoding

    [ https://issues.apache.org/jira/browse/IGNITE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169881#comment-16169881 ] 

Mikhail Lipkovich commented on IGNITE-6418:
-------------------------------------------

Hi Vladimir,
This seems like a doable task for a newcomer. If there are no objections, I would like to work on it.
What about reusing this Varint implementation?
https://github.com/apache/mahout/blob/master/hdfs/src/main/java/org/apache/mahout/math/Varint.java

Regarding the annotation: do you mean annotating the marshalled classes themselves? We could add a new `BinaryWriteMode` for varints, identified in `BinaryUtils#mode(cls)`, but the problem is that users would have no control over annotations on Ignite's internal classes. Or have I misunderstood your suggestion?

> Binary: optionally write integer datatypes with varint encoding
> ---------------------------------------------------------------
>
>                 Key: IGNITE-6418
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6418
>             Project: Ignite
>          Issue Type: Task
>          Components: binary
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>              Labels: iep-2
>
> Currently all integer data types are written as is. {{Integer}} always takes 4 bytes, {{Long}} - 8 bytes, etc.
> There is a well-known technique called "varint encoding" which can compress integer values [1]. When used, {{Integer}} can take 1-5 bytes, {{Long}} - 1-10 bytes. So when values are small enough we can save a lot of space.
> But this technique is not universal, as large encoded values might require more bytes than the plain form. It might also cause slowdowns in the SQL engine. So this approach cannot be applied globally. Instead, we should allow users to control whether they want to use this technique or not.
> One possible approach is to add some annotation and several new methods to {{BinaryWriter}} and {{BinaryReader}}, which will control whether varint is used or not.
> [1] https://developers.google.com/protocol-buffers/docs/encoding#varints
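
For reference, the byte counts quoted above can be reproduced with a minimal sketch of unsigned varint encoding as described in [1]. This is neither the Mahout class linked above nor anything that exists in Ignite; the class name `VarintSketch` and its methods are hypothetical and only illustrate the technique (7 payload bits per byte, high bit set while more bytes follow).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not part of Ignite or Mahout.
public class VarintSketch {
    /** Encodes an int as an unsigned varint (1-5 bytes). */
    static byte[] encodeInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        // While more than 7 significant bits remain, emit 7 bits with the continuation bit set.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7; // unsigned shift, so negative values terminate
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Encodes a long as an unsigned varint (1-10 bytes). */
    static byte[] encodeLong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(10);
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    /** Decodes an int previously produced by encodeInt. */
    static int decodeInt(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte has the continuation bit cleared
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encodeInt(1).length);                 // 1 byte instead of 4
        System.out.println(encodeInt(300).length);               // 2 bytes
        System.out.println(encodeInt(Integer.MAX_VALUE).length); // 5 bytes, one more than the plain form
        System.out.println(encodeLong(-1L).length);              // 10 bytes, two more than the plain form
    }
}
```

The last two calls show the trade-off mentioned in the description: large (or negative) values cost more bytes than the fixed-width form, which is why the encoding would have to be opt-in.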


