Posted to issues@ignite.apache.org by "Aleksandr Polovtcev (Jira)" <ji...@apache.org> on 2021/12/11 11:55:00 UTC

[jira] [Updated] (IGNITE-16105) Replace sorted index binary storage protocol

     [ https://issues.apache.org/jira/browse/IGNITE-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Polovtcev updated IGNITE-16105:
-----------------------------------------
    Description: 
Sorted Index Storage currently uses {{BinaryRow}} as a way to convert column values into byte arrays. This approach is not optimal for the following reasons:

# Data is stored in RocksDB, but we can't use its native lexicographic comparator. Instead, we rely on a custom Java-based comparator that needs to deserialize all columns in order to compare them. This hurts performance, because Java-based comparators are slower and we need to extract all column values;
# Range scans can't use the prefix seek operation from RocksDB, because {{BinaryRow}} serialization is not stable: a serialized prefix of column values is not a prefix of the whole serialized row, because the format depends on the set of columns being serialized;
# {{BinaryRow}} serialization is designed to store versioned row data and is poorly suited to Sorted Index purposes; its API usage looks awkward in this context.

We need to find a new serialization protocol that will (ideally) satisfy the following requirements:

# It should be comparable lexicographically;
# It should support null values;
# It should support variable length columns (though this requirement can probably be dropped);
# It should support both ascending and descending order for individual columns;
# It should support all data types that {{BinaryRow}} uses.
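
To make the requirements above more concrete, here is a minimal sketch of one possible order-preserving encoding. It is not an existing Ignite API and not a proposed final format: the class {{SortedKeyEncoder}}, its methods, the null marker bytes, and the string escaping scheme are all assumptions made for illustration. It covers only nullable {{int}} and {{String}} columns. Integers are written big-endian with the sign bit flipped, strings are written as UTF-8 with a two-byte terminator, a leading marker byte distinguishes null from non-null, and descending columns are stored with every byte inverted.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical order-preserving key encoder (illustration only, not an Ignite API). */
final class SortedKeyEncoder {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** Appends a nullable int column; nulls sort first in ascending order (an assumption). */
    SortedKeyEncoder putInt(Integer value, boolean asc) {
        ByteArrayOutputStream col = new ByteArrayOutputStream();
        if (value == null) {
            col.write(0x00); // null marker
        } else {
            col.write(0x01); // non-null marker
            // Flipping the sign bit makes unsigned byte order match signed int order.
            int v = value ^ Integer.MIN_VALUE;
            col.write(v >>> 24);
            col.write(v >>> 16);
            col.write(v >>> 8);
            col.write(v);
        }
        return append(col.toByteArray(), asc);
    }

    /** Appends a nullable variable-length string column, ordered by its UTF-8 bytes. */
    SortedKeyEncoder putString(String value, boolean asc) {
        ByteArrayOutputStream col = new ByteArrayOutputStream();
        if (value == null) {
            col.write(0x00); // null marker
        } else {
            col.write(0x01); // non-null marker
            for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
                col.write(b);
                if (b == 0) {
                    col.write(0x01); // escape an embedded zero byte as 0x00 0x01
                }
            }
            // Terminator 0x00 0x00 sorts below any continuation of the string.
            col.write(0x00);
            col.write(0x00);
        }
        return append(col.toByteArray(), asc);
    }

    /** Writes the column as is for ascending order, or bitwise inverted for descending order. */
    private SortedKeyEncoder append(byte[] col, boolean asc) {
        for (byte b : col) {
            out.write(asc ? b : ~b);
        }
        return this;
    }

    byte[] toBytes() {
        return out.toByteArray();
    }

    /** Unsigned lexicographic comparison, the same order a bytewise comparator would use. */
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

With an encoding of this kind, the bytes produced for the first columns of a row are a prefix of the bytes produced for the whole row, which is the property a prefix seek depends on. {{Arrays.compareUnsigned}} requires Java 9 or later. Note that strings are ordered by UTF-8 bytes here, which can differ from {{String.compareTo}} for characters outside the Basic Multilingual Plane.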

> Replace sorted index binary storage protocol
> --------------------------------------------
>
>                 Key: IGNITE-16105
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16105
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Aleksandr Polovtcev
>            Priority: Major
>              Labels: ignite-3


