Posted to commits@cassandra.apache.org by "Robert Stupp (JIRA)" <ji...@apache.org> on 2015/09/14 13:04:45 UTC
[jira] [Updated] (CASSANDRA-10314) Update index file format
[ https://issues.apache.org/jira/browse/CASSANDRA-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Stupp updated CASSANDRA-10314:
-------------------------------------
Description:
CASSANDRA-9738 may not make it into 3.0rc1, but an off-heap key-cache is still a goal, so we should change the index file format to meet off-heap requirements (which is why I've set fixver to 3.0rc1).
Off-heap (and mmap'd index files) need the offsets of the individual IndexInfo objects and at least the offset field of the IndexInfo structures.
The format I propose is as follows:
{noformat}
(long) position (vint since 3.0, 64bit before)
(int) serialized size of data that follows (vint since 3.0, 32bit before)
-- following for indexed entries only (so serialized size > 0)
(int) DeletionTime.localDeletionTime (32 bit int)
(long) DeletionTime.markedForDeletionAt (64 bit long)
(int) number of IndexInfo objects (vint since 3.0, 32bit before)
(*) serialized IndexInfo objects, see below
(*) offsets of serialized IndexInfo objects, since version "ma" (3.0)
Each IndexInfo object's offset is relative to the first IndexInfo object.
{noformat}
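To illustrate the entry layout above, here is a minimal sketch that reads the fixed-width (pre-3.0) form of the header from a {{ByteBuffer}}. The class name {{IndexEntryHeader}} and its fields are hypothetical and not taken from the Cassandra code base. The vint form proposed for 3.0 is not covered, and the zero placeholders returned for non-indexed entries are an assumption of this sketch.

```java
import java.nio.ByteBuffer;

// Hypothetical holder for the header fields listed in the layout above.
final class IndexEntryHeader {
    final long position;
    final int serializedSize;
    final int localDeletionTime;
    final long markedForDeletionAt;
    final int indexInfoCount;

    IndexEntryHeader(long position, int serializedSize, int localDeletionTime,
                     long markedForDeletionAt, int indexInfoCount) {
        this.position = position;
        this.serializedSize = serializedSize;
        this.localDeletionTime = localDeletionTime;
        this.markedForDeletionAt = markedForDeletionAt;
        this.indexInfoCount = indexInfoCount;
    }

    // Reads the fixed-width layout: 64-bit position, 32-bit serialized size,
    // then (only if size > 0) 32-bit localDeletionTime, 64-bit
    // markedForDeletionAt and a 32-bit IndexInfo count.
    static IndexEntryHeader read(ByteBuffer in) {
        long position = in.getLong();
        int size = in.getInt();
        if (size == 0) {
            // Non-indexed entry: nothing follows. Zeros are placeholders (assumption).
            return new IndexEntryHeader(position, 0, 0, 0L, 0);
        }
        int localDeletionTime = in.getInt();
        long markedForDeletionAt = in.getLong();
        int count = in.getInt();
        return new IndexEntryHeader(position, size, localDeletionTime, markedForDeletionAt, count);
    }
}
```

After {{read}} returns for an indexed entry, the buffer is positioned at the first serialized IndexInfo object.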
{noformat}
(*) IndexInfo.firstName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
(*) IndexInfo.lastName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
(long) IndexInfo.offset (vint encoded since 3.0, 64bit int before)
(long) IndexInfo.width (vint encoded since 3.0, 64bit int before)
(bool) IndexInfo.endOpenMarker != null (if 3.0)
(int) IndexInfo.endOpenMarker.localDeletionTime (if 3.0 && IndexInfo.endOpenMarker != null)
(long) IndexInfo.endOpenMarker.markedForDeletionAt (if 3.0 && IndexInfo.endOpenMarker != null)
{noformat}
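The reason for the trailing offsets table is that a reader can jump straight to the i-th IndexInfo in an off-heap or mmap'd buffer without deserializing the ones before it. A rough sketch of that lookup follows. The class {{IndexInfoOffsets}}, the helper {{build}}, and the choice of 32-bit ints for the offset entries are assumptions for illustration; the description above does not fix the width of the offset entries.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; IndexInfo payloads are treated as opaque bytes here.
final class IndexInfoOffsets {

    // Returns the absolute buffer position of the i-th IndexInfo.
    // firstInfoPos: position of the first serialized IndexInfo object
    // offsetsPos:   position of the offsets table (assumed to hold 32-bit ints)
    static int positionOf(ByteBuffer buf, int firstInfoPos, int offsetsPos, int i) {
        // Absolute read, so the buffer's own position is left untouched.
        int relative = buf.getInt(offsetsPos + i * Integer.BYTES);
        // Offsets are relative to the first IndexInfo object.
        return firstInfoPos + relative;
    }

    // Builds a demo buffer: headerBytes of padding, the IndexInfo payloads
    // back to back, then one 32-bit relative offset per payload.
    static ByteBuffer build(int headerBytes, byte[][] infos) {
        int payload = 0;
        for (byte[] info : infos) {
            payload += info.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(headerBytes + payload + infos.length * Integer.BYTES);
        buf.position(headerBytes);
        int[] offsets = new int[infos.length];
        int relative = 0;
        for (int i = 0; i < infos.length; i++) {
            offsets[i] = relative;
            buf.put(infos[i]);
            relative += infos[i].length;
        }
        for (int offset : offsets) {
            buf.putInt(offset);
        }
        buf.flip();
        return buf;
    }
}
```

With a binary search over the offsets table, only the IndexInfo objects actually compared against need to be touched.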
Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two options.
* Serialize both of them or
* Serialize only the offset field plus a _last byte offset_ to be able to recalculate the width of the last IndexInfo
The first option is probably the simpler one; the second saves a few bytes per IndexInfo (those of the vint encoded width).
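For the second option, the widths could be recalculated roughly as sketched below. This assumes that the blocks covered by consecutive IndexInfo objects are contiguous and that the _last byte offset_ is the exclusive end of the last block; neither is stated above. The class and method names are hypothetical.

```java
// Hypothetical helper showing how widths fall out of offsets alone.
final class IndexInfoWidths {

    // offsets:        IndexInfo.offset values in serialization order
    // lastByteOffset: end (exclusive, by assumption) of the last block
    static long[] widths(long[] offsets, long lastByteOffset) {
        long[] widths = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            // A block ends where the next one starts; the last block ends
            // at the separately stored last byte offset.
            long end = (i + 1 < offsets.length) ? offsets[i + 1] : lastByteOffset;
            widths[i] = end - offsets[i];
        }
        return widths;
    }
}
```

For example, offsets 0, 100 and 250 with a last byte offset of 400 give widths 100, 150 and 150.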
EDIT: update vint fields (as per CASSANDRA-10232)
was:
CASSANDRA-9738 may not make it into 3.0rc1, but an off-heap key-cache is still a goal, so we should change the index file format to meet off-heap requirements (which is why I've set fixver to 3.0rc1).
Off-heap (and mmap'd index files) need the offsets of the individual IndexInfo objects and at least the offset field of the IndexInfo structures.
The format I propose is as follows:
{noformat}
(long) position (64 bit long)
(int) serialized size of data that follows (32 bit int)
-- following for indexed entries only (so serialized size > 0)
(int) DeletionTime.localDeletionTime (32 bit int)
(long) DeletionTime.markedForDeletionAt (64 bit long)
(int) number of IndexInfo objects (32 bit int)
(*) serialized IndexInfo objects, see below
(*) offsets of serialized IndexInfo objects, since version "ma" (3.0)
Each IndexInfo object's offset is relative to the first IndexInfo object.
{noformat}
{noformat}
(*) IndexInfo.firstName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
(*) IndexInfo.lastName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
(long) IndexInfo.offset (vint encoded since 3.0, 64bit int before)
(long) IndexInfo.width (vint encoded since 3.0, 64bit int before)
(bool) IndexInfo.endOpenMarker != null (if 3.0)
(int) IndexInfo.endOpenMarker.localDeletionTime (if 3.0 && IndexInfo.endOpenMarker != null)
(long) IndexInfo.endOpenMarker.markedForDeletionAt (if 3.0 && IndexInfo.endOpenMarker != null)
{noformat}
Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two options.
* Serialize both of them or
* Serialize only the offset field plus a _last byte offset_ to be able to recalculate the width of the last IndexInfo
The first option is probably the simpler one; the second saves a few bytes per IndexInfo (those of the vint encoded width).
> Update index file format
> ------------------------
>
> Key: CASSANDRA-10314
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10314
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Robert Stupp
> Assignee: Robert Stupp
> Fix For: 3.0.0 rc1
>