Posted to user@cassandra.apache.org by Brice Dutheil <br...@gmail.com> on 2015/08/31 16:37:57 UTC

Re: SSTable structure

For information, this changed in 2.2 (and will probably be removed in 3.0 or
shortly after?)

2.1 →
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/sstable/Descriptor.java#L52-L65
2.2 →
https://github.com/apache/cassandra/blob/cassandra-2.2/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java#L131-L137
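As a rough illustration of how the two-letter version markers quoted below (ja, jb, ka) can gate format features, here is a minimal sketch. It assumes that versions order lexicographically, so a plain string comparison is enough. The class name SSTableVersionFeatures and its field names are hypothetical and are not taken from the Cassandra source; see the links above for the real definitions.

```java
/**
 * Hypothetical helper, not Cassandra's own class: derives feature flags
 * from a two-letter SSTable version marker such as "ja", "jb" or "ka".
 * Assumption: later versions sort after earlier ones as plain strings.
 */
final class SSTableVersionFeatures {
    final String version;
    // ja (2.0.0): super columns are serialized as composites
    final boolean superColumnsAsComposites;
    // jb (2.0.1): adler32 instead of crc32 for compression checksums
    final boolean adler32CompressionChecksums;
    // ka (2.1.0): new Statistics.db file format
    final boolean newStatsFormat;

    SSTableVersionFeatures(String version) {
        if (version == null || !version.matches("[a-z]{2}")) {
            throw new IllegalArgumentException("expected two lowercase letters: " + version);
        }
        this.version = version;
        // A feature is present in the version that introduced it and in every later one.
        this.superColumnsAsComposites = version.compareTo("ja") >= 0;
        this.adler32CompressionChecksums = version.compareTo("jb") >= 0;
        this.newStatsFormat = version.compareTo("ka") >= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"ic", "ja", "jb", "ka"}) {
            SSTableVersionFeatures f = new SSTableVersionFeatures(v);
            System.out.println(v
                    + " composites=" + f.superColumnsAsComposites
                    + " adler32=" + f.adler32CompressionChecksums
                    + " newStats=" + f.newStatsFormat);
        }
    }
}
```

With this scheme, an "ic" file reports none of the three features and a "ka" file reports all of them.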

-- Brice

On Thu, Apr 2, 2015 at 11:11 AM, Serega Sheypak <se...@gmail.com>
wrote:

> Thank you, great to know that.
>
> 2015-04-01 23:14 GMT+02:00 Bharatendra Boddu <bh...@gmail.com>:
>
>> Hi Serega,
>>
>> Most of the content in the blog article is still relevant. After 1.2.5
>> (ic), there are only three new versions (ja, jb, ka) for SSTable format.
>> Following are the changes in these versions.
>>
>>         // ja (2.0.0): super columns are serialized as composites (note that there is no real format change,
>>         //               this is mostly a marker to know if we should expect super columns or not. We do need
>>         //               a major version bump however, because we should not allow streaming of super columns
>>         //               into this new format)
>>         //             tracks max local deletiontime in sstable metadata
>>         //             records bloom_filter_fp_chance in metadata component
>>         //             remove data size and column count from data file (CASSANDRA-4180)
>>         //             tracks max/min column values (according to comparator)
>>         // jb (2.0.1): switch from crc32 to adler32 for compression checksums
>>         //             checksum the compressed data
>>         // ka (2.1.0): new Statistics.db file format
>>         //             index summaries can be downsampled and the sampling level is persisted
>>         //             switch uncompressed checksums to adler32
>>         //             tracks presence of legacy (local and remote) counter shards
>>
>> - bharat
>>
>> On Wed, Apr 1, 2015 at 12:02 AM, Serega Sheypak <serega.sheypak@gmail.com
>> > wrote:
>>
>>> Hi bharat,
>>> you are talking about Cassandra 1.2.5. Does it fit Cassandra 2.1?
>>> Were there any significant changes to SSTable format and layout?
>>> Thank you, article is interesting.
>>>
>>> Hi jacob <ja...@me.com>,
>>> HBase does it for example.
>>> http://hbase.apache.org/book.html#_hfile_format_2
>>> It would be great to give general ideas. It could help to understand
>>> schema design problems. You start to understand better how Cassandra scans
>>> data and how you can utilize its power.
>>>
>>> 2015-04-01 5:39 GMT+02:00 Bharatendra Boddu <bh...@gmail.com>:
>>>
>>>> Some time back I created a blog article about the SSTable storage
>>>> format with some code references.
>>>>
>>>> Cassandra: SSTable Storage Format
>>>> <http://distributeddatastore.blogspot.com/2013/08/cassandra-sstable-storage-format.html>
>>>>
>>>> - bharat
>>>>
>>>> On Mon, Mar 30, 2015 at 5:24 PM, Jacob Rhoden <ja...@me.com>
>>>> wrote:
>>>>
>>>>> Yes, updating code and documentation can sometimes be annoying; you
>>>>> would only ever maintain both if it were important. It comes down to: is
>>>>> having the format of the data files documented for everyone to understand
>>>>> an important thing?
>>>>>
>>>>> ______________________________
>>>>> Sent from iPhone
>>>>>
>>>>> On 31 Mar 2015, at 11:07 am, daemeon reiydelle <da...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Why? Then there are 2 places to maintain or get jira'ed for a
>>>>> discrepancy.
>>>>> On Mar 30, 2015 4:46 PM, "Robert Coli" <rc...@eventbrite.com> wrote:
>>>>>
>>>>>> On Mon, Mar 30, 2015 at 1:38 AM, Pierre <pi...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Does anyone know if there is a more complete and up to date
>>>>>>> documentation about the sstable files structure (data, index, stats etc.)
>>>>>>> than this one : http://wiki.apache.org/cassandra/ArchitectureSSTable
>>>>>>
>>>>>>
>>>>>> No, there isn't. Unfortunately you will have to read the source.
>>>>>>
>>>>>>
>>>>>>> I'm looking for a full specification, with schema of the structure
>>>>>>> if possible.
>>>>>>>
>>>>>>
>>>>>> It would be nice if such fundamental things were documented, wouldn't
>>>>>> it?
>>>>>>
>>>>>> =Rob
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>