Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2015/12/01 10:13:11 UTC

[jira] [Commented] (CASSANDRA-10520) Compressed writer and reader should support non-compressed data.

    [ https://issues.apache.org/jira/browse/CASSANDRA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033376#comment-15033376 ] 

Sylvain Lebresne commented on CASSANDRA-10520:
----------------------------------------------

bq. the patch does affect streaming as the sstable format differs. Is this a complete blocker?

We can't do a change in a minor release that would prevent a new node from streaming (or sending any type of message) to an old node, so if we can't get around that (and it's not immediately clear to me how we could), then it is a blocker. That's also reflected in the sstable versioning: we can use {{mb}} only if the change is essentially backward compatible, which includes things like the addition of a field to a metadata component that is not essential to decoding the sstable (so that old nodes that don't know about the field can safely ignore it), but not a whole lot more.

Long story short, I doubt we can make that change before 4.0, though I haven't looked closely at the patch, so it would be great if you could summarize the changes this actually makes to the sstable format.

> Compressed writer and reader should support non-compressed data.
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-10520
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10520
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>             Fix For: 3.0.x
>
>
> Compressing incompressible data, as happens for instance when writing SSTables during stress tests, results in chunks larger than 64k, which are a problem for the buffer pooling mechanism employed by the {{CompressedRandomAccessReader}}. This causes non-negligible performance issues due to excessive memory allocation.
> To solve this problem, and to avoid decompression delays in cases where compression provides no benefit, I think we should allow compressed files to store uncompressed chunks as an alternative to compressed data. Such a chunk would be written whenever compression returns a buffer larger than, for example, 90% of the input, and would not add any delay on the write path. On reads it could be recognized by its size (using a single global threshold constant in the compression metadata), and the data could be transferred directly into the decompressed buffer, skipping the decompression step and ensuring that a 64k buffer for compressed data always suffices.
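
To make the proposal concrete, here is a minimal sketch of the two paths described above. This is not the actual patch: the {{Compressor}} interface, the method names and the exact 90% threshold are illustrative assumptions.

{code:java}
import java.nio.ByteBuffer;

public class ChunkCompressionSketch
{
    // Hypothetical global threshold: a chunk whose compressed form is at
    // least this fraction of the original is stored uncompressed instead.
    static final double MAX_COMPRESSED_RATIO = 0.9;

    // Illustrative stand-in for the real compressor API.
    interface Compressor
    {
        ByteBuffer compress(ByteBuffer input);
        void uncompress(ByteBuffer input, ByteBuffer output);
    }

    // Write path: compress, but fall back to the raw bytes when compression
    // does not pay off. Returns whichever buffer should be written to disk.
    static ByteBuffer maybeCompress(ByteBuffer chunk, Compressor compressor)
    {
        ByteBuffer compressed = compressor.compress(chunk.duplicate());
        if (compressed.remaining() >= chunk.remaining() * MAX_COMPRESSED_RATIO)
            return chunk; // incompressible: store as-is
        return compressed;
    }

    // Read path: any stored chunk whose size reaches the threshold can only
    // be an uncompressed one, so it is copied straight into the output buffer.
    static void readChunk(ByteBuffer stored, int uncompressedLength,
                          ByteBuffer output, Compressor compressor)
    {
        if (stored.remaining() >= uncompressedLength * MAX_COMPRESSED_RATIO)
            output.put(stored.duplicate()); // direct transfer, no decompression
        else
            compressor.uncompress(stored, output);
    }
}
{code}

Since a chunk that compresses to 90% or more of its original size is never written in compressed form, the size check on the read path is unambiguous, and no stored chunk can exceed the fixed (e.g. 64k) chunk length, which is what keeps the reader's pooled buffers bounded.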



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)