Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2012/06/15 17:26:42 UTC

[jira] [Commented] (CASSANDRA-4304) Add bytes-limit clause to queries

    [ https://issues.apache.org/jira/browse/CASSANDRA-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295722#comment-13295722 ] 

Brandon Williams commented on CASSANDRA-4304:
---------------------------------------------

I think I do like the idea of limiting by bytes instead of count, as CASSANDRA-3911 does.  However, I think that ticket has the right approach in that the operator, not clients, should define that limit: clients would still be able to abuse it and OOM the server, and an OOM is the operator's problem.
                
> Add bytes-limit clause to queries
> ---------------------------------
>
>                 Key: CASSANDRA-4304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4304
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Christian Spriegel
>             Fix For: 1.2
>
>         Attachments: TestImplForSlices.patch
>
>
> The idea is to add a second limit clause to (slice) queries. This would allow easy loading of batches, even if the content is variably sized.
> Imagine the following use case:
> You want to load a batch of XMLs, where each is between 100 bytes and 5 MB in size.
> Currently you can load either
> - a large number of XMLs, but risk OOMs or timeouts
> or
> - a small number of XMLs, and do too many queries where each query usually retrieves very little data.
> With Cassandra being able to limit by size and not just count, we could do a single query which would never OOM but always return a decent amount of data -- with no extra overhead for multiple queries.
> A few thoughts from my side:
> - The limit should be a soft limit, not a hard limit. Therefore it will always return at least one row/column, even if that one is larger than the limit specifies.
> - HintedHandoffManager:303 already computes InMemoryCompactionLimit/averageColumnSize to avoid OOM. It could then simply use the new limit clause :-)
> - A bytes-limit on a range- or indexed-query should always return a complete row.
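
The soft-limit semantics described in the quoted issue can be illustrated with a minimal sketch. This is not the attached patch and not Cassandra code: the function names, the (name, value) column representation, and the choice to stop before the column that would exceed the budget are all assumptions made for illustration. The second helper mirrors the InMemoryCompactionLimit/averageColumnSize estimate mentioned for HintedHandoffManager, again under hypothetical names.

```python
from typing import Iterable, List, Tuple

# Hypothetical column representation: (name, value). Not Cassandra's real type.
Column = Tuple[str, bytes]


def take_with_soft_byte_limit(
    columns: Iterable[Column], max_count: int, max_bytes: int
) -> List[Column]:
    """Collect columns until either the count limit or the byte limit is hit.

    The byte limit is soft: the first column is always returned, even if it
    alone is larger than max_bytes. (Assumption: later columns that would
    push the total over max_bytes are left for the next query.)
    """
    batch: List[Column] = []
    used = 0
    for name, value in columns:
        if len(batch) >= max_count:
            break
        size = len(name.encode("utf-8")) + len(value)
        # Only enforce the byte budget once at least one column is in the batch.
        if batch and used + size > max_bytes:
            break
        batch.append((name, value))
        used += size
    return batch


def page_size_from_average(memory_limit_bytes: int, average_column_size: int) -> int:
    """Estimate a count-based page size from a memory budget and an average
    column size, never returning fewer than one column per page."""
    return max(1, memory_limit_bytes // max(1, average_column_size))
```

With a byte limit in the query itself, a caller would no longer need the average-size estimate: it could request a large count and let the byte budget bound the response, while an oversized single column still comes back on its own.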
