Posted to user@cassandra.apache.org by Даниел Симеонов <ds...@gmail.com> on 2010/05/03 08:33:36 UTC

Re: inserting new rows with one key vs. inserting new columns in a row performance

Hello,
   It seems that I was experiencing network problems (a local pre-installed
firewall) and some REST/HTTP inefficiencies, so I think that it behaves the
same in both cases. I am sorry to have taken up your time.
Best regards, Daniel.

On 30 April 2010 at 20:46, Даниел Симеонов <ds...@gmail.com> wrote:

> Hi,
>    I've checked two similar scenarios, and one of them seems to perform
> better. In both, timestamped data is being appended. The first use case uses
> an OPP, with a new row created every time, each holding only one column
> (there are about 7-8 CFs). The second case uses rows with more columns and
> the RandomPartitioner; every row gets many more than one column appended,
> yet the inserts are distributed relatively uniformly among rows.
> The first scenario is nevertheless faster than the second: the second one
> starts with good response times (about 20-30 ms), and the mean time then
> gradually increases (to about 150-200 ms). What could be the reason?
> Thank you very much!
> Best regards, Daniel.
>
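
For readers following the thread, the two layouts described above can be sketched roughly as follows. This is only an in-memory illustration using plain Python dicts, not Cassandra client code: the event tuples, the "sensor-N" source names, the key format and both function names are assumptions made up for the example, and it says nothing about which layout is faster.

```python
from collections import defaultdict

# Hypothetical events: (unix timestamp, source id, value).
events = [(1272649560 + i, f"sensor-{i % 3}", i * 1.5) for i in range(9)]


def one_row_per_event(events):
    """Scenario 1: every insert creates a new row holding a single column."""
    rows = {}
    for ts, source, value in events:
        # Zero-padded timestamp first, so that sorting the keys as strings
        # follows time order (the property an order-preserving layout relies on).
        row_key = f"{ts:020d}:{source}"
        rows[row_key] = {"value": value}
    return rows


def one_column_per_event(events):
    """Scenario 2: a few long-lived rows, each insert appends one column."""
    rows = defaultdict(dict)
    for ts, source, value in events:
        # The row key is the source; the column name is the timestamp.
        rows[source][ts] = value
    return dict(rows)


narrow = one_row_per_event(events)
wide = one_column_per_event(events)

print(len(narrow), "rows, widest has", max(len(c) for c in narrow.values()), "column(s)")
print(len(wide), "rows, widest has", max(len(c) for c in wide.values()), "column(s)")
```

In the first layout the number of rows grows with every insert while each row stays one column wide; in the second the number of rows stays fixed and each row keeps growing.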

Re: inserting new rows with one key vs. inserting new columns in a row performance

Posted by Sylvain Lebresne <sy...@yakaz.com>.

Make sure you have disabled the row cache. If the row cache is enabled, the
entire row does get loaded into memory. Otherwise it does not.

On Mon, May 3, 2010 at 3:06 PM, malsmith <ma...@treehousesystems.com> wrote:
> I've seen this too (your second case) - it seems like the entire row
> contents (or some big subset of the row) are loaded into memory on the server
> before any column value is returned.  The choice of partitioner did not make
> any difference to performance in my case.  I did not find a way around this
> except to adopt a strategy similar to your first case.

Re: inserting new rows with one key vs. inserting new columns in a row performance

Posted by malsmith <ma...@treehousesystems.com>.

I've seen this too (your second case) - it seems like the entire row
contents (or some big subset of the row) are loaded into memory on the
server before any column value is returned.  The choice of partitioner
did not make any difference to performance in my case.  I did not find a
way around this except to adopt a strategy similar to your first case.


