Posted to user@cassandra.apache.org by Apoorva Gaurav <ap...@myntra.com> on 2014/04/01 06:13:39 UTC

Re: Read performance in map data type

Thanks Robert. Is there a workaround? In our test setups we keep
dropping and recreating tables.


On Mon, Mar 31, 2014 at 11:51 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Mar 28, 2014 at 7:41 PM, Apoorva Gaurav <apoorva.gaurav@myntra.com
> > wrote:
>
>> Yes primary key is (studentID, subjectID). I had dropped the test table,
>> recreating and populating it post which will share the cfhistogram. In such
>> case is there any practical limit on the rows I should fetch, for e.g.
>> should I do
>>
>
> Until this bug is fixed upstream, dropping and recreating a table may
> create unexpected behavior.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5202
>
> =Rob
>
>



-- 
Thanks & Regards,
Apoorva

Re: Read performance in map data type

Posted by Apoorva Gaurav <ap...@myntra.com>.

I've observed that reducing the fetch size gives better latency (isn't
that obvious :-)). I tried fetch sizes from 100 to 10000 and saw a lot of
errors at 10000. I haven't tried modifying the number of columns.

Let me start a new thread focused on fetch size.


On Wed, Apr 2, 2014 at 9:53 AM, Sourabh Agrawal <ii...@gmail.com> wrote:

> From the doc : The fetch size controls how much resulting rows will be
> retrieved simultaneously.
> So, I guess it does not depend on the number of columns as such. As all
> the columns for a key reside on the same node, I think it wouldn't matter
> much whatever be the number of columns as long as we have enough memory in
> the app.
>
> Default value is 5000. (com.datastax.driver.core.QueryOptions)
>
> We use it with the default value. I have never profiled cassandra for read
> load. If you profile it for different fetch sizes, please share the results
> :)
>
>
> On Wed, Apr 2, 2014 at 8:45 AM, Apoorva Gaurav <ap...@myntra.com>wrote:
>
>> Thanks Sourabh,
>>
>> I've modelled my table as "studentID int, subjectID int, marks int,
>> PRIMARY KEY(studentID, subjectID)" as primarily I'll be querying using
>> studentID and sometime using studentID and subjectID.
>>
>> I've tried driver 2.0.0 and its giving good results. Also using its auto
>> paging feature. Any idea what should be a typical value for fetch size. And
>> does the fetch size depends on how many columns are there in the CQL table
>> for e.g. should fetch size in a table like "studentID int, subjectID
>> int, marks1 int, marks2 int, marks3 int.... marksN int PRIMARY
>> KEY(studentID, subjectID)" be less than fetch size in "studentID int,
>> subjectID int, marks int, PRIMARY KEY(studentID, subjectID)"
>>
>>
>> On Wed, Apr 2, 2014 at 2:20 AM, Robert Coli <rc...@eventbrite.com> wrote:
>>
>>>  On Mon, Mar 31, 2014 at 9:13 PM, Apoorva Gaurav <
>>> apoorva.gaurav@myntra.com> wrote:
>>>
>>>> Thanks Robert, Is there a workaround, as in our test setups we keep
>>>> dropping and recreating tables.
>>>>
>>>
>>> Use unique keyspace (or table) names for each test? That's the approach
>>> they're taking in 5202...
>>>
>>> =Rob
>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Apoorva
>>
>
>
>
> --
> Sourabh Agrawal
> Bangalore
> +91 9945657973
>



-- 
Thanks & Regards,
Apoorva

Re: Read performance in map data type

Posted by Sourabh Agrawal <ii...@gmail.com>.

From the doc: The fetch size controls how much resulting rows will be
retrieved simultaneously.
So I guess it does not depend on the number of columns as such. As all the
columns for a key reside on the same node, I think the number of columns
wouldn't matter much, as long as we have enough memory in the app.

The default value is 5000 (see com.datastax.driver.core.QueryOptions).

We use it with the default value. I have never profiled Cassandra for read
load. If you profile it for different fetch sizes, please share the results
:)
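
To make the reasoning above concrete, here is a rough sketch of the
arithmetic, not driver code. It assumes the fetch size is the number of rows
per page, that every column is a 4-byte int as in the tables discussed in
this thread, and it ignores protocol framing and per-object overhead. The
class and method names are made up for illustration. With the 2.0 driver the
value itself would be set per statement (Statement.setFetchSize) or globally
through QueryOptions.

```java
/** Back-of-the-envelope paging arithmetic; hypothetical helper, not part of any driver. */
final class FetchSizeEstimate {

    private FetchSizeEstimate() {}

    /** Number of pages (round trips) needed to read totalRows rows. */
    static long pages(long totalRows, int fetchSize) {
        if (totalRows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and fetchSize > 0");
        }
        // Ceiling division: a partially filled last page still costs a round trip.
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    /** Raw value bytes in one full page, assuming every column is a 4-byte int. */
    static long approxPageBytes(int fetchSize, int intColumns) {
        return (long) fetchSize * intColumns * Integer.BYTES;
    }

    public static void main(String[] args) {
        long totalRows = 100_000;          // assumed result size, for illustration only
        int[] fetchSizes = {100, 1000, 5000, 10000};
        int[] columnCounts = {3, 50};      // narrow table vs. a wide "marks1..marksN" table

        for (int fetchSize : fetchSizes) {
            for (int columns : columnCounts) {
                System.out.printf("fetchSize=%d columns=%d pages=%d bytesPerPage=%d%n",
                        fetchSize, columns,
                        pages(totalRows, fetchSize),
                        approxPageBytes(fetchSize, columns));
            }
        }
    }
}
```

The page count depends only on the number of rows, while the bytes held per
page grow with the number of columns. That is the "enough memory in the app"
caveat: a wider table may justify a smaller fetch size.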


On Wed, Apr 2, 2014 at 8:45 AM, Apoorva Gaurav <ap...@myntra.com> wrote:

> Thanks Sourabh,
>
> I've modelled my table as "studentID int, subjectID int, marks int,
> PRIMARY KEY(studentID, subjectID)" as primarily I'll be querying using
> studentID and sometime using studentID and subjectID.
>
> I've tried driver 2.0.0 and its giving good results. Also using its auto
> paging feature. Any idea what should be a typical value for fetch size. And
> does the fetch size depends on how many columns are there in the CQL table
> for e.g. should fetch size in a table like "studentID int, subjectID int,
> marks1 int, marks2 int, marks3 int.... marksN int PRIMARY KEY(studentID,
> subjectID)" be less than fetch size in "studentID int, subjectID int,
> marks int, PRIMARY KEY(studentID, subjectID)"
>
>
> On Wed, Apr 2, 2014 at 2:20 AM, Robert Coli <rc...@eventbrite.com> wrote:
>
>>  On Mon, Mar 31, 2014 at 9:13 PM, Apoorva Gaurav <
>> apoorva.gaurav@myntra.com> wrote:
>>
>>> Thanks Robert, Is there a workaround, as in our test setups we keep
>>> dropping and recreating tables.
>>>
>>
>> Use unique keyspace (or table) names for each test? That's the approach
>> they're taking in 5202...
>>
>> =Rob
>>
>>
>
>
> --
> Thanks & Regards,
> Apoorva
>



-- 
Sourabh Agrawal
Bangalore
+91 9945657973

Re: Read performance in map data type

Posted by Apoorva Gaurav <ap...@myntra.com>.

Thanks Sourabh,

I've modelled my table as "studentID int, subjectID int, marks int, PRIMARY
KEY(studentID, subjectID)" as primarily I'll be querying using studentID
and sometimes using studentID and subjectID.

I've tried driver 2.0.0 and it's giving good results. I'm also using its
automatic paging feature. Any idea what a typical value for fetch size
should be? And does the fetch size depend on how many columns there are in
the CQL table? For example, should the fetch size for a table like
"studentID int, subjectID int, marks1 int, marks2 int, marks3 int....
marksN int PRIMARY KEY(studentID, subjectID)" be less than the fetch size
for "studentID int, subjectID int, marks int, PRIMARY KEY(studentID,
subjectID)"?

On Wed, Apr 2, 2014 at 2:20 AM, Robert Coli <rc...@eventbrite.com> wrote:

>  On Mon, Mar 31, 2014 at 9:13 PM, Apoorva Gaurav <
> apoorva.gaurav@myntra.com> wrote:
>
>> Thanks Robert, Is there a workaround, as in our test setups we keep
>> dropping and recreating tables.
>>
>
> Use unique keyspace (or table) names for each test? That's the approach
> they're taking in 5202...
>
> =Rob
>
>

-- 
Thanks & Regards,
Apoorva

Re: Read performance in map data type

Posted by Robert Coli <rc...@eventbrite.com>.

On Mon, Mar 31, 2014 at 9:13 PM, Apoorva Gaurav <ap...@myntra.com> wrote:

> Thanks Robert, Is there a workaround, as in our test setups we keep
> dropping and recreating tables.
>

Use unique keyspace (or table) names for each test? That's the approach
they're taking in 5202...

=Rob
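
A minimal sketch of the unique-name approach suggested above, assuming each
test run creates its own keyspace and never reuses a dropped name. The class
name, the "student_marks" table name and the replication settings are
illustrative assumptions; the column layout is the one described in this
thread. The helper only builds CQL strings, which a test would then pass to
whatever session it already uses.

```java
import java.util.UUID;

/** Builds throwaway keyspace names and CQL strings for tests; hypothetical helper. */
final class TestKeyspaces {

    private TestKeyspaces() {}

    /** Returns prefix + "_" + 16 random hex characters, e.g. "perf_" followed by hex. */
    static String uniqueName(String prefix) {
        // Keep the prefix short and identifier-safe so the full name stays
        // well under Cassandra's keyspace name length limit.
        if (!prefix.matches("[a-z][a-z0-9_]{0,15}")) {
            throw new IllegalArgumentException(
                    "prefix must be 1-16 chars of [a-z0-9_] and start with a letter");
        }
        String suffix = UUID.randomUUID().toString().replace("-", "").substring(0, 16);
        return prefix + "_" + suffix;
    }

    /** Single-node replication settings, assumed suitable for a test cluster only. */
    static String createKeyspaceCql(String keyspace) {
        return "CREATE KEYSPACE " + keyspace
                + " WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}";
    }

    /** Table layout taken from the thread; the table name is made up. */
    static String createTableCql(String keyspace) {
        return "CREATE TABLE " + keyspace + ".student_marks ("
                + "studentID int, subjectID int, marks int, "
                + "PRIMARY KEY (studentID, subjectID))";
    }

    static String dropKeyspaceCql(String keyspace) {
        return "DROP KEYSPACE " + keyspace;
    }

    public static void main(String[] args) {
        String keyspace = uniqueName("perf");
        System.out.println(createKeyspaceCql(keyspace));
        System.out.println(createTableCql(keyspace));
        System.out.println(dropKeyspaceCql(keyspace));
    }
}
```

Because every run gets a fresh name, dropping the keyspace at the end of a
test is only cleanup; the next run never recreates a table under a name that
was just dropped.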