Posted to user@cassandra.apache.org by Bill Speirs <bi...@gmail.com> on 2011/02/10 17:00:18 UTC

Super Slow Multi-gets

I have a 7-node setup with a replication factor of 1 and a read
consistency of 1. I have two column families: Messages, which stores
millions of rows keyed by a UUID, and DateIndex, which stores
thousands of rows keyed by a String. I perform 2 look-ups
for my queries:

1) Fetch the row from DateIndex that includes the date I'm looking
for. This returns 1,000 columns whose column names are the UUIDs of
the messages.
2) Do a multi-get (Hector client) using those 1,000 row keys I got
from the first query.

Query 1 takes ~300ms to fetch 1,000 columns from a single row...
respectable. However, query 2 takes over 50s to perform 1,000 row
look-ups! When I scale down to 100 row look-ups for query 2, the
time scales roughly linearly, down to 5s.

Am I doing something wrong here? It seems like taking 5s to look up
100 rows in a distributed hash table is way too slow.

Thoughts?

Bill-
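
For concreteness, a minimal sketch of the two-step lookup described above, using Hector's MultigetSliceQuery. The keyspace handle, serializer choices, column cap, class and method names are illustrative assumptions, not Bill's actual code:

    import java.util.List;
    import java.util.UUID;

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.cassandra.serializers.UUIDSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.Rows;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.MultigetSliceQuery;

    public class MessageLookup {
        // Fetch the Messages rows for the UUID row keys pulled out of the
        // DateIndex row (query 1). One multiget covers all 1,000 keys.
        static Rows<UUID, String, String> fetchMessages(Keyspace keyspace,
                                                        List<UUID> rowKeys) {
            MultigetSliceQuery<UUID, String, String> query =
                HFactory.createMultigetSliceQuery(keyspace, UUIDSerializer.get(),
                    StringSerializer.get(), StringSerializer.get());
            query.setColumnFamily("Messages");
            query.setKeys(rowKeys.toArray(new UUID[rowKeys.size()]));
            query.setRange(null, null, false, 100); // up to 100 columns per row
            return query.execute().get();
        }
    }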

Re: Super Slow Multi-gets

Posted by Aaron Morton <aa...@thelastpickle.com>.
The out of bounds error normally means you have column names that are not valid time UUIDs.

Is that a possibility?

Aaron
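
As a quick way to test Aaron's theory, a hedged sketch: TimeUUIDType's comparator only understands version-1 (time-based) UUIDs, so a random (version-4) UUID written as a column name can later break exactly this kind of comparison. This assumes Hector's TimeUUIDUtils helper for generating valid ones:

    import java.util.UUID;
    import me.prettyprint.cassandra.utils.TimeUUIDUtils;

    public class UuidCheck {
        public static void main(String[] args) {
            // Time-based UUID, safe as a TimeUUIDType column name.
            UUID good = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
            // Random UUID; per Aaron's diagnosis, writing one of these
            // into a TimeUUIDType CF is a likely cause of the error.
            UUID bad = UUID.randomUUID();
            System.out.println(good.version()); // prints 1
            System.out.println(bad.version());  // prints 4
        }
    }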


Re: Super Slow Multi-gets

Posted by Bill Speirs <bi...@gmail.com>.
We attempted a compaction to see if that would improve read
performance (BTW: write performance is fast, as expected!). Here is
the result, an ArrayIndexOutOfBoundsException:

INFO 11:48:41,070 Compacting
[org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/DateIndex-e-7-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/FieldIndex-e-9-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/FieldIndex-e-10-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/Messages-e-13-Data.db')]

ERROR 11:48:41,080 Fatal exception in thread
Thread[CompactionExecutor:1,1,main]
java.lang.ArrayIndexOutOfBoundsException: 7
        at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:58)
        at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
        at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
        at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
        at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:878)
        at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1893)
        at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:218)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:130)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:137)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:312)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Does any of that mean anything to anyone?

Thanks...

Bill-


Re: Super Slow Multi-gets

Posted by Bill Speirs <bi...@gmail.com>.
Sorry, I was setting the file on my client, not the server. I will make this
change and get back to you.

Thanks again for the help...

Bill-


Re: Super Slow Multi-gets

Posted by Bill Speirs <bi...@gmail.com>.
Doesn't seem to help; I just get a bunch of messages that look like this:

DEBUG - Transport open status true for client CassandraClient<devb01:9160-13>
DEBUG - Status of releaseClient CassandraClient<unixdevb01:9160-13> to queue: true
DEBUG - Transport open status true for client CassandraClient<devb01:9160-14>

And I got those before with my other setting...

Bill-


Re: Super Slow Multi-gets

Posted by Utku Can Topçu <ut...@topcu.gen.tr>.
Bill,
It still sounds really strange.

Can you reproduce it? If so, note down the steps; I'm sure people here would be
pleased to try to repeat them.

Regards,
Utku


Re: Super Slow Multi-gets

Posted by Mark Guzman <se...@hasno.info>.
I assume this should be set on all of the servers? Is there anything in particular one would look for in the log results?

Re: Super Slow Multi-gets

Posted by Aaron Morton <aa...@thelastpickle.com>.
Assuming Cassandra 0.7, in log4j-server.properties make it look like this...

log4j.rootLogger=DEBUG,stdout,R


A
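
For context, that rootLogger line lives at the top of conf/log4j-server.properties next to the appender definitions. A sketch of roughly what the whole file looks like in a stock 0.7 install follows; the log file path and layout patterns here are illustrative, and only the first line changes (from INFO to DEBUG):

    log4j.rootLogger=DEBUG,stdout,R

    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n

    log4j.appender.R=org.apache.log4j.RollingFileAppender
    log4j.appender.R.maxFileSize=20MB
    log4j.appender.R.maxBackupIndex=50
    log4j.appender.R.layout=org.apache.log4j.PatternLayout
    log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
    log4j.appender.R.File=/var/log/cassandra/system.log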

Re: Super Slow Multi-gets

Posted by Bill Speirs <bi...@gmail.com>.
I switched my implementation to use a thread pool of 10 threads each
multi-getting 10 keys/rows. This reduces my time from 50s to 5s for
fetching all 1,000 messages.
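
A sketch of that client-side fan-out, with the pool size and batch size taken from the numbers above; the Fetcher interface here is a hypothetical stand-in for whatever Hector multiget call each batch actually runs:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.UUID;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelMultiget {
        /** Stand-in for whatever Hector multiget call fetches one batch. */
        interface Fetcher {
            Object fetchBatch(List<UUID> batch);
        }

        // Fan the keys out as batches of 10 across a 10-thread pool;
        // invokeAll blocks until every batch has been fetched.
        static List<Future<Object>> fetchAll(List<UUID> keys, final Fetcher fetcher)
                throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(10);
            List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
            for (int i = 0; i < keys.size(); i += 10) {
                final List<UUID> batch =
                    keys.subList(i, Math.min(i + 10, keys.size()));
                tasks.add(new Callable<Object>() {
                    public Object call() {
                        return fetcher.fetchBatch(batch);
                    }
                });
            }
            try {
                return pool.invokeAll(tasks);
            } finally {
                pool.shutdown();
            }
        }
    }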

I started looking through the Cassandra source to find where the
parallel requests are actually made, and I believe it's in fetchRows
in org.apache.cassandra.service.StorageProxy.java; is this correct? I
noticed a number of logger.debug calls there. What do I need to set
in my log4j.properties file to see those messages? They would
probably help me determine what is taking so long. Currently my
log4j.properties file looks like this, and I'm not seeing these
messages:

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.SimpleLayout
log4j.category.org.apache=DEBUG, stdout
log4j.category.me.prettyprint=DEBUG, stdout

Thanks...

Bill-



Re: Super Slow Multi-gets

Posted by Bill Speirs <bi...@gmail.com>.
Each message row is well under 1K, so 1,000 of them is only about 1MB. I
don't think it is the network... plus all the boxes are on a fast LAN.

Bill-


Re: Super Slow Multi-gets

Posted by Utku Can Topçu <ut...@topcu.gen.tr>.
Dear Bill,

How about the size of the rows in the Messages CF? Are they too big? Might
you be running into bandwidth overhead?

Regards,
Utku
