Posted to user@cassandra.apache.org by Richard Grossman <ri...@gmail.com> on 2010/01/20 11:44:15 UTC

How to unit test my code calling Cassandra with Thrift

Hi

I want to write some unit tests for code calling Cassandra, so my code of
course uses Thrift.
I've managed to bring the Cassandra daemon up inside my JVM like this:

        StorageService.instance().initServer();

Unfortunately that doesn't start the Thrift interface, so my code can't talk
to the server. Is there any solution?
Perhaps my method is not good.

Thanks

Richard
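
For what it's worth, one possible direction (a sketch only, against the 0.5-era source tree; the class and package names moved between releases, so verify them against your checkout): run the full daemon, which binds the Thrift socket, in a background thread instead of calling initServer() directly.

```java
// Hedged sketch, not verified against a specific release: CassandraDaemon's
// main() performs the storage setup *and* starts the Thrift server, so
// running it in a daemon thread exposes the same interface a real node does.
Thread server = new Thread(new Runnable() {
    public void run() {
        org.apache.cassandra.service.CassandraDaemon.main(new String[0]);
    }
});
server.setDaemon(true);
server.start();
// give the Thrift listener a moment to bind before the tests connect
Thread.sleep(3000);
```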

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
I didn't know you use the actual key instead of its md5 (for the random
partitioner) in KCF.  Good point; I'll watch the KCF hit ratio to determine
whether it needs to be increased.

Thanks,
-Weijun
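
A side note on the 16-byte figure that comes up in the index-size guess below: with the random partitioner, keys are positioned by their MD5 digest, which is always 16 bytes no matter how long the key itself is (the key cache, as noted above, is keyed by the actual key, whose length varies). A small self-contained sketch; the sample key is made up:

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyToken {
    // MD5 of a row key, as the random partitioner uses for placement
    public static byte[] md5(String key) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("MD5").digest(key.getBytes());
    }

    public static void main(String[] args) throws Exception {
        byte[] digest = md5("user:12345");  // hypothetical key
        System.out.println("digest length = " + digest.length + " bytes"); // always 16
        System.out.println("token = " + new BigInteger(1, digest).toString(16));
    }
}
```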

On Tue, Feb 16, 2010 at 5:34 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 7:27 PM, Weijun Li <we...@gmail.com> wrote:
> > Yes my KeysCachedFraction is already 0.3 but it doesn't relief the disk
> i/o.
> > I compacted the data to be a 60GB (took quite a while to finish and it
> > increased latency as expected) one but doesn't help much either.
> >
> > If I set KCF to 1 (meaning to cache all sstable index), how much memory
> will
> > it take for 50mil keys?
>
> 10/3 what 0.3 takes :)
>
> >Is the index a straight key-offset map? I guess key
> > is 16 bytes and offset is 8 bytes.
>
> key length depends on your data, of course.
>
> > Will KCF=1 help to reduce disk i/o?
>
> depends.  w/ trunk you can look at your cache hit rate w/ jconsole to
> see if increasing it more would help.
>
> -Jonathan
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Jonathan Ellis <jb...@gmail.com>.
On Tue, Feb 16, 2010 at 7:27 PM, Weijun Li <we...@gmail.com> wrote:
> Yes my KeysCachedFraction is already 0.3 but it doesn't relief the disk i/o.
> I compacted the data to be a 60GB (took quite a while to finish and it
> increased latency as expected) one but doesn't help much either.
>
> If I set KCF to 1 (meaning to cache all sstable index), how much memory will
> it take for 50mil keys?

10/3 of what 0.3 takes :)

>Is the index a straight key-offset map? I guess key
> is 16 bytes and offset is 8 bytes.

key length depends on your data, of course.

> Will KCF=1 help to reduce disk i/o?

depends.  w/ trunk you can look at your cache hit rate w/ jconsole to
see if increasing it more would help.

-Jonathan

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
Yes, my KeysCachedFraction is already 0.3 but it doesn't relieve the disk i/o.
I compacted the data down to a single 60GB file (it took quite a while to
finish and increased latency as expected) but that doesn't help much either.

If I set KCF to 1 (meaning cache the whole sstable index), how much memory
will it take for 50mil keys? Is the index a straight key-offset map? I guess a
key is 16 bytes and an offset is 8 bytes. Will KCF=1 help to reduce disk i/o?

-Weijun
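
A back-of-envelope version of this estimate, in raw bytes only (the method name is mine, and it deliberately ignores JVM object and map-entry overhead, which in practice adds a large constant per key):

```java
public class IndexCacheEstimate {
    // Raw key+offset bytes cached for a given KeysCachedFraction;
    // not Cassandra's actual in-memory layout, just the arithmetic.
    public static long estimateBytes(long keys, int keyLen, int offsetLen, double fraction) {
        return Math.round(keys * (long) (keyLen + offsetLen) * fraction);
    }

    public static void main(String[] args) {
        long atKcf03 = estimateBytes(50_000_000L, 16, 8, 0.3);
        long atKcf10 = estimateBytes(50_000_000L, 16, 8, 1.0);
        System.out.println("KCF=0.3: " + atKcf03 + " bytes"); // 360000000, ~360 MB
        System.out.println("KCF=1.0: " + atKcf10 + " bytes"); // 1200000000, ~1.2 GB, i.e. 10/3 of the above
    }
}
```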

On Tue, Feb 16, 2010 at 5:18 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> Have you tried increasing KeysCachedFraction?
>
> On Tue, Feb 16, 2010 at 6:15 PM, Weijun Li <we...@gmail.com> wrote:
> > Still have high read latency with 50mil records in the 2-node cluster
> > (replica 2). I restarted both nodes but read latency is still above 60ms
> and
> > disk i/o saturation is high. Tried compact and repair but doesn't help
> much.
> > When I reduced the client threads from 15 to 5 it looks a lot better but
> > throughput is kind of low. I changed using flushing thread of 16 instead
> the
> > defaulted 8, could that cause the disk saturation issue?
> >
> > For benchmark with decent throughput and latency, how many client threads
> do
> > they use? Can anyone share your storage-conf.xml in well-tuned high
> volume
> > cluster?
> >
> > -Weijun
> >
> > On Tue, Feb 16, 2010 at 10:31 AM, Stu Hood <st...@rackspace.com>
> wrote:
> >>
> >> > After I ran "nodeprobe compact" on node B its read latency went up to
> >> > 150ms.
> >> The compaction process can take a while to finish... in 0.5 you need to
> >> watch the logs to figure out when it has actually finished, and then you
> >> should start seeing the improvement in read latency.
> >>
> >> > Is there any way to utilize all of the heap space to decrease the read
> >> > latency?
> >> In 0.5 you can adjust the number of keys that are cached by changing the
> >> 'KeysCachedFraction' parameter in your config file. In 0.6 you can
> >> additionally cache rows. You don't want to use up all of the memory on
> your
> >> box for those caches though: you'll want to leave at least 50% for your
> OS's
> >> disk cache, which will store the full row content.
> >>
> >>
> >> -----Original Message-----
> >> From: "Weijun Li" <we...@gmail.com>
> >> Sent: Tuesday, February 16, 2010 12:16pm
> >> To: cassandra-user@incubator.apache.org
> >> Subject: Re: Cassandra benchmark shows OK throughput but high read
> latency
> >> (> 100ms)?
> >>
> >> Thanks for for DataFileDirectory trick and I'll give a try.
> >>
> >> Just noticed the impact of number of data files: node A has 13 data
> files
> >> with read latency of 20ms and node B has 27 files with read latency of
> >> 60ms.
> >> After I ran "nodeprobe compact" on node B its read latency went up to
> >> 150ms.
> >> The read latency of node A became as low as 10ms. Is this normal
> behavior?
> >> I'm using random partitioner and the hardware/JVM settings are exactly
> the
> >> same for these two nodes.
> >>
> >> Another problem is that Java heap usage is always 900mb out of 6GB? Is
> >> there
> >> any way to utilize all of the heap space to decrease the read latency?
> >>
> >> -Weijun
> >>
> >> On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <dr...@gmail.com>
> >> wrote:
> >>
> >> > On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com>
> wrote:
> >> >
> >> >> One more thoughts about Martin's suggestion: is it possible to put
> the
> >> >> data files into multiple directories that are located in different
> >> >> physical
> >> >> disks? This should help to improve the i/o bottleneck issue.
> >> >>
> >> >>
> >> > Yes, you can already do this, just add more <DataFileDirectory>
> >> > directives
> >> > pointed at multiple drives.
> >> >
> >> >
> >> >> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
> >> >
> >> >
> >> > Row cache and key cache both help tremendously if your read pattern
> has
> >> > a
> >> > decent repeat rate.  Completely random io can only be so fast,
> however.
> >> >
> >> > -Brandon
> >> >
> >>
> >>
> >
> >
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Jonathan Ellis <jb...@gmail.com>.
Have you tried increasing KeysCachedFraction?

On Tue, Feb 16, 2010 at 6:15 PM, Weijun Li <we...@gmail.com> wrote:
> Still have high read latency with 50mil records in the 2-node cluster
> (replica 2). I restarted both nodes but read latency is still above 60ms and
> disk i/o saturation is high. Tried compact and repair but doesn't help much.
> When I reduced the client threads from 15 to 5 it looks a lot better but
> throughput is kind of low. I changed using flushing thread of 16 instead the
> defaulted 8, could that cause the disk saturation issue?
>
> For benchmark with decent throughput and latency, how many client threads do
> they use? Can anyone share your storage-conf.xml in well-tuned high volume
> cluster?
>
> -Weijun
>
> On Tue, Feb 16, 2010 at 10:31 AM, Stu Hood <st...@rackspace.com> wrote:
>>
>> > After I ran "nodeprobe compact" on node B its read latency went up to
>> > 150ms.
>> The compaction process can take a while to finish... in 0.5 you need to
>> watch the logs to figure out when it has actually finished, and then you
>> should start seeing the improvement in read latency.
>>
>> > Is there any way to utilize all of the heap space to decrease the read
>> > latency?
>> In 0.5 you can adjust the number of keys that are cached by changing the
>> 'KeysCachedFraction' parameter in your config file. In 0.6 you can
>> additionally cache rows. You don't want to use up all of the memory on your
>> box for those caches though: you'll want to leave at least 50% for your OS's
>> disk cache, which will store the full row content.
>>
>>
>> -----Original Message-----
>> From: "Weijun Li" <we...@gmail.com>
>> Sent: Tuesday, February 16, 2010 12:16pm
>> To: cassandra-user@incubator.apache.org
>> Subject: Re: Cassandra benchmark shows OK throughput but high read latency
>> (> 100ms)?
>>
>> Thanks for for DataFileDirectory trick and I'll give a try.
>>
>> Just noticed the impact of number of data files: node A has 13 data files
>> with read latency of 20ms and node B has 27 files with read latency of
>> 60ms.
>> After I ran "nodeprobe compact" on node B its read latency went up to
>> 150ms.
>> The read latency of node A became as low as 10ms. Is this normal behavior?
>> I'm using random partitioner and the hardware/JVM settings are exactly the
>> same for these two nodes.
>>
>> Another problem is that Java heap usage is always 900mb out of 6GB? Is
>> there
>> any way to utilize all of the heap space to decrease the read latency?
>>
>> -Weijun
>>
>> On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <dr...@gmail.com>
>> wrote:
>>
>> > On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com> wrote:
>> >
>> >> One more thoughts about Martin's suggestion: is it possible to put the
>> >> data files into multiple directories that are located in different
>> >> physical
>> >> disks? This should help to improve the i/o bottleneck issue.
>> >>
>> >>
>> > Yes, you can already do this, just add more <DataFileDirectory>
>> > directives
>> > pointed at multiple drives.
>> >
>> >
>> >> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
>> >
>> >
>> > Row cache and key cache both help tremendously if your read pattern has
>> > a
>> > decent repeat rate.  Completely random io can only be so fast, however.
>> >
>> > -Brandon
>> >
>>
>>
>
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
Still have high read latency with 50mil records in the 2-node cluster
(replication factor 2). I restarted both nodes but read latency is still above
60ms and disk i/o saturation is high. Tried compact and repair but they don't
help much. When I reduced the client threads from 15 to 5 it looks a lot
better, but throughput is kind of low. I changed to 16 flushing threads
instead of the default 8; could that cause the disk saturation issue?

For benchmarks with decent throughput and latency, how many client threads
are used? Can anyone share a storage-conf.xml from a well-tuned, high-volume
cluster?

-Weijun

On Tue, Feb 16, 2010 at 10:31 AM, Stu Hood <st...@rackspace.com> wrote:

> > After I ran "nodeprobe compact" on node B its read latency went up to
> 150ms.
> The compaction process can take a while to finish... in 0.5 you need to
> watch the logs to figure out when it has actually finished, and then you
> should start seeing the improvement in read latency.
>
> > Is there any way to utilize all of the heap space to decrease the read
> latency?
> In 0.5 you can adjust the number of keys that are cached by changing the
> 'KeysCachedFraction' parameter in your config file. In 0.6 you can
> additionally cache rows. You don't want to use up all of the memory on your
> box for those caches though: you'll want to leave at least 50% for your OS's
> disk cache, which will store the full row content.
>
>
> -----Original Message-----
> From: "Weijun Li" <we...@gmail.com>
> Sent: Tuesday, February 16, 2010 12:16pm
> To: cassandra-user@incubator.apache.org
> Subject: Re: Cassandra benchmark shows OK throughput but high read latency
> (> 100ms)?
>
> Thanks for for DataFileDirectory trick and I'll give a try.
>
> Just noticed the impact of number of data files: node A has 13 data files
> with read latency of 20ms and node B has 27 files with read latency of
> 60ms.
> After I ran "nodeprobe compact" on node B its read latency went up to
> 150ms.
> The read latency of node A became as low as 10ms. Is this normal behavior?
> I'm using random partitioner and the hardware/JVM settings are exactly the
> same for these two nodes.
>
> Another problem is that Java heap usage is always 900mb out of 6GB? Is
> there
> any way to utilize all of the heap space to decrease the read latency?
>
> -Weijun
>
> On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <dr...@gmail.com>
> wrote:
>
> > On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com> wrote:
> >
> >> One more thoughts about Martin's suggestion: is it possible to put the
> >> data files into multiple directories that are located in different
> physical
> >> disks? This should help to improve the i/o bottleneck issue.
> >>
> >>
> > Yes, you can already do this, just add more <DataFileDirectory>
> directives
> > pointed at multiple drives.
> >
> >
> >> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
> >
> >
> > Row cache and key cache both help tremendously if your read pattern has a
> > decent repeat rate.  Completely random io can only be so fast, however.
> >
> > -Brandon
> >
>
>
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Brandon Williams <dr...@gmail.com>.
On Tue, Feb 16, 2010 at 11:50 AM, Weijun Li <we...@gmail.com> wrote:

> Dumped 50mil records into my 2-node cluster overnight, made sure that
> there's not many data files (around 30 only) per Martin's suggestion. The
> size of the data directory is 63GB. Now when I read records from the cluster
> the read latency is still ~44ms, --there's no write happening during the
> read. And iostats shows that the disk (RAID10, 4 250GB 15k SAS) is
> saturated:
>
> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
> avgqu-sz   await  svctm  %util
> sda              47.67    67.67 190.33 17.00 23933.33   677.33   118.70
> 5.24   25.25   4.64  96.17
> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 0.00    0.00   0.00   0.00
> sda2             47.67    67.67 190.33 17.00 23933.33   677.33   118.70
> 5.24   25.25   4.64  96.17
> sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 0.00    0.00   0.00   0.00
>
> CPU usage is low.
>
> Does this mean disk i/o is the bottleneck for my case? Will it help if I
> increase KCF to cache all sstable index?
>
>
That's exactly what this means.  Disk is slow :(


> Also, this is the almost a read-only mode test, and in reality, our
> write/read ratio is close to 1:1 so I'm guessing read latency will even go
> higher in that case because there will be difficult for cassandra to find a
> good moment to compact the data files that are being busy written.
>

Reads that cause disk seeks are always going to slow things down, since disk
seeks are inherently the slowest operation in a machine.  Writes in
Cassandra should always be fast, as they do not cause any disk seeks.

-Brandon

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Stu Hood <st...@rackspace.com>.
> After I ran "nodeprobe compact" on node B its read latency went up to 150ms.
The compaction process can take a while to finish... in 0.5 you need to watch the logs to figure out when it has actually finished, and then you should start seeing the improvement in read latency.

> Is there any way to utilize all of the heap space to decrease the read latency?
In 0.5 you can adjust the number of keys that are cached by changing the 'KeysCachedFraction' parameter in your config file. In 0.6 you can additionally cache rows. You don't want to use up all of the memory on your box for those caches though: you'll want to leave at least 50% for your OS's disk cache, which will store the full row content.


-----Original Message-----
From: "Weijun Li" <we...@gmail.com>
Sent: Tuesday, February 16, 2010 12:16pm
To: cassandra-user@incubator.apache.org
Subject: Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Thanks for for DataFileDirectory trick and I'll give a try.

Just noticed the impact of number of data files: node A has 13 data files
with read latency of 20ms and node B has 27 files with read latency of 60ms.
After I ran "nodeprobe compact" on node B its read latency went up to 150ms.
The read latency of node A became as low as 10ms. Is this normal behavior?
I'm using random partitioner and the hardware/JVM settings are exactly the
same for these two nodes.

Another problem is that Java heap usage is always 900mb out of 6GB? Is there
any way to utilize all of the heap space to decrease the read latency?

-Weijun

On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <dr...@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com> wrote:
>
>> One more thoughts about Martin's suggestion: is it possible to put the
>> data files into multiple directories that are located in different physical
>> disks? This should help to improve the i/o bottleneck issue.
>>
>>
> Yes, you can already do this, just add more <DataFileDirectory> directives
> pointed at multiple drives.
>
>
>> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
>
>
> Row cache and key cache both help tremendously if your read pattern has a
> decent repeat rate.  Completely random io can only be so fast, however.
>
> -Brandon
>



Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Brandon Williams <dr...@gmail.com>.
On Tue, Feb 16, 2010 at 12:16 PM, Weijun Li <we...@gmail.com> wrote:

> Thanks for for DataFileDirectory trick and I'll give a try.
>
> Just noticed the impact of number of data files: node A has 13 data files
> with read latency of 20ms and node B has 27 files with read latency of 60ms.
> After I ran "nodeprobe compact" on node B its read latency went up to 150ms.
> The read latency of node A became as low as 10ms. Is this normal behavior?
> I'm using random partitioner and the hardware/JVM settings are exactly the
> same for these two nodes.
>

It sounds like the latency jumped to 150ms because the newly written file
was not in the OS cache.

> Another problem is that Java heap usage is always 900mb out of 6GB? Is there
> any way to utilize all of the heap space to decrease the read latency?


By default, Cassandra will use a 1GB heap, as set in bin/cassandra.in.sh.
You can adjust the jvm heap there via the -Xmx option, but generally you
want to balance the jvm against the OS cache.  With 6GB, I would probably
give 2GB to the jvm.  If you aren't having issues now, increasing the jvm's
memory probably won't provide any performance gains, though it's worth
noting that with the row cache in 0.6 this may change.

-Brandon
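
Concretely, the heap setting Brandon mentions lives in the JVM_OPTS block of bin/cassandra.in.sh (the exact surrounding flags vary by release; this excerpt is illustrative, not the complete stock file). Bumping -Xmx to 2G per his suggestion would look roughly like:

```sh
# bin/cassandra.in.sh -- illustrative excerpt only
JVM_OPTS="$JVM_OPTS -Xms1G -Xmx2G"
```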

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
Thanks for the DataFileDirectory trick; I'll give it a try.

Just noticed the impact of number of data files: node A has 13 data files
with read latency of 20ms and node B has 27 files with read latency of 60ms.
After I ran "nodeprobe compact" on node B its read latency went up to 150ms.
The read latency of node A became as low as 10ms. Is this normal behavior?
I'm using random partitioner and the hardware/JVM settings are exactly the
same for these two nodes.

Another problem: Java heap usage is always around 900MB out of 6GB. Is there
any way to utilize all of the heap space to decrease the read latency?

-Weijun

On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <dr...@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com> wrote:
>
>> One more thoughts about Martin's suggestion: is it possible to put the
>> data files into multiple directories that are located in different physical
>> disks? This should help to improve the i/o bottleneck issue.
>>
>>
> Yes, you can already do this, just add more <DataFileDirectory> directives
> pointed at multiple drives.
>
>
>> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
>
>
> Row cache and key cache both help tremendously if your read pattern has a
> decent repeat rate.  Completely random io can only be so fast, however.
>
> -Brandon
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Brandon Williams <dr...@gmail.com>.
On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <we...@gmail.com> wrote:

> One more thoughts about Martin's suggestion: is it possible to put the data
> files into multiple directories that are located in different physical
> disks? This should help to improve the i/o bottleneck issue.
>
>
Yes, you can already do this, just add more <DataFileDirectory> directives
pointed at multiple drives.


> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?


Row cache and key cache both help tremendously if your read pattern has a
decent repeat rate.  Completely random io can only be so fast, however.

-Brandon
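
In a 0.5-era storage-conf.xml that would look like the following (the paths are made-up examples; only the element names come from the stock config):

```xml
<DataFileDirectories>
    <!-- one entry per physical disk so sstables spread across spindles -->
    <DataFileDirectory>/mnt/disk1/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/mnt/disk2/cassandra/data</DataFileDirectory>
</DataFileDirectories>
```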

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
One more thought about Martin's suggestion: is it possible to put the data
files into multiple directories located on different physical disks? This
should help with the i/o bottleneck.

Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?

-Weijun

On Tue, Feb 16, 2010 at 9:50 AM, Weijun Li <we...@gmail.com> wrote:

> Dumped 50mil records into my 2-node cluster overnight, made sure that
> there's not many data files (around 30 only) per Martin's suggestion. The
> size of the data directory is 63GB. Now when I read records from the cluster
> the read latency is still ~44ms, --there's no write happening during the
> read. And iostats shows that the disk (RAID10, 4 250GB 15k SAS) is
> saturated:
>
> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
> avgqu-sz   await  svctm  %util
> sda              47.67    67.67 190.33 17.00 23933.33   677.33   118.70
> 5.24   25.25   4.64  96.17
> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 0.00    0.00   0.00   0.00
> sda2             47.67    67.67 190.33 17.00 23933.33   677.33   118.70
> 5.24   25.25   4.64  96.17
> sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 0.00    0.00   0.00   0.00
>
> CPU usage is low.
>
> Does this mean disk i/o is the bottleneck for my case? Will it help if I
> increase KCF to cache all sstable index?
>
> Also, this is the almost a read-only mode test, and in reality, our
> write/read ratio is close to 1:1 so I'm guessing read latency will even go
> higher in that case because there will be difficult for cassandra to find a
> good moment to compact the data files that are being busy written.
>
> Thanks,
> -Weijun
>
>
>
> On Tue, Feb 16, 2010 at 6:06 AM, Brandon Williams <dr...@gmail.com>wrote:
>
>> On Tue, Feb 16, 2010 at 2:32 AM, Dr. Martin Grabmüller <
>> Martin.Grabmueller@eleven.de> wrote:
>>
>>> In my tests I have observed that good read latency depends on keeping
>>> the number of data files low.  In my current test setup, I have stored
>>> 1.9 TB of data on a single node, which is in 21 data files, and read
>>> latency is between 10 and 60ms (for small reads, larger read of course
>>> take more time).  In earlier stages of my test, I had up to 5000
>>> data files, and read performance was quite bad: my configured 10-second
>>> RPC timeout was regularly encountered.
>>>
>>
>> I believe it is known that crossing sstables is O(NlogN) but I'm unable to
>> find the ticket on this at the moment.  Perhaps Stu Hood will jump in and
>> enlighten me, but in any case I believe
>> https://issues.apache.org/jira/browse/CASSANDRA-674 will eventually solve
>> it.
>>
>> Keeping write volume low enough that compaction can keep up is one
>> solution, and throwing hardware at the problem is another, if necessary.
>>  Also, the row caching in trunk (soon to be 0.6 we hope) helps greatly for
>> repeat hits.
>>
>> -Brandon
>>
>
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
Dumped 50mil records into my 2-node cluster overnight and made sure there
aren't many data files (around 30 only), per Martin's suggestion. The size of
the data directory is 63GB. Now when I read records from the cluster the read
latency is still ~44ms, with no writes happening during the read. And iostat
shows that the disk (RAID10, 4x 250GB 15k SAS) is saturated:

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda              47.67    67.67 190.33 17.00 23933.33   677.33   118.70
5.24   25.25   4.64  96.17
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00
0.00    0.00   0.00   0.00
sda2             47.67    67.67 190.33 17.00 23933.33   677.33   118.70
5.24   25.25   4.64  96.17
sda3              0.00     0.00  0.00  0.00     0.00     0.00     0.00
0.00    0.00   0.00   0.00

CPU usage is low.

Does this mean disk i/o is the bottleneck for my case? Will it help if I
increase KCF to cache all sstable index?

Also, this is almost a read-only test; in reality our write/read ratio is
close to 1:1, so I'm guessing read latency will go even higher in that case
because it will be difficult for Cassandra to find a good moment to compact
the data files while they are busy being written.

Thanks,
-Weijun


On Tue, Feb 16, 2010 at 6:06 AM, Brandon Williams <dr...@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 2:32 AM, Dr. Martin Grabmüller <
> Martin.Grabmueller@eleven.de> wrote:
>
>> In my tests I have observed that good read latency depends on keeping
>> the number of data files low.  In my current test setup, I have stored
>> 1.9 TB of data on a single node, which is in 21 data files, and read
>> latency is between 10 and 60ms (for small reads, larger read of course
>> take more time).  In earlier stages of my test, I had up to 5000
>> data files, and read performance was quite bad: my configured 10-second
>> RPC timeout was regularly encountered.
>>
>
> I believe it is known that crossing sstables is O(NlogN) but I'm unable to
> find the ticket on this at the moment.  Perhaps Stu Hood will jump in and
> enlighten me, but in any case I believe
> https://issues.apache.org/jira/browse/CASSANDRA-674 will eventually solve
> it.
>
> Keeping write volume low enough that compaction can keep up is one
> solution, and throwing hardware at the problem is another, if necessary.
>  Also, the row caching in trunk (soon to be 0.6 we hope) helps greatly for
> repeat hits.
>
> -Brandon
>

Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Brandon Williams <dr...@gmail.com>.
On Tue, Feb 16, 2010 at 2:32 AM, Dr. Martin Grabmüller <
Martin.Grabmueller@eleven.de> wrote:

> In my tests I have observed that good read latency depends on keeping
> the number of data files low.  In my current test setup, I have stored
> 1.9 TB of data on a single node, which is in 21 data files, and read
> latency is between 10 and 60ms (for small reads, larger read of course
> take more time).  In earlier stages of my test, I had up to 5000
> data files, and read performance was quite bad: my configured 10-second
> RPC timeout was regularly encountered.
>

I believe it is known that crossing sstables is O(NlogN) but I'm unable to
find the ticket on this at the moment.  Perhaps Stu Hood will jump in and
enlighten me, but in any case I believe
https://issues.apache.org/jira/browse/CASSANDRA-674 will eventually solve
it.

Keeping write volume low enough that compaction can keep up is one solution,
and throwing hardware at the problem is another, if necessary.  Also, the
row caching in trunk (soon to be 0.6 we hope) helps greatly for repeat hits.

-Brandon

RE: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by "Dr. Martin Grabmüller" <Ma...@eleven.de>.
> The other problem is: if I keep mixed write and read (e.g, 8 
> write threads
> plus 7 read threads) against the 2-nodes cluster 
> continuously, the read
> latency will go up gradually (along with the size of 
> Cassandra data file),
> and at the end it will become ~40ms (up from ~20ms) even with only 15
> threads. During this process the data file grew from 1.6GB to 
> over 3GB even
> if I kept writing the same key/values to Cassandra. It seems 
> that Cassandra
> keeps appending to sstable data files and will only clean up 
> them during
> node cleanup or compact (please correct me if this is incorrect). 

In my tests I have observed that good read latency depends on keeping
the number of data files low.  In my current test setup, I have stored
1.9 TB of data on a single node, which is in 21 data files, and read
latency is between 10 and 60ms (for small reads; larger reads of course
take more time).  In earlier stages of my test, I had up to 5000
data files, and read performance was quite bad: my configured 10-second
RPC timeout was regularly encountered.

The number of data files is reduced whenever Cassandra compacts them,
which is either automatically, when enough datafiles are generated by
continuous writing, or when triggered by nodeprobe compact, cleanup etc.

So my advice is to keep the write throughput low enough so that Cassandra
can keep up compacting the data files.  For high write throughput, you need
fast drives, if possible on different RAIDs, which are configured as
different DataDirectories for Cassandra.  On my setup (6 drives in a single
RAID-5 configuration), compaction is quite slow: sequential reads/writes
are done at 150 MB/s, whereas during compaction, read/write-performance
drops to a few MB/s.  You definitely want more than one logical drive,
so that Cassandra can alternate between them when flushing memtables and
when compacting.

I would really be interested whether my observations are shared by other
people on this list.

Thanks!

Martin

RE: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
It seems that read latency is sensitive to the number of threads (or thrift
clients): after reducing the number of threads to 15, read latency decreased
to ~20ms.

The other problem is: if I keep mixed writes and reads (e.g., 8 write threads
plus 7 read threads) running against the 2-node cluster continuously, the
read latency goes up gradually (along with the size of the Cassandra data
files), and at the end it becomes ~40ms (up from ~20ms) even with only 15
threads. During this process the data files grew from 1.6GB to over 3GB even
though I kept writing the same key/values to Cassandra. It seems that
Cassandra keeps appending to sstable data files and will only clean them up
during node cleanup or compaction (please correct me if this is incorrect).
 
Here are my test settings:

JVM xmx: 6GB
KCF: 0.3
Memtable: 512MB.
Number of records: 1 million (payload is 1000 bytes)

I used JMX and iostat to watch the cluster but can't find any clue to the
increasing read latency issue: JVM memory, GC, CPU usage, tpstats and i/o
saturation all seem to be clean. One exception is that the wait time in
iostat spikes once in a while but is a small number most of the time.
Another thing I noticed is that the JVM doesn't use more than 1GB of memory
(out of the 6GB I specified) even though I set KCF to 0.3 and increased the
memtable size to 512MB.

Did I miss anything here? How can I diagnose this kind of increasing read
latency issue? Is there any performance tuning guide available?

Thanks,
-Weijun


-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Sunday, February 14, 2010 6:22 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Cassandra benchmark shows OK throughput but high read latency
(> 100ms)?

are you i/o bound?  what is your on-disk data set size?  what does
iostats tell you?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html

do you have a lot of pending compactions?  (tpstats will tell you)

have you increased KeysCachedFraction?

On Sun, Feb 14, 2010 at 8:18 PM, Weijun Li <we...@gmail.com> wrote:
> Hello,
>
>
>
> I saw some Cassandra benchmark reports mentioning read latency that is
less
> than 50ms or even 30ms. But my benchmark with 0.5 doesn't seem to support
> that. Here's my settings:
>
>
>
> Nodes: 2 machines. 2x2.5GHZ Xeon Quad Core (thus 8 cores), 8GB RAM
>
> ReplicationFactor=2 Partitioner=Random
>
> JVM Xmx: 4GB
>
> Memory table size: 512MB (haven't figured out how to enable binary
memtable
> so I set both memtable number to 512mb)
>
> Flushing threads: 2-4
>
> Payload: ~1000 bytes, 3 columns in one CF.
>
> Read/write time measure: get startTime right before each Java thrift call,
> transport objects are pre-created upon creation of each thread.
>
>
>
> The result shows that total write throughput is around 2000/sec (for 2
nodes
> in the cluster) which is not bad, and read throughput is just around
> 750/sec. However for each thread the average read latency is more than
> 100ms. I'm running 100 threads for the testing and each thread randomly
pick
> a node for thrift call. So the read/sec of each thread is just around 7.5,
> meaning duration of each thrift call is 1000/7.5=133ms. Without
replication
> the cluster write throughput is around 3300/s, and read throughput is
around
> 1400/s, so the read latency is still around 70ms without replication.
>
>
>
> Is there anything wrong in my benchmark test? How can I achieve a
reasonable
> read latency (< 30ms)?
>
>
>
> Thanks,
>
> -Weijun
>
>
>
>


Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Jonathan Ellis <jb...@gmail.com>.
are you i/o bound?  what is your on-disk data set size?  what does
iostats tell you?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html

do you have a lot of pending compactions?  (tpstats will tell you)

have you increased KeysCachedFraction?

On Sun, Feb 14, 2010 at 8:18 PM, Weijun Li <we...@gmail.com> wrote:
> Hello,
>
>
>
> I saw some Cassandra benchmark reports mentioning read latency that is less
> than 50ms or even 30ms. But my benchmark with 0.5 doesn’t seem to support
> that. Here’s my settings:
>
>
>
> Nodes: 2 machines. 2x2.5GHZ Xeon Quad Core (thus 8 cores), 8GB RAM
>
> ReplicationFactor=2 Partitioner=Random
>
> JVM Xmx: 4GB
>
> Memory table size: 512MB (haven’t figured out how to enable binary memtable
> so I set both memtable number to 512mb)
>
> Flushing threads: 2-4
>
> Payload: ~1000 bytes, 3 columns in one CF.
>
> Read/write time measure: get startTime right before each Java thrift call,
> transport objects are pre-created upon creation of each thread.
>
>
>
> The result shows that total write throughput is around 2000/sec (for 2 nodes
> in the cluster) which is not bad, and read throughput is just around
> 750/sec. However for each thread the average read latency is more than
> 100ms. I’m running 100 threads for the testing and each thread randomly pick
> a node for thrift call. So the read/sec of each thread is just around 7.5,
> meaning duration of each thrift call is 1000/7.5=133ms. Without replication
> the cluster write throughput is around 3300/s, and read throughput is around
> 1400/s, so the read latency is still around 70ms without replication.
>
>
>
> Is there anything wrong in my benchmark test? How can I achieve a reasonable
> read latency (< 30ms)?
>
>
>
> Thanks,
>
> -Weijun
>
>
>
>

Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

Posted by Weijun Li <we...@gmail.com>.
Hello,

 

I saw some Cassandra benchmark reports mentioning read latency that is less than 50ms or even 30ms. But my benchmark with 0.5 doesn’t seem to support that. Here’s my settings:

 

Nodes: 2 machines. 2x2.5GHZ Xeon Quad Core (thus 8 cores), 8GB RAM

ReplicationFactor=2 Partitioner=Random

JVM Xmx: 4GB

Memory table size: 512MB (haven’t figured out how to enable binary memtable so I set both memtable number to 512mb)

Flushing threads: 2-4

Payload: ~1000 bytes, 3 columns in one CF.

Read/write time measure: get startTime right before each Java thrift call, transport objects are pre-created upon creation of each thread.

 

The result shows that total write throughput is around 2000/sec (for 2 nodes in the cluster) which is not bad, and read throughput is just around 750/sec. However for each thread the average read latency is more than 100ms. I’m running 100 threads for the testing and each thread randomly pick a node for thrift call. So the read/sec of each thread is just around 7.5, meaning duration of each thrift call is 1000/7.5=133ms. Without replication the cluster write throughput is around 3300/s, and read throughput is around 1400/s, so the read latency is still around 70ms without replication.

 

Is there anything wrong in my benchmark test? How can I achieve a reasonable read latency (< 30ms)?

 

Thanks,

-Weijun

 

 


Re: How to unit test my code calling Cassandra with Thrift

Posted by Ran Tavory <ra...@gmail.com>.
I've committed to trunk all the required code and posted about it, hope you
find it useful
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/


On Sun, Jan 24, 2010 at 12:20 PM, Richard Grossman <ri...@gmail.com>wrote:

> Great Ran,
>
> I think I've missed the .setDaemon to keep the server alive.
> Thanks
>
> Richard
>
> On Sun, Jan 24, 2010 at 12:02 PM, Ran Tavory <ra...@gmail.com> wrote:
>
>> Here's the code I've just written over the weekend and started using in
>> test:
>>
>>
>> package com.outbrain.data.cassandra.service;
>>
>> import java.io.File;
>> import java.io.FileOutputStream;
>> import java.io.IOException;
>> import java.io.InputStream;
>> import java.io.OutputStream;
>>
>> import org.apache.cassandra.config.DatabaseDescriptor;
>> import org.apache.cassandra.service.CassandraDaemon;
>> import org.apache.cassandra.utils.FileUtils;
>> import org.apache.thrift.transport.TTransportException;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>>
>> /**
>>  * An in-memory cassandra storage service that listens to the thrift
>> interface.
>>  * Useful for unit testing,
>>  *
>>  * @author Ran Tavory (ran@outbain.com)
>>  *
>>  */
>> public class InProcessCassandraServer implements Runnable {
>>
>>   private static final Logger log =
>> LoggerFactory.getLogger(InProcessCassandraServer.class);
>>
>>   CassandraDaemon cassandraDaemon;
>>
>>   public void init() {
>>     try {
>>       prepare();
>>     } catch (IOException e) {
>>       log.error("Cannot prepare cassandra.", e);
>>     }
>>     try {
>>       cassandraDaemon = new CassandraDaemon();
>>       cassandraDaemon.init(null);
>>     } catch (TTransportException e) {
>>       log.error("TTransportException", e);
>>     } catch (IOException e) {
>>       log.error("IOException", e);
>>     }
>>   }
>>
>>   @Override
>>   public void run() {
>>     cassandraDaemon.start();
>>   }
>>
>>   public void stop() {
>>     cassandraDaemon.stop();
>>     rmdir("tmp");
>>   }
>>
>>
>>   /**
>>    * Creates all files and directories needed
>>    * @throws IOException
>>    */
>>   private void prepare() throws IOException {
>>     // delete tmp dir first
>>     rmdir("tmp");
>>     // make a tmp dir and copy storage-conf.xml and log4j.properties to it
>>     copy("/cassandra/storage-conf.xml", "tmp");
>>     copy("/cassandra/log4j.properties", "tmp");
>>     System.setProperty("storage-config", "tmp");
>>
>>     // make cassandra directories.
>>     for (String s: DatabaseDescriptor.getAllDataFileLocations()) {
>>       mkdir(s);
>>     }
>>     mkdir(DatabaseDescriptor.getBootstrapFileLocation());
>>     mkdir(DatabaseDescriptor.getLogFileLocation());
>>   }
>>
>>   /**
>>    * Copies a resource from within the jar to a directory.
>>    *
>>    * @param resourceName
>>    * @param directory
>>    * @throws IOException
>>    */
>>   private void copy(String resource, String directory) throws IOException
>> {
>>     mkdir(directory);
>>     InputStream is = getClass().getResourceAsStream(resource);
>>     String fileName = resource.substring(resource.lastIndexOf("/") + 1);
>>     File file = new File(directory + System.getProperty("file.separator")
>> + fileName);
>>     OutputStream out = new FileOutputStream(file);
>>     byte buf[] = new byte[1024];
>>     int len;
>>     while ((len = is.read(buf)) > 0) {
>>       out.write(buf, 0, len);
>>     }
>>     out.close();
>>     is.close();
>>   }
>>
>>   /**
>>    * Creates a directory
>>    * @param dir
>>    * @throws IOException
>>    */
>>   private void mkdir(String dir) throws IOException {
>>     FileUtils.createDirectory(dir);
>>   }
>>
>>   /**
>>    * Removes a directory from file system
>>    * @param dir
>>    */
>>   private void rmdir(String dir) {
>>     FileUtils.deleteDir(new File(dir));
>>   }
>> }
>>
>>
>> And in the test class:
>>
>> public class XxxTest {
>>
>>   private static InProcessCassandraServer cassandra;
>>
>>   @BeforeClass
>>   public static void setup() throws TTransportException, IOException,
>> InterruptedException {
>>     cassandra = new InProcessCassandraServer();
>>     cassandra.init();
>>     Thread t = new Thread(cassandra);
>>     t.setDaemon(true);
>>     t.start();
>>   }
>>
>>   @AfterClass
>>   public static void shutdown() {
>>     cassandra.stop();
>>   }
>> ... test
>> }
>>
>> Now you can connect to localhost:9160.
>>
>> Assumptions:
>> The code assumes you have two files in your classpath:
>> /cassandra/storage-conf.xml and /cassandra/log4j.properties. This is
>> convenient if you use maven; just throw them at /src/test/resources/cassandra/
>> If you don't work with maven or would like to configure the configuration
>> files differently it should be fairly easy, just change the prepare()
>> method.
>>
>>
>>
>> On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman <ri...@gmail.com>wrote:
>>
>>> So, is there anybody? Unit testing is important, people...
>>> Thanks
>>>
>>>
>>> On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman <ri...@gmail.com>wrote:
>>>
>>>> Here is the code I use
>>>>     class startServer implements Runnable {
>>>>
>>>>         @Override
>>>>         public void run() {
>>>>             try {
>>>>                 CassandraDaemon cassandraDaemon = new CassandraDaemon();
>>>>                 cassandraDaemon.init(null);
>>>>                 cassandraDaemon.start();
>>>>             } catch (TTransportException e) {
>>>>                 // TODO Auto-generated catch block
>>>>                 e.printStackTrace();
>>>>             } catch (IOException e) {
>>>>                 // TODO Auto-generated catch block
>>>>                 e.printStackTrace();
>>>>             }
>>>>         }
>>>>     }
>>>>
>>>>         Thread thread = new Thread(new startServer());
>>>>         thread.start();
>>>>
>>>> <the code to test here>
>>>>
>>>>
>>>>
>>>> On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <richiesgr@gmail.com
>>>> > wrote:
>>>>
>>>>> Yes, I've seen this and checked it, but if I start the server it blocks
>>>>> the current thread, so I can't continue the test in sequence.
>>>>> I've also tried starting it in a separate thread, but no luck there
>>>>> either: the server shuts down even before I reach the code under test.
>>>>>
>>>>> If you have a trick to start the server in the JVM, thanks
>>>>>
>>>>> Richard
>>>>>
>>>>> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com>wrote:
>>>>>
>>>>>> did you look at CassandraDaemon?
>>>>>>
>>>>>>
>>>>
>>>
>>
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by Ran Tavory <ra...@gmail.com>.
Sure thing.
Actually I wasn't sure about daemon or not. I think it should be a
daemon b/c I don't want the app to get stuck when the test ends on our CI
server. But while the test is running, it doesn't matter afaik whether
it's a daemon or not.
setDaemon means that once this is the last thread left alive, the JVM won't
hang around for it and will just quit.
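A quick illustration of that behavior (nothing Cassandra-specific):
setDaemon(true) must be called before start(), and the JVM exits when only
daemon threads remain, so main() below returns without waiting for the
sleeper.

```java
public class DaemonThreadDemo {
    public static void main(String[] args) {
        Thread sleeper = new Thread(new Runnable() {
            public void run() {
                // would keep a non-daemon JVM alive for a minute
                try { Thread.sleep(60_000); } catch (InterruptedException ignored) {}
            }
        });
        sleeper.setDaemon(true); // must precede start()
        sleeper.start();
        System.out.println("daemon=" + sleeper.isDaemon());
        // main ends here; the JVM does not wait for the daemon sleeper
    }
}
```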

On Sun, Jan 24, 2010 at 12:20 PM, Richard Grossman <ri...@gmail.com>wrote:

> Great Ran,
>
> I think I've missed the .setDaemon to keep the server alive.
> Thanks
>
> Richard
>
> On Sun, Jan 24, 2010 at 12:02 PM, Ran Tavory <ra...@gmail.com> wrote:
>
>> Here's the code I've just written over the weekend and started using in
>> test:
>>
>>
>> package com.outbrain.data.cassandra.service;
>>
>> import java.io.File;
>> import java.io.FileOutputStream;
>> import java.io.IOException;
>> import java.io.InputStream;
>> import java.io.OutputStream;
>>
>> import org.apache.cassandra.config.DatabaseDescriptor;
>> import org.apache.cassandra.service.CassandraDaemon;
>> import org.apache.cassandra.utils.FileUtils;
>> import org.apache.thrift.transport.TTransportException;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>>
>> /**
>>  * An in-memory cassandra storage service that listens to the thrift
>> interface.
>>  * Useful for unit testing,
>>  *
>>  * @author Ran Tavory (ran@outbain.com)
>>  *
>>  */
>> public class InProcessCassandraServer implements Runnable {
>>
>>   private static final Logger log =
>> LoggerFactory.getLogger(InProcessCassandraServer.class);
>>
>>   CassandraDaemon cassandraDaemon;
>>
>>   public void init() {
>>     try {
>>       prepare();
>>     } catch (IOException e) {
>>       log.error("Cannot prepare cassandra.", e);
>>     }
>>     try {
>>       cassandraDaemon = new CassandraDaemon();
>>       cassandraDaemon.init(null);
>>     } catch (TTransportException e) {
>>       log.error("TTransportException", e);
>>     } catch (IOException e) {
>>       log.error("IOException", e);
>>     }
>>   }
>>
>>   @Override
>>   public void run() {
>>     cassandraDaemon.start();
>>   }
>>
>>   public void stop() {
>>     cassandraDaemon.stop();
>>     rmdir("tmp");
>>   }
>>
>>
>>   /**
>>    * Creates all files and directories needed
>>    * @throws IOException
>>    */
>>   private void prepare() throws IOException {
>>     // delete tmp dir first
>>     rmdir("tmp");
>>     // make a tmp dir and copy storage-conf.xml and log4j.properties to it
>>     copy("/cassandra/storage-conf.xml", "tmp");
>>     copy("/cassandra/log4j.properties", "tmp");
>>     System.setProperty("storage-config", "tmp");
>>
>>     // make cassandra directories.
>>     for (String s: DatabaseDescriptor.getAllDataFileLocations()) {
>>       mkdir(s);
>>     }
>>     mkdir(DatabaseDescriptor.getBootstrapFileLocation());
>>     mkdir(DatabaseDescriptor.getLogFileLocation());
>>   }
>>
>>   /**
>>    * Copies a resource from within the jar to a directory.
>>    *
>>    * @param resourceName
>>    * @param directory
>>    * @throws IOException
>>    */
>>   private void copy(String resource, String directory) throws IOException
>> {
>>     mkdir(directory);
>>     InputStream is = getClass().getResourceAsStream(resource);
>>     String fileName = resource.substring(resource.lastIndexOf("/") + 1);
>>     File file = new File(directory + System.getProperty("file.separator")
>> + fileName);
>>     OutputStream out = new FileOutputStream(file);
>>     byte buf[] = new byte[1024];
>>     int len;
>>     while ((len = is.read(buf)) > 0) {
>>       out.write(buf, 0, len);
>>     }
>>     out.close();
>>     is.close();
>>   }
>>
>>   /**
>>    * Creates a directory
>>    * @param dir
>>    * @throws IOException
>>    */
>>   private void mkdir(String dir) throws IOException {
>>     FileUtils.createDirectory(dir);
>>   }
>>
>>   /**
>>    * Removes a directory from file system
>>    * @param dir
>>    */
>>   private void rmdir(String dir) {
>>     FileUtils.deleteDir(new File(dir));
>>   }
>> }
>>
>>
>> And in the test class:
>>
>> public class XxxTest {
>>
>>   private static InProcessCassandraServer cassandra;
>>
>>   @BeforeClass
>>   public static void setup() throws TTransportException, IOException,
>> InterruptedException {
>>     cassandra = new InProcessCassandraServer();
>>     cassandra.init();
>>     Thread t = new Thread(cassandra);
>>     t.setDaemon(true);
>>     t.start();
>>   }
>>
>>   @AfterClass
>>   public static void shutdown() {
>>     cassandra.stop();
>>   }
>> ... test
>> }
>>
>> Now you can connect to localhost:9160.
>>
>> Assumptions:
>> The code assumes you have two files in your classpath:
>> /cassandra/storage-conf.xml and /cassandra/log4j.properties. This is
>> convenient if you use maven; just throw them at /src/test/resources/cassandra/
>> If you don't work with maven or would like to configure the configuration
>> files differently it should be fairly easy, just change the prepare()
>> method.
>>
>>
>>
>> On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman <ri...@gmail.com>wrote:
>>
>>> So, is there anybody? Unit testing is important, people...
>>> Thanks
>>>
>>>
>>> On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman <ri...@gmail.com>wrote:
>>>
>>>> Here is the code I use
>>>>     class startServer implements Runnable {
>>>>
>>>>         @Override
>>>>         public void run() {
>>>>             try {
>>>>                 CassandraDaemon cassandraDaemon = new CassandraDaemon();
>>>>                 cassandraDaemon.init(null);
>>>>                 cassandraDaemon.start();
>>>>             } catch (TTransportException e) {
>>>>                 // TODO Auto-generated catch block
>>>>                 e.printStackTrace();
>>>>             } catch (IOException e) {
>>>>                 // TODO Auto-generated catch block
>>>>                 e.printStackTrace();
>>>>             }
>>>>         }
>>>>     }
>>>>
>>>>         Thread thread = new Thread(new startServer());
>>>>         thread.start();
>>>>
>>>> <the code to test here>
>>>>
>>>>
>>>>
>>>> On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <richiesgr@gmail.com
>>>> > wrote:
>>>>
>>>>> Yes, I've seen this and checked it, but if I start the server it blocks
>>>>> the current thread, so I can't continue the test in sequence.
>>>>> I've also tried starting it in a separate thread, but no luck there
>>>>> either: the server shuts down even before I reach the code under test.
>>>>>
>>>>> If you have a trick to start the server in the JVM, thanks
>>>>>
>>>>> Richard
>>>>>
>>>>> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com>wrote:
>>>>>
>>>>>> did you look at CassandraDaemon?
>>>>>>
>>>>>>
>>>>
>>>
>>
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by Richard Grossman <ri...@gmail.com>.
Great Ran,

I think I've missed the .setDaemon to keep the server alive.
Thanks

Richard

On Sun, Jan 24, 2010 at 12:02 PM, Ran Tavory <ra...@gmail.com> wrote:

> Here's the code I've just written over the weekend and started using in
> test:
>
>
> package com.outbrain.data.cassandra.service;
>
> import java.io.File;
> import java.io.FileOutputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
>
> import org.apache.cassandra.config.DatabaseDescriptor;
> import org.apache.cassandra.service.CassandraDaemon;
> import org.apache.cassandra.utils.FileUtils;
> import org.apache.thrift.transport.TTransportException;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> /**
>  * An in-memory cassandra storage service that listens to the thrift
> interface.
>  * Useful for unit testing,
>  *
>  * @author Ran Tavory (ran@outbain.com)
>  *
>  */
> public class InProcessCassandraServer implements Runnable {
>
>   private static final Logger log =
> LoggerFactory.getLogger(InProcessCassandraServer.class);
>
>   CassandraDaemon cassandraDaemon;
>
>   public void init() {
>     try {
>       prepare();
>     } catch (IOException e) {
>       log.error("Cannot prepare cassandra.", e);
>     }
>     try {
>       cassandraDaemon = new CassandraDaemon();
>       cassandraDaemon.init(null);
>     } catch (TTransportException e) {
>       log.error("TTransportException", e);
>     } catch (IOException e) {
>       log.error("IOException", e);
>     }
>   }
>
>   @Override
>   public void run() {
>     cassandraDaemon.start();
>   }
>
>   public void stop() {
>     cassandraDaemon.stop();
>     rmdir("tmp");
>   }
>
>
>   /**
>    * Creates all files and directories needed
>    * @throws IOException
>    */
>   private void prepare() throws IOException {
>     // delete tmp dir first
>     rmdir("tmp");
>     // make a tmp dir and copy storage-conf.xml and log4j.properties to it
>     copy("/cassandra/storage-conf.xml", "tmp");
>     copy("/cassandra/log4j.properties", "tmp");
>     System.setProperty("storage-config", "tmp");
>
>     // make cassandra directories.
>     for (String s: DatabaseDescriptor.getAllDataFileLocations()) {
>       mkdir(s);
>     }
>     mkdir(DatabaseDescriptor.getBootstrapFileLocation());
>     mkdir(DatabaseDescriptor.getLogFileLocation());
>   }
>
>   /**
>    * Copies a resource from within the jar to a directory.
>    *
>    * @param resourceName
>    * @param directory
>    * @throws IOException
>    */
>   private void copy(String resource, String directory) throws IOException {
>     mkdir(directory);
>     InputStream is = getClass().getResourceAsStream(resource);
>     String fileName = resource.substring(resource.lastIndexOf("/") + 1);
>     File file = new File(directory + System.getProperty("file.separator") +
> fileName);
>     OutputStream out = new FileOutputStream(file);
>     byte buf[] = new byte[1024];
>     int len;
>     while ((len = is.read(buf)) > 0) {
>       out.write(buf, 0, len);
>     }
>     out.close();
>     is.close();
>   }
>
>   /**
>    * Creates a directory
>    * @param dir
>    * @throws IOException
>    */
>   private void mkdir(String dir) throws IOException {
>     FileUtils.createDirectory(dir);
>   }
>
>   /**
>    * Removes a directory from file system
>    * @param dir
>    */
>   private void rmdir(String dir) {
>     FileUtils.deleteDir(new File(dir));
>   }
> }
>
>
> And in the test class:
>
> public class XxxTest {
>
>   private static InProcessCassandraServer cassandra;
>
>   @BeforeClass
>   public static void setup() throws TTransportException, IOException,
> InterruptedException {
>     cassandra = new InProcessCassandraServer();
>     cassandra.init();
>     Thread t = new Thread(cassandra);
>     t.setDaemon(true);
>     t.start();
>   }
>
>   @AfterClass
>   public static void shutdown() {
>     cassandra.stop();
>   }
> ... test
> }
>
> Now you can connect to localhost:9160.
>
> Assumptions:
> The code assumes you have two files in your classpath:
> /cassandra/storage-conf.xml and /cassandra/log4j.properties. This is
> convenient if you use maven; just throw them at /src/test/resources/cassandra/
> If you don't work with maven or would like to configure the configuration
> files differently it should be fairly easy, just change the prepare()
> method.
>
>
>
> On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman <ri...@gmail.com>wrote:
>
>> So, is there anybody? Unit testing is important, people...
>> Thanks
>>
>>
>> On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman <ri...@gmail.com>wrote:
>>
>>> Here is the code I use
>>>     class startServer implements Runnable {
>>>
>>>         @Override
>>>         public void run() {
>>>             try {
>>>                 CassandraDaemon cassandraDaemon = new CassandraDaemon();
>>>                 cassandraDaemon.init(null);
>>>                 cassandraDaemon.start();
>>>             } catch (TTransportException e) {
>>>                 // TODO Auto-generated catch block
>>>                 e.printStackTrace();
>>>             } catch (IOException e) {
>>>                 // TODO Auto-generated catch block
>>>                 e.printStackTrace();
>>>             }
>>>         }
>>>     }
>>>
>>>         Thread thread = new Thread(new startServer());
>>>         thread.start();
>>>
>>> <the code to test here>
>>>
>>>
>>>
>>> On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <ri...@gmail.com>wrote:
>>>
>>>> Yes, I've seen this and checked it, but if I start the server it blocks
>>>> the current thread, so I can't continue the test in sequence.
>>>> I've also tried starting it in a separate thread, but no luck there
>>>> either: the server shuts down even before I reach the code under test.
>>>>
>>>> If you have a trick to start the server in the JVM, thanks
>>>>
>>>> Richard
>>>>
>>>> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com>wrote:
>>>>
>>>>> did you look at CassandraDaemon?
>>>>>
>>>>>
>>>
>>
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by Jeff Hodges <jh...@twitter.com>.
This looks super useful. I was just about to do something similar in Ruby to
speed up our unit tests at work. Thanks!
--
Jeff

On Jan 24, 2010 3:56 AM, "Ran Tavory" <ra...@gmail.com> wrote:

agreed on the System.getProperty("java.io.tmpdir")
I can put this under contrib if you think it's useful.

On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi <rf...@gmail.com> wrote: >
> On Sun, Jan 24, 201...

Re: How to unit test my code calling Cassandra with Thrift

Posted by Jonathan Ellis <jb...@gmail.com>.
i would be fine with a patch to xmlutils, with the caveat that we'd
like to move away from xml configuration for 0.7 --
https://issues.apache.org/jira/browse/CASSANDRA-671

On Mon, Jan 25, 2010 at 10:17 AM, Ran Tavory <ra...@gmail.com> wrote:
> yeah, it would. I was doing it under the assumption I don't want to change
> the source for cassandra but I'll work on putting it into contrib and add
> that c'tor as well.
>
> 2010/1/25 Ted Zlatanov <tz...@lifelogs.com>
>>
>> On Sun, 24 Jan 2010 13:56:07 +0200 Ran Tavory <ra...@gmail.com> wrote:
>>
>> RT> On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi <rf...@gmail.com>
>> wrote:
>>
>> >> On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory <ra...@gmail.com> wrote:
>> >> > Here's the code I've just written over the weekend and started using
>> >> > in
>> >> > test:
>> >>
>> >> <snip>
>> >> Thanks for sharing :)
>> >> A quick note on the code from a superficial look: instead of the
>> >> hardwired "tmp" string I think it would make more sense to use the
>> >> system's tmp dir (  System.getProperty("java.io.tmpdir")).
>> >>
>> >> I'd say something like this deserves to be present in the cassandra
>> >> distribution, or at least put in some public repo (github,
>> >> code.google, whatever), what do other people think?
>>
>> RT> agreed on the System.getProperty("java.io.tmpdir")
>> RT> I can put this under contrib if you think it's useful.
>>
>> Maybe it would make sense to also add a constructor to XMLUtils to
>> accept a configuration directly from an InputStream instead of just a
>> String filename.  Then all these tmpdir games can be avoided.
>> DocumentBuilder, which is used behind the scenes, already does this so
>> it's a simple patch to add this constructor to XMLUtils.java:
>>
>>    public XMLUtils(InputStream xmlIS) throws ParserConfigurationException,
>> SAXException, IOException
>>    {
>>        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>>        DocumentBuilder db = dbf.newDocumentBuilder();
>>        document_ = db.parse(xmlIS);
>>
>>        XPathFactory xpathFactory = XPathFactory.newInstance();
>>        xpath_ = xpathFactory.newXPath();
>>    }
>>
>> Ted
>>
>
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by Ran Tavory <ra...@gmail.com>.
yeah, it would. I was doing it under the assumption I don't want to change
the source for cassandra but I'll work on putting it into contrib and add
that c'tor as well.

2010/1/25 Ted Zlatanov <tz...@lifelogs.com>

> On Sun, 24 Jan 2010 13:56:07 +0200 Ran Tavory <ra...@gmail.com> wrote:
>
> RT> On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi <rf...@gmail.com>
> wrote:
>
> >> On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory <ra...@gmail.com> wrote:
> >> > Here's the code I've just written over the weekend and started using
> in
> >> > test:
> >>
> >> <snip>
> >> Thanks for sharing :)
> >> A quick note on the code from a superficial look: instead of the
> >> hardwired "tmp" string I think it would make more sense to use the
> >> system's tmp dir (  System.getProperty("java.io.tmpdir")).
> >>
> >> I'd say something like this deserves to be present in the cassandra
> >> distribution, or at least put in some public repo (github,
> >> code.google, whatever), what do other people think?
>
> RT> agreed on the System.getProperty("java.io.tmpdir")
> RT> I can put this under contrib if you think it's useful.
>
> Maybe it would make sense to also add a constructor to XMLUtils to
> accept a configuration directly from an InputStream instead of just a
> String filename.  Then all these tmpdir games can be avoided.
> DocumentBuilder, which is used behind the scenes, already does this so
> it's a simple patch to add this constructor to XMLUtils.java:
>
>    public XMLUtils(InputStream xmlIS) throws ParserConfigurationException,
> SAXException, IOException
>    {
>        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>        DocumentBuilder db = dbf.newDocumentBuilder();
>        document_ = db.parse(xmlIS);
>
>        XPathFactory xpathFactory = XPathFactory.newInstance();
>        xpath_ = xpathFactory.newXPath();
>    }
>
> Ted
>
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by Ted Zlatanov <tz...@lifelogs.com>.
On Sun, 24 Jan 2010 13:56:07 +0200 Ran Tavory <ra...@gmail.com> wrote: 

RT> On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi <rf...@gmail.com> wrote:

>> On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory <ra...@gmail.com> wrote:
>> > Here's the code I've just written over the weekend and started using in
>> > test:
>> 
>> <snip>
>> Thanks for sharing :)
>> A quick note on the code from a superficial look: instead of the
>> hardwired "tmp" string I think it would make more sense to use the
>> system's tmp dir (  System.getProperty("java.io.tmpdir")).
>> 
>> I'd say something like this deserves to be present in the cassandra
>> distribution, or at least put in some public repo (github,
>> code.google, whatever), what do other people think?

RT> agreed on the System.getProperty("java.io.tmpdir")
RT> I can put this under contrib if you think it's useful.

Maybe it would make sense to also add a constructor to XMLUtils to
accept a configuration directly from an InputStream instead of just a
String filename.  Then all these tmpdir games can be avoided.
DocumentBuilder, which is used behind the scenes, already does this so
it's a simple patch to add this constructor to XMLUtils.java:

    public XMLUtils(InputStream xmlIS) throws ParserConfigurationException, SAXException, IOException
    {        
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        document_ = db.parse(xmlIS);
        
        XPathFactory xpathFactory = XPathFactory.newInstance();
        xpath_ = xpathFactory.newXPath();
    }

Ted


Re: How to unit test my code calling Cassandra with Thrift

Posted by Ran Tavory <ra...@gmail.com>.
agreed on the System.getProperty("java.io.tmpdir")
I can put this under contrib if you think it's useful.

On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi <rf...@gmail.com> wrote:

> On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory <ra...@gmail.com> wrote:
> > Here's the code I've just written over the weekend and started using in
> > test:
>
> <snip>
> Thanks for sharing :)
> A quick note on the code from a superficial look: instead of the
> hardwired "tmp" string I think it would make more sense to use the
> system's tmp dir (  System.getProperty("java.io.tmpdir")).
>
> I'd say something like this deserves to be present in the cassandra
> distribution, or at least put in some public repo (github,
> code.google, whatever), what do other people think?
>

Re: How to unit test my code calling Cassandra with Thrift

Posted by gabriele renzi <rf...@gmail.com>.
On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory <ra...@gmail.com> wrote:
> Here's the code I've just written over the weekend and started using in
> test:

<snip>
Thanks for sharing :)
A quick note on the code from a superficial look: instead of the
hardwired "tmp" string I think it would make more sense to use the
system's tmp dir (  System.getProperty("java.io.tmpdir")).

I'd say something like this deserves to be present in the cassandra
distribution, or at least put in some public repo (github,
code.google, whatever), what do other people think?

Re: How to unit test my code calling Cassandra with Thift

Posted by Ran Tavory <ra...@gmail.com>.
Here's the code I've just written over the weekend and started using in
test:


package com.outbrain.data.cassandra.service;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.cassandra.config.DatabaseDescriptor;
import org.apache.cassandra.service.CassandraDaemon;
import org.apache.cassandra.utils.FileUtils;
import org.apache.thrift.transport.TTransportException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * An in-memory cassandra storage service that listens to the thrift
interface.
 * Useful for unit testing.
 *
 * @author Ran Tavory (ran@outbain.com)
 *
 */
public class InProcessCassandraServer implements Runnable {

  private static final Logger log =
LoggerFactory.getLogger(InProcessCassandraServer.class);

  CassandraDaemon cassandraDaemon;

  public void init() {
    try {
      prepare();
    } catch (IOException e) {
      log.error("Cannot prepare cassandra.", e);
    }
    try {
      cassandraDaemon = new CassandraDaemon();
      cassandraDaemon.init(null);
    } catch (TTransportException e) {
      log.error("TTransportException", e);
    } catch (IOException e) {
      log.error("IOException", e);
    }
  }

  @Override
  public void run() {
    cassandraDaemon.start();
  }

  public void stop() {
    cassandraDaemon.stop();
    rmdir("tmp");
  }


  /**
   * Creates all files and directories needed
   * @throws IOException
   */
  private void prepare() throws IOException {
    // delete tmp dir first
    rmdir("tmp");
    // make a tmp dir and copy storage-conf.xml and log4j.properties to it
    copy("/cassandra/storage-conf.xml", "tmp");
    copy("/cassandra/log4j.properties", "tmp");
    System.setProperty("storage-config", "tmp");

    // make cassandra directories.
    for (String s: DatabaseDescriptor.getAllDataFileLocations()) {
      mkdir(s);
    }
    mkdir(DatabaseDescriptor.getBootstrapFileLocation());
    mkdir(DatabaseDescriptor.getLogFileLocation());
  }

  /**
   * Copies a resource from within the jar to a directory.
   *
   * @param resource
   * @param directory
   * @throws IOException
   */
  private void copy(String resource, String directory) throws IOException {
    mkdir(directory);
    InputStream is = getClass().getResourceAsStream(resource);
    String fileName = resource.substring(resource.lastIndexOf("/") + 1);
    File file = new File(directory + System.getProperty("file.separator") +
fileName);
    OutputStream out = new FileOutputStream(file);
    byte buf[] = new byte[1024];
    int len;
    while ((len = is.read(buf)) > 0) {
      out.write(buf, 0, len);
    }
    out.close();
    is.close();
  }

  /**
   * Creates a directory
   * @param dir
   * @throws IOException
   */
  private void mkdir(String dir) throws IOException {
    FileUtils.createDirectory(dir);
  }

  /**
   * Removes a directory from file system
   * @param dir
   */
  private void rmdir(String dir) {
    FileUtils.deleteDir(new File(dir));
  }
}


And in the test class:

public class XxxTest {

  private static InProcessCassandraServer cassandra;

  @BeforeClass
  public static void setup() throws TTransportException, IOException,
InterruptedException {
    cassandra = new InProcessCassandraServer();
    cassandra.init();
    Thread t = new Thread(cassandra);
    t.setDaemon(true);
    t.start();
  }

  @AfterClass
  public static void shutdown() {
    cassandra.stop();
  }
... test
}

Now you can connect to localhost:9160.
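
With the daemon up, test code connects through the usual Thrift client pattern. This is a sketch only, not compiled here: it assumes libthrift and the generated Cassandra classes are on the classpath, with class names as in the 0.5-era API.

```java
// Sketch: requires libthrift + the generated Cassandra Thrift classes.
TTransport transport = new TSocket("localhost", 9160);
Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
transport.open();
// ... exercise the code under test via client ...
transport.close();
```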

Assumptions:
The code assumes you have two files in your classpath:
/cassandra/storage-conf.xml and /cassandra/log4j.properties. This is
convenient if you use Maven: just put them in /src/test/resources/cassandra/.
If you don't work with Maven or would like to lay out the configuration
files differently, it should be fairly easy; just change the prepare()
method.



On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman <ri...@gmail.com>wrote:

> So, is there anybody? Unit testing is important, people...
> Thanks
>
>
> On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman <ri...@gmail.com>wrote:
>
>> Here is the code I use
>>     class startServer implements Runnable {
>>
>>         @Override
>>         public void run() {
>>             try {
>>                 CassandraDaemon cassandraDaemon = new CassandraDaemon();
>>                 cassandraDaemon.init(null);
>>                 cassandraDaemon.start();
>>             } catch (TTransportException e) {
>>                 // TODO Auto-generated catch block
>>                 e.printStackTrace();
>>             } catch (IOException e) {
>>                 // TODO Auto-generated catch block
>>                 e.printStackTrace();
>>             }
>>         }
>>     }
>>
>>         Thread thread = new Thread(new startServer());
>>         thread.start();
>>
>> <the code to test here>
>>
>>
>>
>> On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <ri...@gmail.com>wrote:
>>
>>> Yes, I've seen that and tried it too, but if I start the server it
>>> blocks the current thread, so I can't continue the test in sequence.
>>> So I've tried starting it in a separate thread, but no luck there
>>> either: the server shuts down even before I reach the code under test.
>>>
>>> If you have a trick to start the server in the JVM, thanks.
>>>
>>> Richard
>>>
>>> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com>wrote:
>>>
>>>> did you look at CassandraDaemon?
>>>>
>>>>
>>
>

Re: How to unit test my code calling Cassandra with Thift

Posted by Richard Grossman <ri...@gmail.com>.
So, is there anybody? Unit testing is important, people...
Thanks

On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman <ri...@gmail.com>wrote:

> Here is the code I use
>     class startServer implements Runnable {
>
>         @Override
>         public void run() {
>             try {
>                 CassandraDaemon cassandraDaemon = new CassandraDaemon();
>                 cassandraDaemon.init(null);
>                 cassandraDaemon.start();
>             } catch (TTransportException e) {
>                 // TODO Auto-generated catch block
>                 e.printStackTrace();
>             } catch (IOException e) {
>                 // TODO Auto-generated catch block
>                 e.printStackTrace();
>             }
>         }
>     }
>
>         Thread thread = new Thread(new startServer());
>         thread.start();
>
> <the code to test here>
>
>
>
> On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <ri...@gmail.com>wrote:
>
>> Yes, I've seen that and tried it too, but if I start the server it blocks
>> the current thread, so I can't continue the test in sequence.
>> So I've tried starting it in a separate thread, but no luck there either:
>> the server shuts down even before I reach the code under test.
>>
>> If you have a trick to start the server in the JVM, thanks.
>>
>> Richard
>>
>> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com>wrote:
>>
>>> did you look at CassandraDaemon?
>>>
>>>
>

Re: How to unit test my code calling Cassandra with Thift

Posted by Richard Grossman <ri...@gmail.com>.
Here is the code I use
    class startServer implements Runnable {

        @Override
        public void run() {
            try {
                CassandraDaemon cassandraDaemon = new CassandraDaemon();
                cassandraDaemon.init(null);
                cassandraDaemon.start();
            } catch (TTransportException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }

        Thread thread = new Thread(new startServer());
        thread.start();

<the code to test here>


On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman <ri...@gmail.com>wrote:

> Yes, I've seen that and tried it too, but if I start the server it blocks
> the current thread, so I can't continue the test in sequence.
> So I've tried starting it in a separate thread, but no luck there either:
> the server shuts down even before I reach the code under test.
>
> If you have a trick to start the server in the JVM, thanks.
>
> Richard
>
> On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> did you look at CassandraDaemon?
>>
>>

Re: How to unit test my code calling Cassandra with Thift

Posted by Richard Grossman <ri...@gmail.com>.
Yes, I've seen that and tried it too, but if I start the server it blocks
the current thread, so I can't continue the test in sequence.
So I've tried starting it in a separate thread, but no luck there either:
the server shuts down even before I reach the code under test.

If you have a trick to start the server in the JVM, thanks.

Richard

On Wed, Jan 20, 2010 at 3:47 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> did you look at CassandraDaemon?
>
>
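
One common trick for the blocking-start problem described above is a background daemon thread plus a CountDownLatch, so the test thread blocks only until the server signals readiness. The Runnable below is a placeholder standing in for cassandraDaemon.start(); a real implementation should count down only after the port is actually bound:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class Main {
    // Start a long-running task on a background daemon thread and block the
    // caller only until the task signals readiness (or a timeout expires).
    public static boolean startAndAwait(final Runnable serverStart) throws InterruptedException {
        final CountDownLatch ready = new CountDownLatch(1);
        Thread t = new Thread(new Runnable() {
            public void run() {
                ready.countDown(); // a real server should signal after binding its port
                serverStart.run(); // the blocking serve loop would live here
            }
        });
        t.setDaemon(true); // daemon thread: the JVM can still exit when tests finish
        t.start();
        return ready.await(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        boolean up = startAndAwait(new Runnable() {
            public void run() { /* placeholder for cassandraDaemon.start() */ }
        });
        System.out.println(up);
    }
}
```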

Re: How to unit test my code calling Cassandra with Thift

Posted by Richard Grossman <ri...@gmail.com>.
Yes, of course I can mock, but in Cassandra's case I really prefer to test
against a real server.
Thanks, I'll check out jMock.

On Wed, Jan 20, 2010 at 5:39 PM, Josh <jo...@schulzone.org> wrote:

> I haven't done this in Java, so this may sound .NET-ish, but:
>
> When I'm writing unit tests for stuff like this I usually mock the
> datastore.  A quick google search makes me think thrift would be pretty easy
> to mock here.  That removes the complexity of getting a cassandra instance
> in place for your unit tests and keeping it up to date with your application
> version of cassandra.
>
> jMock (http://www.jmock.org/) seems like a place to start, but again: I'm
> not much of a Java programmer, so ask somebody who knows what they're
> talking about.
>
> josh
>
>
> On Wed, Jan 20, 2010 at 6:47 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> did you look at CassandraDaemon?
>>
>> On Wed, Jan 20, 2010 at 4:44 AM, Richard Grossman <ri...@gmail.com>
>> wrote:
>> > Hi
>> >
>> > I want to write some unit tests for code calling Cassandra, so my code
>> > of course uses Thrift.
>> > I've managed to bring the Cassandra daemon up in my JVM like this:
>> >
>> >         StorageService.instance().initServer();
>> >
>> > Unfortunately it doesn't start the Thrift interface, so my code can't
>> > talk to the server. Is there any solution?
>> > Perhaps my method is not good.
>> >
>> > Thanks
>> >
>> > Richard
>> >
>>
>
>
>
> --
> josh
> @schulz
> http://schulzone.org
>
>

Re: How to unit test my code calling Cassandra with Thift

Posted by gabriele renzi <rf...@gmail.com>.
On Wed, Jan 20, 2010 at 4:39 PM, Josh <jo...@schulzone.org> wrote:
> I haven't done this in Java, so this may sound .NET-ish, but:
>
> When I'm writing unit tests for stuff like this I usually mock the
> datastore.  A quick google search makes me think thrift would be pretty easy
> to mock here.  That removes the complexity of getting a cassandra instance
> in place for your unit tests and keeping it up to date with your application
> version of cassandra.
>
> jMock (http://www.jmock.org/) seems like a place to start, but again: I'm
> not much of a Java programmer, so ask somebody who knows what they're
> talking about.

I am a fond user of mockito myself (http://mockito.org/) and it's good
for testing code that interfaces with the db. But if the subsystem under
test is the database layer itself (driver/connection/logic), I believe
testing with an embedded/temporary server is better.

So, I second the op question :)
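
For the mocking route, the usual move is to hide the Thrift client behind a narrow interface you own and substitute a test double for it in unit tests; mockito/jMock automate what the hand-rolled fake below does. The ColumnStore interface here is hypothetical, not Cassandra's API:

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
    // Hypothetical narrow wrapper around the Thrift client; production code
    // depends on this interface rather than on Cassandra.Client directly.
    interface ColumnStore {
        void put(String key, String value);
        String get(String key);
    }

    // In-memory test double: the same thing mockito/jMock would generate.
    static class FakeColumnStore implements ColumnStore {
        private final Map<String, String> rows = new HashMap<String, String>();
        public void put(String key, String value) { rows.put(key, value); }
        public String get(String key) { return rows.get(key); }
    }

    public static void main(String[] args) {
        ColumnStore store = new FakeColumnStore();
        store.put("user:1", "alice");
        System.out.println(store.get("user:1"));
    }
}
```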

Re: How to unit test my code calling Cassandra with Thift

Posted by Josh <jo...@schulzone.org>.
I haven't done this in Java, so this may sound .NET-ish, but:

When I'm writing unit tests for stuff like this I usually mock the
datastore.  A quick google search makes me think thrift would be pretty easy
to mock here.  That removes the complexity of getting a cassandra instance
in place for your unit tests and keeping it up to date with your application
version of cassandra.

jMock (http://www.jmock.org/) seems like a place to start, but again: I'm
not much of a Java programmer, so ask somebody who knows what they're
talking about.

josh

On Wed, Jan 20, 2010 at 6:47 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> did you look at CassandraDaemon?
>
> On Wed, Jan 20, 2010 at 4:44 AM, Richard Grossman <ri...@gmail.com>
> wrote:
> > Hi
> >
> > I want to write some unit tests for code calling Cassandra, so my code
> > of course uses Thrift.
> > I've managed to bring the Cassandra daemon up in my JVM like this:
> >
> >         StorageService.instance().initServer();
> >
> > Unfortunately it doesn't start the Thrift interface, so my code can't
> > talk to the server. Is there any solution?
> > Perhaps my method is not good.
> >
> > Thanks
> >
> > Richard
> >
>



-- 
josh
@schulz
http://schulzone.org

Re: How to unit test my code calling Cassandra with Thift

Posted by Jonathan Ellis <jb...@gmail.com>.
did you look at CassandraDaemon?

On Wed, Jan 20, 2010 at 4:44 AM, Richard Grossman <ri...@gmail.com> wrote:
> Hi
>
> I want to write some unit tests for code calling Cassandra, so my code
> of course uses Thrift.
> I've managed to bring the Cassandra daemon up in my JVM like this:
>
>         StorageService.instance().initServer();
>
> Unfortunately it doesn't start the Thrift interface, so my code can't
> talk to the server. Is there any solution?
> Perhaps my method is not good.
>
> Thanks
>
> Richard
>