Posted to user@cassandra.apache.org by Krishna Sankar <ks...@gmail.com> on 2010/03/12 20:55:12 UTC

Cassandra Demo/Tutorial Applications

I was looking at this from CASSANDRA-873 as well as hands-on homework (!)
for my OSCON tutorial. I have a couple of questions and would appreciate
insights:

A)  CASSANDRA-873 suggests Lucandra as one demo application
B)  Are there other ideas that will bring out the various aspects of
Cassandra?
C)  What would be the goal of the demo apps? A tutorial to help folks learn
the ins and outs of Cassandra? A showcase of capabilities? I think
CASSANDRA-873 belongs to the latter; Twissandra most probably belongs to the
former.
D)  Hadoop on Cassandra might be a good demo/tutorial
E)  How would one structure the infrastructure for the demo/tutorials? What
assumptions can we make in creating them? As AMIs to be run in EC2? Also
to be run on 2-3 local machines for folks who can spare some? Or as
multiple processes, all on one machine? What is an optimum configuration
for learning and demo? We need to make it simple (to reflect the domain)
but not simpler.
F)  I am looking for ideas from developers and users, hence the cross
posting. I hope the Apache mailer is smart enough to dedup; I will find out
soon ...

Cheers
<k/>    



Re: Cassandra Demo/Tutorial Applications

Posted by Krishna Sankar <ks...@gmail.com>.
Thanks, guys, for the responses.

Agreed, I won't be able to do it all in my talk. In fact, I might defer a lot
of the hands-on Cassandra to Eric's PM session.

My question on multiple machines and EC2 was more for CASSANDRA-873, where we
want to have a set of good hands-on tutorials that, while much simpler than
actual production systems, still capture the essentials of a Cassandra
infrastructure. These can also serve as homework for the attendees.

Cheers
<k/>

On Fri, Mar 12, 2010, "Jonathan Ellis" <jb...@gmail.com> wrote:

> Also http://aws.amazon.com/publicdatasets/.
> 
> On Fri, Mar 12, 2010 at 11:59 PM, Ian Holsman <ia...@holsman.net> wrote:
>> There are several large data sets on the net you could use to build. Demo
>> with.
>> Search logs, wikipedia, uk govt stuff
>> Dbpedia may be interesting as they have some of the stuff extracted out
>> 
>> 
>> ---
>> Sent from my phone
>> Ian Holsman - 703 879-3128
>> 
>> On 13/03/2010, at 4:46 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> 
>>> On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar <ks...@gmail.com>
>>> wrote:
>>>> 
>>>> I was looking at this from CASSANDRA-873 as well as hands-on homework (!)
>>>> for my OSCON tutorial. Have couple of questions. Would appreciate
>>>> insights:
>>>> 
>>>> A)  Cassandra-873 suggests Luenandra as one demo application
>>>> B)  Are there other ideas that will bring out the various aspects of
>>>> Cassandra ?
>>> 
>>> multi-user blog (single-user is too easy :)
>>> - extra credit: with full-text search using lucandra
>>> 
>>> discussion forum
>>> - also w/ FTS
>>> 
>>>> C)  What would be the goal of demo apps ? Tutorial to help folks learn
>>>> the
>>>> ins and outs of Cassandra ? Show case capabilities ? I think
>>>> Cassandra-873
>>>> belongs to the latter; Twissandra most probably belongs to the former.
>>> 
>>> I think you nailed it.
>>> 
>>>> D)  Hadoop on Cassandra might be a good demo/tutorial
>>> 
>>> Sure, I'll buy that.
>>> 
>>> I can't think of any standalone projects for that, but "compute a
>>> twissandra tag cloud" would be pretty cool.  (Might need to write a
>>> twissandra bot to load stuff in to make an interesting cloud. :)
>>> 
>>>> E)  How would one structure the infrastructure for the demo/tutorials ?
>>>> What
>>>> assumptions can we make in creating them ? As AMIs to be run in EC2 ?
>>> 
>>> I'd probably go with "virtualbox images" as being simpler for people
>>> who don't have an AWS key already.  (VB can read vmware player images,
>>> i think.  But there is no free vmware for OS X, so you'd want to check
>>> that before going w/ vmware format.)
>>> 
>>> Or just have people d/l cassandra and a configuration xml.  Probably
>>> easier than teaching people to use virtualbox who haven't before.
>>> 
>>>> Also
>>>> to be run on 2-3 local machines for folks who can spare some ? Or as
>>>> multiple processes - all in one machine ?
>>> 
>>> You're not going to have time to teach cluster management.  Keep it to 1.
>> 



Re: Cassandra Demo/Tutorial Applications

Posted by Jonathan Ellis <jb...@gmail.com>.
Also http://aws.amazon.com/publicdatasets/.

On Fri, Mar 12, 2010 at 11:59 PM, Ian Holsman <ia...@holsman.net> wrote:
> There are several large data sets on the net you could use to build. Demo
> with.
> Search logs, wikipedia, uk govt stuff
> Dbpedia may be interesting as they have some of the stuff extracted out
>
>
> ---
> Sent from my phone
> Ian Holsman - 703 879-3128
>
> On 13/03/2010, at 4:46 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar <ks...@gmail.com>
>> wrote:
>>>
>>> I was looking at this from CASSANDRA-873 as well as hands-on homework (!)
>>> for my OSCON tutorial. Have couple of questions. Would appreciate
>>> insights:
>>>
>>> A)  Cassandra-873 suggests Luenandra as one demo application
>>> B)  Are there other ideas that will bring out the various aspects of
>>> Cassandra ?
>>
>> multi-user blog (single-user is too easy :)
>> - extra credit: with full-text search using lucandra
>>
>> discussion forum
>> - also w/ FTS
>>
>>> C)  What would be the goal of demo apps ? Tutorial to help folks learn
>>> the
>>> ins and outs of Cassandra ? Show case capabilities ? I think
>>> Cassandra-873
>>> belongs to the latter; Twissandra most probably belongs to the former.
>>
>> I think you nailed it.
>>
>>> D)  Hadoop on Cassandra might be a good demo/tutorial
>>
>> Sure, I'll buy that.
>>
>> I can't think of any standalone projects for that, but "compute a
>> twissandra tag cloud" would be pretty cool.  (Might need to write a
>> twissandra bot to load stuff in to make an interesting cloud. :)
>>
>>> E)  How would one structure the infrastructure for the demo/tutorials ?
>>> What
>>> assumptions can we make in creating them ? As AMIs to be run in EC2 ?
>>
>> I'd probably go with "virtualbox images" as being simpler for people
>> who don't have an AWS key already.  (VB can read vmware player images,
>> i think.  But there is no free vmware for OS X, so you'd want to check
>> that before going w/ vmware format.)
>>
>> Or just have people d/l cassandra and a configuration xml.  Probably
>> easier than teaching people to use virtualbox who haven't before.
>>
>>> Also
>>> to be run on 2-3 local machines for folks who can spare some ? Or as
>>> multiple processes - all in one machine ?
>>
>> You're not going to have time to teach cluster management.  Keep it to 1.
>

Re: Cassandra Demo/Tutorial Applications

Posted by Ian Holsman <ia...@holsman.net>.
There are several large data sets on the net you could use to build a
demo with: search logs, Wikipedia, UK government data.
DBpedia may be interesting, as they have some of the data already extracted.


---
Sent from my phone
Ian Holsman - 703 879-3128

On 13/03/2010, at 4:46 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar  
> <ks...@gmail.com> wrote:
>> I was looking at this from CASSANDRA-873 as well as hands-on  
>> homework (!)
>> for my OSCON tutorial. Have couple of questions. Would appreciate  
>> insights:
>>
>> A)  Cassandra-873 suggests Luenandra as one demo application
>> B)  Are there other ideas that will bring out the various aspects of
>> Cassandra ?
>
> multi-user blog (single-user is too easy :)
> - extra credit: with full-text search using lucandra
>
> discussion forum
> - also w/ FTS
>
>> C)  What would be the goal of demo apps ? Tutorial to help folks  
>> learn the
>> ins and outs of Cassandra ? Show case capabilities ? I think  
>> Cassandra-873
>> belongs to the latter; Twissandra most probably belongs to the  
>> former.
>
> I think you nailed it.
>
>> D)  Hadoop on Cassandra might be a good demo/tutorial
>
> Sure, I'll buy that.
>
> I can't think of any standalone projects for that, but "compute a
> twissandra tag cloud" would be pretty cool.  (Might need to write a
> twissandra bot to load stuff in to make an interesting cloud. :)
>
>> E)  How would one structure the infrastructure for the demo/ 
>> tutorials ? What
>> assumptions can we make in creating them ? As AMIs to be run in EC2 ?
>
> I'd probably go with "virtualbox images" as being simpler for people
> who don't have an AWS key already.  (VB can read vmware player images,
> i think.  But there is no free vmware for OS X, so you'd want to check
> that before going w/ vmware format.)
>
> Or just have people d/l cassandra and a configuration xml.  Probably
> easier than teaching people to use virtualbox who haven't before.
>
>> Also
>> to be run on 2-3 local machines for folks who can spare some ? Or as
>> multiple processes - all in one machine ?
>
> You're not going to have time to teach cluster management.  Keep it  
> to 1.

Re: Cassandra Demo/Tutorial Applications

Posted by Vick Khera <vi...@khera.org>.
On Sat, Mar 13, 2010 at 1:46 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> I'd probably go with "virtualbox images" as being simpler for people
> who don't have an AWS key already.  (VB can read vmware player images,
> i think.  But there is no free vmware for OS X, so you'd want to check
> that before going w/ vmware format.)

VirtualBox will read VMware "vmdk" disk image files.  It will neither
import nor use the whole VM.  You have to make a new VM in VirtualBox
and attach the vmdk disk image to it, and then it works just perfectly.
I'd say the VMware format is just fine.  Just be sure to delete any
"snapshots" in your VMware VM before distributing it.

Re: Cassandra Demo/Tutorial Applications

Posted by Ronald Bradford <ro...@gmail.com>.
I collated a list of public data last year; you can check it out at
http://ronaldbradford.com/blog/seeking-public-data-for-benchmarks-2009-08-28/

I use VirtualBox when on Mac. It's free and it's trivial to create your own
images.

On Sat, Mar 13, 2010 at 5:01 AM, Christopher Brind <
christopher.brind@googlemail.com> wrote:

>
>> > E)  How would one structure the infrastructure for the demo/tutorials ?
>> What
>> > assumptions can we make in creating them ? As AMIs to be run in EC2 ?
>>
>> I'd probably go with "virtualbox images" as being simpler for people
>> who don't have an AWS key already.  (VB can read vmware player images,
>> i think.  But there is no free vmware for OS X, so you'd want to check
>> that before going w/ vmware format.)
>>
>>
> VirtualBox runs on Mac just fine and from the user manual:
>
> VirtualBox also fully supports the popular and open VMDK container format
> that is used by many other virtualization products, in particular, by
> VMware.
>
> ... so that should be OK for Mac.
>
> Cheers,
> Chris
>
>

Re: Cassandra Demo/Tutorial Applications

Posted by Christopher Brind <ch...@googlemail.com>.
>
>
> > E)  How would one structure the infrastructure for the demo/tutorials ?
> What
> > assumptions can we make in creating them ? As AMIs to be run in EC2 ?
>
> I'd probably go with "virtualbox images" as being simpler for people
> who don't have an AWS key already.  (VB can read vmware player images,
> i think.  But there is no free vmware for OS X, so you'd want to check
> that before going w/ vmware format.)
>
>
VirtualBox runs on Mac just fine and from the user manual:

VirtualBox also fully supports the popular and open VMDK container format
that is used by many other virtualization products, in particular, by
VMware.

... so that should be OK for Mac.

Cheers,
Chris

Re: Cassandra Demo/Tutorial Applications

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar <ks...@gmail.com> wrote:
> I was looking at this from CASSANDRA-873 as well as hands-on homework (!)
> for my OSCON tutorial. Have couple of questions. Would appreciate insights:
>
> A)  Cassandra-873 suggests Luenandra as one demo application
> B)  Are there other ideas that will bring out the various aspects of
> Cassandra ?

multi-user blog (single-user is too easy :)
 - extra credit: with full-text search using lucandra

discussion forum
 - also w/ FTS

> C)  What would be the goal of demo apps ? Tutorial to help folks learn the
> ins and outs of Cassandra ? Show case capabilities ? I think Cassandra-873
> belongs to the latter; Twissandra most probably belongs to the former.

I think you nailed it.

> D)  Hadoop on Cassandra might be a good demo/tutorial

Sure, I'll buy that.

I can't think of any standalone projects for that, but "compute a
twissandra tag cloud" would be pretty cool.  (Might need to write a
twissandra bot to load stuff in to make an interesting cloud. :)

> E)  How would one structure the infrastructure for the demo/tutorials ? What
> assumptions can we make in creating them ? As AMIs to be run in EC2 ?

I'd probably go with "virtualbox images" as being simpler for people
who don't have an AWS key already.  (VB can read vmware player images,
i think.  But there is no free vmware for OS X, so you'd want to check
that before going w/ vmware format.)

Or just have people d/l cassandra and a configuration xml.  Probably
easier than teaching people to use virtualbox who haven't before.

> Also
> to be run on 2-3 local machines for folks who can spare some ? Or as
> multiple processes - all in one machine ?

You're not going to have time to teach cluster management.  Keep it to 1.
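
As a rough illustration of the "compute a twissandra tag cloud" idea
mentioned above, here is a minimal sketch in plain Python. It is not taken
from Twissandra or from any Hadoop job: the sample tweets, the function
names (map_tags, reduce_counts, tag_cloud) and the font-size scaling are all
assumptions made for illustration, and the data lives in an in-memory list
rather than in Cassandra. It only shows the map/reduce shape such a job
would probably have.

```python
import re
from collections import Counter

# Hypothetical sample data standing in for tweets loaded by a bot.
TWEETS = [
    "Trying out #cassandra for the #oscon tutorial",
    "Full-text search on #cassandra with #lucandra",
    "#Cassandra plus #hadoop for batch jobs",
]

HASHTAG = re.compile(r"#(\w+)")


def map_tags(tweet):
    """Map step: emit a (tag, 1) pair for every hashtag in one tweet."""
    return [(tag.lower(), 1) for tag in HASHTAG.findall(tweet)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each tag."""
    totals = Counter()
    for tag, count in pairs:
        totals[tag] += count
    return totals


def tag_cloud(tweets, min_size=10, max_size=40):
    """Scale each tag's count linearly to a font size for display."""
    pairs = [pair for tweet in tweets for pair in map_tags(tweet)]
    totals = reduce_counts(pairs)
    if not totals:
        return {}
    top = max(totals.values())
    span = max_size - min_size
    return {tag: min_size + span * n // top for tag, n in totals.items()}


if __name__ == "__main__":
    for tag, size in sorted(tag_cloud(TWEETS).items()):
        print(tag, size)
```

In a real demo the map and reduce functions would presumably run as a Hadoop
job reading tweets out of Cassandra; the in-memory version is only meant to
make the exercise concrete enough to hand out as homework.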
