Posted to user@cassandra.apache.org by Anuj Kabra <an...@thoughtworks.com> on 2010/09/02 18:27:31 UTC

Looking for something like "like" of mysql.

I am working with cassandra-0.6.4 on a mail retrieval problem. We have mail
metadata (sender, recipient, timestamp, subject, and the location of the
mail file) stored in a Cassandra DB. About 25,000 records will be entered
into this DB every day. We have not finalised the data model yet, but we are
starting with a simple one that has only one column family:
<ColumnFamily Name="MailMetadata" CompareWith="UTF8Type"/>
It uses the recipient's user_id as the row key, with columns for sender_id,
mail timestamp, subject, and location of the mail file.
Our use case is to get the locations of all mail files sent by a user whose
subject matches a given string (which can be just a part of the original
subject). As far as I know, we can get all the rows of a user by using
user_id as the key. After that I need to iterate over all the rows I get and
check which mails fit the given condition (matching a subject in this case),
which is computationally very heavy, as we would get thousands of rows.
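
For illustration, the client-side scan described above might look like the
following minimal sketch. It does not talk to Cassandra: a plain Python list
is a hypothetical stand-in for the rows already fetched, the field names
mirror the columns mentioned above, and the function name and sample data
are assumptions made only for this example.

```python
from typing import Dict, List

Mail = Dict[str, str]

# Hypothetical sample rows, standing in for what a fetch by user_id returns.
MAILS: List[Mail] = [
    {"sender_id": "alice", "timestamp": "2010-09-01T10:00:00",
     "subject": "Quarterly report draft", "location": "/mail/1.eml"},
    {"sender_id": "bob", "timestamp": "2010-09-01T11:30:00",
     "subject": "Report card", "location": "/mail/2.eml"},
    {"sender_id": "alice", "timestamp": "2010-09-02T09:15:00",
     "subject": "Lunch on Friday?", "location": "/mail/3.eml"},
]


def locations_matching(mails: List[Mail], sender_id: str, fragment: str) -> List[str]:
    """Return locations of mails from sender_id whose subject contains fragment.

    This is a linear scan: every fetched row is inspected, which is exactly
    the cost the question is worried about.
    """
    needle = fragment.lower()
    return [
        mail["location"]
        for mail in mails
        if mail["sender_id"] == sender_id and needle in mail["subject"].lower()
    ]


if __name__ == "__main__":
    print(locations_matching(MAILS, "alice", "report"))
```

The work grows with the number of rows fetched per user, since nothing
narrows the candidate set before the substring test runs.
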
So we are looking for something like MySQL's "LIKE" that is available
through Thrift. I also need to know whether I am going the right way.
Any help is much appreciated.

Re: Looking for something like "like" of mysql.

Posted by vineet daniel <vi...@gmail.com>.
You can try using a different CF for each result set, or an inverted index,
but given the number of inserts you have, it will become complicated. The
first thing you need to do is stop thinking in terms of an RDBMS, as
Cassandra is not at all like one.
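
As a rough sketch of the inverted-index idea, the example below keeps a
second mapping from (sender_id, subject word) to mail file locations, so a
lookup touches only the matching entries. It is plain in-memory Python, not
Cassandra client code: the dict is a hypothetical stand-in for a second
column family, and every name in it is an assumption made for illustration.

```python
import re
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

# Hypothetical index layout: key = (sender_id, subject word),
# value = set of mail file locations.
SubjectIndex = DefaultDict[Tuple[str, str], Set[str]]


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_mail(index: SubjectIndex, sender_id: str, subject: str, location: str) -> None:
    """Add one index entry per distinct word in the subject."""
    for word in set(tokenize(subject)):
        index[(sender_id, word)].add(location)


def search(index: SubjectIndex, sender_id: str, query: str) -> Set[str]:
    """Return locations of mails from sender_id whose subject has every query word."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so intersecting does not mutate the index.
    result = set(index.get((sender_id, words[0]), set()))
    for word in words[1:]:
        result &= index.get((sender_id, word), set())
    return result


if __name__ == "__main__":
    index: SubjectIndex = defaultdict(set)
    index_mail(index, "alice", "Quarterly report draft", "/mail/1.eml")
    index_mail(index, "alice", "Lunch on Friday?", "/mail/2.eml")
    index_mail(index, "bob", "Report card", "/mail/3.eml")
    print(search(index, "alice", "report"))
```

Two trade-offs are worth noting. Each mail costs one extra write per
distinct subject word, which is where the insert volume starts to matter.
This sketch also matches whole words only, so it is narrower than a true
substring match such as LIKE '%repo%'.
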
_______________________________________
Regards
Vineet Daniel
+918106217121
_______________________________________

Let your email find you....


On Thu, Sep 2, 2010 at 10:00 PM, Mike Peters <cassandra@softwareprojects.com> wrote:

> Cassandra doesn't support ad hoc queries like the one you're describing.
>
> I recommend looking at Lucandra <http://github.com/tjake/Lucandra>.

Re: Looking for something like "like" of mysql.

Posted by Mike Peters <ca...@softwareprojects.com>.
Cassandra doesn't support ad hoc queries like the one you're describing.

I recommend looking at Lucandra <http://github.com/tjake/Lucandra>.
