Posted to user@cassandra.apache.org by Chen Xinli <ch...@gmail.com> on 2010/07/19 05:25:13 UTC

goods search with cassandra

Hi,

I want to implement goods search with Cassandra, and I have some questions.
Can someone help me out?

The use case is this:
There are about 1 million shops, each with about 10,000 goods, and each
goods item has properties such as "title" and "price".
A typical search is "give me 10 goods in a specific shop whose price is
less than $10".

For the data model, I use the shop name as the row key and the goods id as
the column name; "title" and "price" are specially encoded into the column value.
There are too many goods in one shop, so filtering the data in the Thrift
client is impractical because of the network transfer cost.
I want to implement a special ColumnValueFilter that extends QueryFilter to
get the result "locally", without shipping the whole row over the network.
Is this the best way?
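
As a rough illustration of the layout described above, the sketch below
models one row per shop, with goods ids as column names. The JSON encoding of
the column value, the function names and the sample data are assumptions made
for illustration only; the post does not specify them. It suggests why a
value filter, wherever it runs, still has to decode columns one by one until
it has found enough matches.

```python
import json

# Hypothetical in-memory stand-in for one column family:
# row key = shop name, column name = goods id, column value = encoded properties.
def encode_goods(title, price):
    # Assumed encoding; the post only says the value is specially encoded.
    return json.dumps({"title": title, "price": price})

def find_cheap_goods(row, max_price, limit):
    """Scan columns in goods-id order; return (matches, columns_examined)."""
    matches = []
    examined = 0
    for goods_id in sorted(row):
        examined += 1
        goods = json.loads(row[goods_id])
        if goods["price"] < max_price:
            matches.append((goods_id, goods["title"], goods["price"]))
            if len(matches) == limit:
                break
    return matches, examined

# Made-up sample data for a single shop.
shops = {
    "shop-a": {
        "g001": encode_goods("kettle", 25.0),
        "g002": encode_goods("mug", 4.5),
        "g003": encode_goods("lamp", 40.0),
        "g004": encode_goods("spoon", 1.2),
    }
}

matches, examined = find_cheap_goods(shops["shop-a"], max_price=10.0, limit=2)
print(matches, examined)
```

With 10,000 goods per shop and few cheap items, the number of columns
examined can approach the full row even when only 10 results are wanted.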


Insertion of goods runs at about 100/second for the whole cluster, so a
Thrift client is fine for insertion.
For reads, latency and QPS are important, and I must provide an HTTP service
for user searches.
Embedding a Thrift client in such a service would add another network hop,
so I want to build the service directly on top of Cassandra.
I have reviewed the code of ClientOnlyExample.java.
What confuses me is this: if I insert through a Thrift client and read by
using Cassandra directly, is data consistency guaranteed, and how?

Any help is appreciated. Thanks!

-- 
Best Regards,
Chen Xinli

Re: goods search with cassandra

Posted by Jonathan Shook <js...@gmail.com>.
It would be better to structure your data around your access patterns.
(Discussion of that belongs on the user list.)
I'd be happy to elaborate there.
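
The message does not spell out a concrete layout. One possible reading,
sketched below purely as an assumption, is to keep a second row per shop
whose column names start with the price, so that "price below X" becomes a
short slice from the start of the row instead of a scan. The class name, the
use of integer cents and the sample data are all hypothetical.

```python
import bisect

class ShopPriceRow:
    """Hypothetical row whose columns are kept sorted by (price_cents, goods_id)."""

    def __init__(self):
        self._columns = []  # sorted list of (price_cents, goods_id, title)

    def insert(self, price_cents, goods_id, title):
        # Keep the columns ordered by price, the way a sorted row would be.
        bisect.insort(self._columns, (price_cents, goods_id, title))

    def slice_below(self, max_price_cents, limit):
        # (max_price_cents,) sorts before every column with that exact price,
        # so everything left of this position is strictly cheaper.
        end = bisect.bisect_left(self._columns, (max_price_cents,))
        return self._columns[:min(end, limit)]

# Made-up sample data for a single shop.
row = ShopPriceRow()
row.insert(2500, "g001", "kettle")
row.insert(450, "g002", "mug")
row.insert(4000, "g003", "lamp")
row.insert(120, "g004", "spoon")

print(row.slice_below(1000, 10))
```

The read touches only the columns it returns, at the cost of writing each
goods item into a layout chosen for this one query.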

On Mon, Jul 19, 2010 at 9:29 AM, Chen Xinli <ch...@gmail.com> wrote:
> Can someone give me some advice?
>
> Thanks!
>

Re:Fwd: goods search with cassandra

Posted by 李楠 <5l...@163.com>.
Haha, I have no suggestions; just bumping the thread.

--
MSN: coding_boy@hotmail.com Mobile: 13681040563 Hope everyone is happy.




>Can someone give me some advice?
>
>Thanks!
>

Fwd: goods search with cassandra

Posted by Chen Xinli <ch...@gmail.com>.
Can someone give me some advice?

Thanks!


Re: goods search with cassandra

Posted by Chen Xinli <ch...@gmail.com>.
Thanks for your suggestion.

Does it work if I insert through a Thrift client and read through
Cassandra directly, as in ClientOnlyExample?

2010/7/21 Santal Li <sa...@gmail.com>

> I don't think building a ColumnValueFilter is a good idea. What you really
> need is a self-defined index; otherwise the filter will cause too many scans
> and too much disk I/O.
>
> We met almost the same problem in our own web app: data is stored under
> one field, then looked up by searching on another field. Our solution was
> to create a new keyspace for the index and maintain the index in the
> application according to the query conditions. I suggest you read this
> document to get the basic idea:
> http://code.google.com/intl/zh-CN/appengine/articles/index_building.html
>
> If you use this solution, you may need to consider the following issues:
> 1. concurrent access from multiple clients
> 2. index and object data may become inconsistent when an error occurs.
>
> Some kind of lock service, such as ZooKeeper, may help.
>
> Regards
> -Santal
>
>
>


-- 
Best Regards,
Chen Xinli

Re: goods search with cassandra

Posted by Santal Li <sa...@gmail.com>.
I don't think building a ColumnValueFilter is a good idea. What you really
need is a self-defined index; otherwise the filter will cause too many scans
and too much disk I/O.

We met almost the same problem in our own web app: data is stored under
one field, then looked up by searching on another field. Our solution was
to create a new keyspace for the index and maintain the index in the
application according to the query conditions. I suggest you read this
document to get the basic idea:
http://code.google.com/intl/zh-CN/appengine/articles/index_building.html

If you use this solution, you may need to consider the following issues:
1. concurrent access from multiple clients
2. index and object data may become inconsistent when an error occurs.

Some kind of lock service, such as ZooKeeper, may help.
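
A minimal sketch of an application-maintained index, under stated
assumptions: two dictionaries stand in for the data keyspace and the index
keyspace, and a threading.Lock stands in for an external lock service. The
class and method names are hypothetical and are not part of Cassandra or
ZooKeeper. The point it illustrates is that an update has to remove the
stale index entry and add the new one together with the data write, which is
where the two issues above come from.

```python
import threading

class GoodsStore:
    """Hypothetical stand-in for a data keyspace plus a price-index keyspace."""

    def __init__(self):
        self.data = {}    # (shop, goods_id) -> {"title": ..., "price_cents": ...}
        self.index = {}   # shop -> set of (price_cents, goods_id)
        # In-process lock only; coordinating separate clients would need an
        # external lock service.
        self._lock = threading.Lock()

    def put(self, shop, goods_id, title, price_cents):
        with self._lock:
            old = self.data.get((shop, goods_id))
            entries = self.index.setdefault(shop, set())
            if old is not None:
                # Drop the stale index entry before adding the new one.
                entries.discard((old["price_cents"], goods_id))
            entries.add((price_cents, goods_id))
            self.data[(shop, goods_id)] = {"title": title, "price_cents": price_cents}

    def search(self, shop, max_price_cents, limit):
        with self._lock:
            hits = sorted(e for e in self.index.get(shop, ()) if e[0] < max_price_cents)
            return [(gid, self.data[(shop, gid)]["title"], price)
                    for price, gid in hits[:limit]]

# Made-up sample data; the second write to g001 is a price change.
store = GoodsStore()
store.put("shop-a", "g002", "mug", 450)
store.put("shop-a", "g001", "kettle", 2500)
store.put("shop-a", "g001", "kettle", 900)

print(store.search("shop-a", 1000, 10))
```

If the process fails between the index update and the data write, the two
structures can disagree, so a real application would also need some way to
detect and repair stale index entries.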

Regards
-Santal


