Posted to dev@cassandra.apache.org by "@Nandan@" <na...@gmail.com> on 2017/06/12 02:40:20 UTC

Reg:- Cassandra Data modelling for Search

Hi,

I am currently working on data modeling for a video company that has
different types of users with different functionality.
Right now my concern is the video search module, which has to search on
different fields.

The query patterns are as follows:
1) Select video by actor.
2) Select video by producer.
3) Select video by music.
4) Select video by actor and producer.
5) Select video by actor and music.

Note: In short, we want to build an advanced search module that can search
on any of these fields and return the desired results.

We also need partial search: if a user searches for the title "Harry", we
should return all videos whose title contains "Harry" at any position.

My idea is to create separate tables such as video_by_actor,
video_by_producer, etc., and run Solr queries against all of them. Is there
another way to implement this search module effectively?
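
For illustration, here is a minimal sketch of what two of the per-query
tables mentioned above might look like in CQL. The column choices, the
primary keys, and the sample values are assumptions made for the example,
not a schema taken from this thread.

```cql
-- Assumes a keyspace has already been selected with USE.
-- Hypothetical layout: one row per (actor, video) pair, written by the application.
CREATE TABLE video_by_actor (
    actor        text,
    videoid      uuid,
    title        text,
    release_date timestamp,
    PRIMARY KEY ((actor), videoid)
);

-- Hypothetical layout for the combined "actor and producer" query.
CREATE TABLE video_by_actor_and_producer (
    actor    text,
    producer text,
    videoid  uuid,
    title    text,
    PRIMARY KEY ((actor, producer), videoid)
);

-- Equality lookups on the full partition key are served directly.
SELECT videoid, title FROM video_by_actor WHERE actor = 'Some Actor';
SELECT videoid, title FROM video_by_actor_and_producer
 WHERE actor = 'Some Actor' AND producer = 'Some Producer';
```

Tables like these only answer exact-match lookups on the partition key. A
substring match, such as a title containing "Harry", needs an index or a
search engine on top.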

Please suggest.

Best regards,

Re: Reg:- Cassandra Data modelling for Search

Posted by Roi Sudman <ro...@liveperson.com>.
I also think that ES is a much better match.
With ES you will be able to extend your movie DB much more easily, for
example by adding translations, year, or other fields.
Searching on multiple words, range scans, and more are available out of the
box.
All in all, I would also suggest using ES.




On Mon, Jun 12, 2017 at 10:50 AM, @Nandan@ <na...@gmail.com>
wrote:

> But the condition is that I am working with Apache Cassandra: I have to
> store my data in Cassandra and then implement partial search capability.
> If we only needed to search on the full primary key, it would be easy to
> work with Cassandra, but in the case of flexible search I am getting
> confused.
>
>
> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <os...@gmail.com>
> wrote:
>
> > I haven't run Solr with Cassandra myself. I just meant running
> > Elasticsearch as a completely separate service and writing there as well.
> >
> > On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com> wrote:
> >
> > Do you mean using Elasticsearch with Cassandra?
> > I am also thinking of using Apache Solr with Cassandra.
> > In that case I would have to:
> > 1) Create separate tables such as video_by_title, video_by_actor,
> >    video_by_year, etc.
> > 2) After creating the tables, configure a Solr core on each of them.
> >
> > Is that right?
> >
> >
> >
> >
> >
> > On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com>
> > wrote:
> >
> >> Why not elasticsearch for this use case? It will make your life much
> >> simpler




Re: Reg:- Cassandra Data modelling for Search

Posted by Jason Brown <ja...@gmail.com>.
Removing dev@ from this conversation, as the thread is more appropriate
for user@.

On Mon, Jun 12, 2017 at 4:51 AM, Eduardo Alonso <ed...@stratio.com>
wrote:

> Virtual tokens are not recommended when using Solr or
> cassandra-lucene-index.
>
> If you use your table schema, you will not have any problem with partition
> size, because your table is *not* a wide-row table (it does not have
> clustering keys).
> The only limit is that one record with those 15 or 20 columns must not be
> larger than 100MB. You will have plenty of room.
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 12:36 GMT+02:00 @Nandan@ <na...@gmail.com>:
>
> > And since the single videos table may end up with around 15 to 20
> > columns, we also need to think carefully about partition sizes.
> >
> > On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <nandanpriyadarshi298@gmail.com>
> > wrote:
> >
> >> Yes, this is the option I was also considering as my second choice.
> >> Before this I was thinking of denormalizing tables by search column,
> >> but because of partial search that would not be very effective.
> >>
> >> Now suppose we go with this single videos table and implement it with
> >> Solr/Lucene. Do we also need to care about num_tokens?
> >>
> >>
> >> On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <
> >> eduardoalonso@stratio.com> wrote:
> >>
> >>> Use Cassandra collections:
> >>>
> >>> CREATE TABLE videos (
> >>> videoid uuid primary key,
> >>> title text,
> >>> actor list<text>,
> >>> producer list<text>,
> >>> release_date timestamp,
> >>> description text,
> >>> music text,
> >>> etc...
> >>> );
> >>>
> >>> When using collections you need to take care of their length. Collections
> >>> are designed to store only a small amount of data
> >>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>.
> >>> 5 to 10 actors per movie is OK.
> >>>
> >>>
> >>>
> >>> 2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:
> >>>
> >>>> So, in short, we should go with one single videos table with
> >>>> videoid uuid as the primary key.
> >>>> But then how can we handle multiple actor and producer names?
> >>>>
> >>>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
> >>>> eduardoalonso@stratio.com> wrote:
> >>>>
> >>>>> Yes, you are right.
> >>>>>
> >>>>> Table denormalization is useful only when you have unique primary
> >>>>> keys, which is not your case.
> >>>>> Denormalized tables differ only in their primary key; every
> >>>>> denormalized table contains all the data (only how it is structured
> >>>>> changes). So, if you need to index it, do it with just one table (the
> >>>>> one you showed us with videoid as the primary key is OK).
> >>>>>
> >>>>> Solr, Elasticsearch and cassandra-lucene-index are all based on Lucene,
> >>>>> and all of them fulfill your needs.
> >>>>>
> >>>>> Solr (in DSE) and cassandra-lucene-index
> >>>>> <https://github.com/stratio/cassandra-lucene-index> are very well
> >>>>> integrated with Cassandra through its secondary index interface. If you
> >>>>> choose Elasticsearch you will need to code the integration yourself
> >>>>> (write mutex, synchronization of both clusters; imagine something
> >>>>> written to Cassandra that failed to write to Elasticsearch).
> >>>>>
> >>>>> I know I am not the most suitable person to recommend our product
> >>>>> cassandra-lucene-index
> >>>>> <https://github.com/stratio/cassandra-lucene-index>, but it is open
> >>>>> source, so just take a look.
> >>>>>
> >>>>>
> >>>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi298@gmail.com>:
> >>>>>
> >>>>>> Hi Eduardo,
> >>>>>>
> >>>>>> We are trying to build advanced search functionality in which we
> >>>>>> can do partial search on the actor, producer, director, etc.
> >>>>>> columns.
> >>>>>> If we denormalize, we have to create tables such as:
> >>>>>> video_by_actor
> >>>>>> video_by_producer
> >>>>>> video_by_director
> >>>>>> video_by_date
> >>>>>> etc.
> >>>>>> With denormalized tables, Cassandra only allows equality searches,
> >>>>>> so to implement partial search we would need Solr on all of the
> >>>>>> above tables.
> >>>>>>
> >>>>>> That is my thinking, but I do not think putting Apache Solr on every
> >>>>>> table is the correct approach.
> >>>>>>
> >>>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
> >>>>>> nandanpriyadarshi298@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Eduardo,
> >>>>>>>
> >>>>>>> For queries 1-6 that you mentioned, we would have to proceed with
> >>>>>>> a table like the one below:
> >>>>>>> create table videos (
> >>>>>>> videoid uuid primary key,
> >>>>>>> title text,
> >>>>>>> actor text,
> >>>>>>> producer text,
> >>>>>>> release_date timestamp,
> >>>>>>> description text,
> >>>>>>> music text,
> >>>>>>> etc...
> >>>>>>> );
> >>>>>>> This table stores video data keyed by videoid, and the uuid
> >>>>>>> guarantees uniqueness.
> >>>>>>> But as we know, one movie has multiple actors, multiple producers and
> >>>>>>> multiple music contributors, so how can we store all of these? The
> >>>>>>> only option left is to use collection-type columns.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
> >>>>>>> eduardoalonso@stratio.com> wrote:
> >>>>>>>
> >>>>>>>> Correction: "TLDR" should be "PS".
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <
> >>>>>>>> eduardoalonso@stratio.com>:
> >>>>>>>>
> >>>>>>>>> Hi Nandan:
> >>>>>>>>>
> >>>>>>>>> So, your system must provide these queries:
> >>>>>>>>>
> >>>>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
> >>>>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
> >>>>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
> >>>>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer = '...';
> >>>>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
> >>>>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Queries 1-5 can be served by Cassandra alone, by denormalizing
> >>>>>>>>> tables the way you mentioned but without Solr (only for equality
> >>>>>>>>> clauses, though):
> >>>>>>>>>
> >>>>>>>>> video_by_actor;
> >>>>>>>>> video_by_producer;
> >>>>>>>>> video_by_music;
> >>>>>>>>> video_by_actor_and_producer;
> >>>>>>>>> video_by_actor_and_music;
> >>>>>>>>>
> >>>>>>>>> For query number 6 you need a search engine:
> >>>>>>>>>
> >>>>>>>>> Solr
> >>>>>>>>> Elasticsearch
> >>>>>>>>> cassandra-lucene-index
> >>>>>>>>> <https://github.com/stratio/cassandra-lucene-index>
> >>>>>>>>> SASI
> >>>>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
> >>>>>>>>>
> >>>>>>>>> I think that, just for your query, the easiest way is to
> >>>>>>>>> build a SASI index.
> >>>>>>>>> TLDR: I work for Stratio on cassandra-lucene-index, but for your
> >>>>>>>>> basic query (only one dimension), SASI indexes will work for you.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
OK, then let's try to implement it and use cassandra-stress to check the
performance.
I worked on another data model, for book storage at my company, in the same
situation: a single table with 80 columns and bookid uuid as the primary key,
with Solr implemented on top. That's why I am trying to find the best
possible solution for upcoming projects.
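
As a rough sketch of the single-table approach discussed in this thread, the
statements below define the videos table with collection columns, a SASI
index for substring search on title, and a regular secondary index on the
actor list. The index names and sample values are assumptions. SASI requires
Apache Cassandra 3.4 or later, and the index options shown should be checked
against the version in use.

```cql
-- Assumes a keyspace has already been selected with USE.
CREATE TABLE videos (
    videoid      uuid PRIMARY KEY,
    title        text,
    actor        list<text>,
    producer     list<text>,
    release_date timestamp,
    description  text,
    music        text
);

-- SASI index in CONTAINS mode so that LIKE '%...%' is allowed on title.
CREATE CUSTOM INDEX videos_title_idx ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};

-- Partial match: titles containing "Harry" at any position.
SELECT videoid, title FROM videos WHERE title LIKE '%Harry%';

-- Secondary index on the values of the actor list, for CONTAINS queries.
CREATE INDEX videos_actor_idx ON videos (actor);

SELECT videoid, title FROM videos WHERE actor CONTAINS 'Some Actor';
```

Both indexes are local to each node, so these queries fan out across the
cluster. That is one reason to measure them with cassandra-stress before
settling on this design.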



Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Virtual tokens are not recommended when using Solr or
cassandra-lucene-index.

If you use your table schema, you will not have any problem with partition
size, because your table is *not* a wide-row table (it does not have
clustering keys).
The only limit is that one record with those 15 or 20 columns must not be
larger than 100MB. You will have plenty of room.
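
To make the distinction concrete, the sketch below contrasts a table where
every partition holds a single row with one where a partition grows with the
data. The video_by_year layout and its release_year column are assumptions
for illustration.

```cql
-- One row per partition: videoid is the whole primary key, so a
-- partition never grows beyond the columns of a single video.
CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title   text,
    music   text
);

-- Wide partition for comparison: videoid is a clustering column, so all
-- videos released in the same year accumulate in one partition.
CREATE TABLE video_by_year (
    release_year int,
    videoid      uuid,
    title        text,
    PRIMARY KEY ((release_year), videoid)
);
```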

Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 12:36 GMT+02:00 @Nandan@ <na...@gmail.com>:

> And due to single table videos, maybe it will go with around 15,20
> columns, then we need to also think very carefully about partition sizes
> also.
>
> On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <na...@gmail.com>
> wrote:
>
>> Yes this is only Option I am also thinking like this as my second
>> options. Before this I was thinking to do denormalize table based on search
>> columns, but due to partial search this will be not that effective.
>>
>> Now suppose , if we are going with this single table as videos. and
>> implemented with Solr/Lucene, then need to also care about num_tokens ?
>>
>>
>> On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <
>> eduardoalonso@stratio.com> wrote:
>>
>>> Using cassandra collections
>>>
>>> CREATE TABLE videos (
>>> videoid uuid primary key,
>>> title text,
>>> actor list<text>,
>>> producer list<text>,
>>> release_date timestamp,
>>> description text,
>>> music text,
>>> etc...
>>> );
>>>
>>> When using collection you need to take care of its length. Collections
>>> are designed to store
>>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only
>>> a small amount of data
>>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>
>>> .
>>> 5/10 actors per movie is ok.
>>>
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>
>>>> So In short we have to go with one single table as videos and put
>>>> primary key as videoid uuid.
>>>> But then how can we able to handle multiple actor name and producer
>>>> name. ?
>>>>
>>>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>>>> eduardoalonso@stratio.com> wrote:
>>>>
>>>>> Yes, you are right.
>>>>>
>>>>> Table denormalization is useful just when you have unique primary
>>>>> keys, not your case.
>>>>> Denormalized tables are only different in its primary key, every
>>>>> denormalized table contains all the data (it just change how it is
>>>>> structured). So, if you need to index it, do it with just one table (the
>>>>> one you showed us with videoid as the primary key is ok).
>>>>>
>>>>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and
>>>>> all of them fulfill all your needs.
>>>>>
>>>>> Solr (in DSE) and cassandra-lucene-index
>>>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>>>> integrated with cassandra using its secondary index interface. If you
>>>>> choose elastic search you will need to code the integration (write mutex,
>>>>> both cluster synchronization (imagine something written in cassandra but
>>>>> failed to write in elastic))
>>>>>
>>>>> I know I am not the most suitable person to recommend our own product,
>>>>> cassandra-lucene-index
>>>>> <https://github.com/stratio/cassandra-lucene-index>, but it is open
>>>>> source, so just take a look.
>>>>>
>>>>> Eduardo Alonso
>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>>>>> <https://twitter.com/StratioBD>*
>>>>>
>>>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>>
>>>>>> Hi Eduardo,
>>>>>>
>>>>>> We are trying to build an advanced search functionality in
>>>>>> which we can do partial search on the actor, producer, director,
>>>>>> etc. columns.
>>>>>> So if we denormalize the tables, we have to create tables
>>>>>> such as the ones below:
>>>>>> video_by_actor
>>>>>> video_by_producer
>>>>>> video_by_director
>>>>>> video_by_date
>>>>>> etc..
>>>>>> With denormalized tables, Cassandra only allows us to do equality
>>>>>> search, so to implement partial search we would need to set up Solr on
>>>>>> all of the above tables.
>>>>>>
>>>>>> This is my thinking, but I do not think setting up Apache Solr on all
>>>>>> tables is the correct way.
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Eduardo,
>>>>>>>
>>>>>>> As you mentioned queries 1-6,
>>>>>>> in this case we have to proceed with a table like the one below:
>>>>>>> create table videos (
>>>>>>> videoid uuid primary key,
>>>>>>> title text,
>>>>>>> actor text,
>>>>>>> producer text,
>>>>>>> release_date timestamp,
>>>>>>> description text,
>>>>>>> music text,
>>>>>>> etc...
>>>>>>> );
>>>>>>> This table will store video data keyed by the PK videoid, and the
>>>>>>> uuid guarantees uniqueness.
>>>>>>> But as we know, one movie has multiple actors, multiple
>>>>>>> producers and multiple music contributors, so how can we store all of these? The only
>>>>>>> option left is to use collection-type columns.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>>>>> eduardoalonso@stratio.com> wrote:
>>>>>>>
>>>>>>>> TLDR shouldBe *PD
>>>>>>>>
>>>>>>>> Eduardo Alonso
>>>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>>>> Tel: +34 91 828 6473 // www.stratio.com
>>>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>>>
>>>>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <
>>>>>>>> eduardoalonso@stratio.com>:
>>>>>>>>
>>>>>>>>> Hi Nandan:
>>>>>>>>>
>>>>>>>>> So, your system must provide these queries:
>>>>>>>>>
>>>>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Queries 1-5 can be served by cassandra alone,
>>>>>>>>> denormalizing tables the way you mentioned but without solr
>>>>>>>>> (this works only for equality clauses, though):
>>>>>>>>>
>>>>>>>>> video_by_actor;
>>>>>>>>> video_by_producer;
>>>>>>>>> video_by_music;
>>>>>>>>> video_by_actor_and_producer;
>>>>>>>>> video_by_actor_and_music;
>>>>>>>>>
>>>>>>>>> For query number 6 you need a search engine:
>>>>>>>>>
>>>>>>>>> Solr
>>>>>>>>> ElasticSearch
>>>>>>>>> cassandra-lucene-index
>>>>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>>>>> SASI
>>>>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>>>>
>>>>>>>>> I think that, just for your query, the easiest way to get it is to
>>>>>>>>> build a SASI index.
>>>>>>>>> TLDR: I work for Stratio on cassandra-lucene-index, but for your
>>>>>>>>> basic query (only one dimension), SASI indexes will work for you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Eduardo Alonso
>>>>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>>>>> Tel: +34 91 828 6473 // www.stratio.com
>>>>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>>>>
>>>>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi298@gmail.com
>>>>>>>>> >:
>>>>>>>>>
>>>>>>>>>> But the condition is that I am working with Apache Cassandra, in
>>>>>>>>>> which I have to store my data, and then I have to implement
>>>>>>>>>> partial search capability.
>>>>>>>>>> If we need to search on the full primary key, then Cassandra is
>>>>>>>>>> really the best and easiest to work with, but in the case of flexible
>>>>>>>>>> search, I am getting confused.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>>>>>
>>>>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <
>>>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Do you mean to use Elasticsearch with Cassandra?
>>>>>>>>>>> I am even thinking of using Apache Solr with Cassandra.
>>>>>>>>>>> In that case I have to create distributed tables such as:
>>>>>>>>>>> 1) video_by_title, video_by_actor, video_by_year, etc.
>>>>>>>>>>> 2) After creating the tables, I will have to configure a Solr core on
>>>>>>>>>>> all tables.
>>>>>>>>>>>
>>>>>>>>>>> Is it like this?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>>>>> much simpler
>>>>>>>>>>>>
>>>>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <
>>>>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Hi,
>>>>>>>>>>>> >
>>>>>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>>>>>> which we have different types of users as well as different user
>>>>>>>>>>>> functionality.
>>>>>>>>>>>> > But currently, my concern is about Search video module based
>>>>>>>>>>>> on different fields.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Query patterns are as below:-
>>>>>>>>>>>> > 1) Select video by actor.
>>>>>>>>>>>> > 2) select video by producer.
>>>>>>>>>>>> > 3) select video by music.
>>>>>>>>>>>> > 4) select video by actor and producer.
>>>>>>>>>>>> > 5) select video by actor and music.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Note: In short, we want to build an advanced search
>>>>>>>>>>>> > module with which we can search in any way and get the desired results.
>>>>>>>>>>>> >
>>>>>>>>>>>> > During a search we also need partial search, so that if
>>>>>>>>>>>> > a user searches for the title "Harry", we can return
>>>>>>>>>>>> > all videos whose title contains "Harry" at any position.
>>>>>>>>>>>> >
>>>>>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>>>>>> > video_by_actor, video_by_producer, etc. and implement Solr queries on all
>>>>>>>>>>>> > tables. Otherwise,
>>>>>>>>>>>> > is there any other way to implement this search
>>>>>>>>>>>> > module effectively?
>>>>>>>>>>>> >
>>>>>>>>>>>> > Please suggest.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
- Virtual tokens (the num_tokens setting) are not recommended when using Solr or
cassandra-lucene-index.

With your table schema you will not have any problem with partition
size, because your table is *not* a wide-row table (it does not have
clustering keys), so each partition holds a single row.
One record with those 15 or 20 columns only needs to stay below
100MB. You will have plenty of room.
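
As a rough sanity check on that 100MB figure, here is a back-of-the-envelope estimate in Python. Every byte size in it is an assumed average invented for illustration (not a measurement from any real dataset), the 13 extra columns simply pad the schema out to about 20, and per-cell storage overhead is ignored:

```python
# Rough, illustrative estimate of one row (= one partition) of the videos table.
# All sizes are ASSUMED averages, not measurements.
ASSUMED_COLUMN_BYTES = {
    "videoid": 16,        # a uuid is 16 bytes
    "title": 100,
    "actor": 10 * 30,     # ~10 names of ~30 bytes each (assumption)
    "producer": 5 * 30,   # ~5 names of ~30 bytes each (assumption)
    "release_date": 8,    # a timestamp is 8 bytes
    "description": 2_000,
    "music": 100,
}
OTHER_COLUMNS = 13         # hypothetical extra columns, to reach ~20 in total
ASSUMED_OTHER_BYTES = 200  # assumed average size of each extra column

LIMIT_BYTES = 100 * 1024 * 1024  # the 100MB figure mentioned above


def estimate_row_bytes() -> int:
    """Sum the assumed payload sizes, ignoring per-cell storage overhead."""
    return sum(ASSUMED_COLUMN_BYTES.values()) + OTHER_COLUMNS * ASSUMED_OTHER_BYTES


if __name__ == "__main__":
    row_bytes = estimate_row_bytes()
    print(f"estimated row size: {row_bytes} bytes")
    print(f"share of the 100MB budget: {row_bytes / LIMIT_BYTES:.6%}")
```

Under these assumptions a row is a few kilobytes, several orders of magnitude below 100MB, so partition size only becomes a concern if a column such as description or a collection grows very large.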

Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 12:36 GMT+02:00 @Nandan@ <na...@gmail.com>:

> And with a single videos table, it may end up with around 15 to 20
> columns, so we also need to think very carefully about partition
> sizes.
>
> On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <na...@gmail.com>
> wrote:
>
>> Yes, this is the only option; I was also thinking of it as my second
>> option. Before this I was thinking of denormalizing the table based on the search
>> columns, but because of partial search this will not be that effective.
>>
>> Now suppose we go with this single videos table and
>> implement it with Solr/Lucene; do we then also need to care about num_tokens?

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
And with a single videos table, it may end up with around 15 to 20 columns,
so we also need to think very carefully about partition sizes.

On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <na...@gmail.com>
wrote:

> Yes, this is the only option; I was also thinking of it as my second option.
> Before this I was thinking of denormalizing the table based on the search columns,
> but because of partial search this will not be that effective.
>
> Now suppose we go with this single videos table and
> implement it with Solr/Lucene; do we then also need to care about num_tokens?

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
And with a single videos table, it may end up with around 15 to 20 columns,
so we also need to think very carefully about partition sizes.

On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <na...@gmail.com>
wrote:

> Yes, this is the only option; I was also thinking of it as my second option.
> Before this I was thinking of denormalizing the table based on the search columns,
> but because of partial search this will not be that effective.
>
> Now suppose we go with this single videos table and
> implement it with Solr/Lucene; do we then also need to care about num_tokens?
>
>
> On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <eduardoalonso@stratio.com
> > wrote:
>
>> Using cassandra collections
>>
>> CREATE TABLE videos (
>> videoid uuid primary key,
>> title text,
>> actor list<text>,
>> producer list<text>,
>> release_date timestamp,
>> description text,
>> music text,
>> etc...
>> );
>>
>> When using collection you need to take care of its length. Collections
>> are designed to store
>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only
>> a small amount of data
>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>
>> .
>> 5/10 actors per movie is ok.
>>
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>
>>> So In short we have to go with one single table as videos and put
>>> primary key as videoid uuid.
>>> But then how can we able to handle multiple actor name and producer
>>> name. ?
>>>
>>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>>> eduardoalonso@stratio.com> wrote:
>>>
>>>> Yes, you are right.
>>>>
>>>> Table denormalization is useful just when you have unique primary keys,
>>>> not your case.
>>>> Denormalized tables are only different in its primary key, every
>>>> denormalized table contains all the data (it just change how it is
>>>> structured). So, if you need to index it, do it with just one table (the
>>>> one you showed us with videoid as the primary key is ok).
>>>>
>>>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and
>>>> all of them fulfill all your needs.
>>>>
>>>> Solr (in DSE) and cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>>> integrated with cassandra using its secondary index interface. If you
>>>> choose elastic search you will need to code the integration (write mutex,
>>>> both cluster synchronization (imagine something written in cassandra but
>>>> failed to write in elastic))
>>>>
>>>> I know i am not the most suitable to recommend you to use our product
>>>> cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>>>> source, just take a look.
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>
>>>>> Hi Eduardo,
>>>>>
>>>>> And As we are trying to build an advanced search functionality in
>>>>> which we can able to do partial search based on actor, producer, director,
>>>>> etc. columns.
>>>>> So if we do denormalization of tables then we have to create tables
>>>>> such as below :-
>>>>> video_by_actor
>>>>> video_by_producer
>>>>> video_by_director
>>>>> video_by_date
>>>>> etc..
>>>>> By using denormalized, Cassandra only allows us to do equality search,
>>>>> but for implementing Partial search we need to implement solr on all above
>>>>> tables.
>>>>>
>>>>> This is my thinking, but I think this will be not correct way to
>>>>> implement Apache Solr on all tables.
>>>>>
>>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>
>>>>>> Hi Eduardo,
>>>>>>
>>>>>> As you mentioned queries 1-6 ,
>>>>>> In this condition, we have to proceed with a table like as below :-
>>>>>> create table videos (
>>>>>> videoid uuid primary key,
>>>>>> title text,
>>>>>> actor text,
>>>>>> producer text,
>>>>>> release_date timestamp,
>>>>>> description text,
>>>>>> music text,
>>>>>> etc...
>>>>>> );
>>>>>> This table stores video data keyed by the PK videoid, and the uuid
>>>>>> guarantees uniqueness.
>>>>>> But as we know, one movie has multiple actors, multiple producers, and
>>>>>> multiple music contributors, so how can we store all of these? The only
>>>>>> option left is to use collection-type columns.
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>>>> eduardoalonso@stratio.com> wrote:
>>>>>>
>>>>>>> TLDR shouldBe *PD
>>>>>>>
>>>>>>> Eduardo Alonso
>>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>>
>>>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalonso@stratio.com
>>>>>>> >:
>>>>>>>
>>>>>>>> Hi Nandan:
>>>>>>>>
>>>>>>>> So, your system must provide these queries:
>>>>>>>>
>>>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
>>>>>>>>
>>>>>>>>
>>>>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>>>>> (Indeed, just for equality clauses)
>>>>>>>>
>>>>>>>> video_by_actor;
>>>>>>>> video_by_producer;
>>>>>>>> video_by_music;
>>>>>>>> video_by_actor_and_producer;
>>>>>>>> video_by_actor_and_music;
>>>>>>>>
>>>>>>>> For queries number 6 you need a search engine.
>>>>>>>>
>>>>>>>> Solr
>>>>>>>> ElasticSearch
>>>>>>>> cassandra-lucene-index
>>>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>>>> SASI
>>>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>>>
>>>>>>>> I think, just for your query,  the easiest way to get it is to
>>>>>>>> build a SASI index.
>>>>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your
>>>>>>>> basic query (only one dimension), SASI indexes will work for you.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Eduardo Alonso
>>>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>>>
>>>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>
>>>>>>>> :
>>>>>>>>
>>>>>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>>>>>> which I have to store my data into Cassandra and then have to implement
>>>>>>>>> partial search capability.
>>>>>>>>> If we need to search based on full search  primary key, then it
>>>>>>>>> really best and easy to work with Cassandra , but in case of flexible
>>>>>>>>> search , I am getting confused.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>>>>
>>>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi298@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>>>>>> In that case I have to create distributed tables such as:-
>>>>>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>>>>>> 2) After creating Tables , will have to configure solr core on
>>>>>>>>>> all tables.
>>>>>>>>>>
>>>>>>>>>> Is it like this ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>>>> much simpler
>>>>>>>>>>>
>>>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <
>>>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi,
>>>>>>>>>>> >
>>>>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>>>>> which we have different types of users as well as different user
>>>>>>>>>>> functionality.
>>>>>>>>>>> > But currently, my concern is about Search video module based
>>>>>>>>>>> on different fields.
>>>>>>>>>>> >
>>>>>>>>>>> > Query patterns are as below:-
>>>>>>>>>>> > 1) Select video by actor.
>>>>>>>>>>> > 2) select video by producer.
>>>>>>>>>>> > 3) select video by music.
>>>>>>>>>>> > 4) select video by actor and producer.
>>>>>>>>>>> > 5) select video by actor and music.
>>>>>>>>>>> >
>>>>>>>>>>> > Note: - In short, We want to establish an advanced search
>>>>>>>>>>> module by which we can search by anyway and get the desired results.
>>>>>>>>>>> >
>>>>>>>>>>> > During a search , we need partial search also such that if any
>>>>>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>>>>>> videos whose
>>>>>>>>>>> >  title contains "Harry" at any location.
>>>>>>>>>>> >
>>>>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>>>>> tables. Otherwise,
>>>>>>>>>>> > is there any others way by which we can implement this search
>>>>>>>>>>> module effectively.
>>>>>>>>>>> >
>>>>>>>>>>> > Please suggest.
>>>>>>>>>>> >
>>>>>>>>>>> > Best regards,
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Yes, this is the only option; I had also been considering it as my second
option. Earlier I was thinking of denormalizing tables based on the search
columns, but because of partial search that would not be very effective.

Now suppose we go with this single videos table and implement search with
Solr/Lucene: do we then also need to care about num_tokens?


On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <ed...@stratio.com>
wrote:

> Using cassandra collections
>
> CREATE TABLE videos (
> videoid uuid primary key,
> title text,
> actor list<text>,
> producer list<text>,
> release_date timestamp,
> description text,
> music text,
> etc...
> );
>
> When using collections you need to take care of their length. Collections
> are designed to store only a small amount of data
> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>.
> 5 to 10 actors per movie is fine.
>
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:
>
>> So In short we have to go with one single table as videos and put primary
>> key as videoid uuid.
>> But then how can we able to handle multiple actor name and producer name.
>> ?
>>
>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>> eduardoalonso@stratio.com> wrote:
>>
>>> Yes, you are right.
>>>
>>> Table denormalization is useful just when you have unique primary keys,
>>> not your case.
>>> Denormalized tables are only different in its primary key, every
>>> denormalized table contains all the data (it just change how it is
>>> structured). So, if you need to index it, do it with just one table (the
>>> one you showed us with videoid as the primary key is ok).
>>>
>>> Solr, Elastic and cassandra-lucene-index are all based on Lucene and
>>> all of them fulfill all your needs.
>>>
>>> Solr (in DSE) and cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>> integrated with cassandra using its secondary index interface. If you
>>> choose elastic search you will need to code the integration (write mutex,
>>> both cluster synchronization (imagine something written in cassandra but
>>> failed to write in elastic))
>>>
>>> I know i am not the most suitable to recommend you to use our product
>>> cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>>> source, just take a look.
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>
>>>> Hi Eduardo,
>>>>
>>>> And As we are trying to build an advanced search functionality in which
>>>> we can able to do partial search based on actor, producer, director, etc.
>>>> columns.
>>>> So if we do denormalization of tables then we have to create tables
>>>> such as below :-
>>>> video_by_actor
>>>> video_by_producer
>>>> video_by_director
>>>> video_by_date
>>>> etc..
>>>> By using denormalized, Cassandra only allows us to do equality search,
>>>> but for implementing Partial search we need to implement solr on all above
>>>> tables.
>>>>
>>>> This is my thinking, but I think this will be not correct way to
>>>> implement Apache Solr on all tables.
>>>>
>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>
>>>>> Hi Eduardo,
>>>>>
>>>>> As you mentioned queries 1-6 ,
>>>>> In this condition, we have to proceed with a table like as below :-
>>>>> create table videos (
>>>>> videoid uuid primary key,
>>>>> title text,
>>>>> actor text,
>>>>> producer text,
>>>>> release_date timestamp,
>>>>> description text,
>>>>> music text,
>>>>> etc...
>>>>> );
>>>>> This table stores video data keyed by the PK videoid, and the uuid
>>>>> guarantees uniqueness.
>>>>> But as we know, one movie has multiple actors, multiple producers, and
>>>>> multiple music contributors, so how can we store all of these? The only
>>>>> option left is to use collection-type columns.
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>>> eduardoalonso@stratio.com> wrote:
>>>>>
>>>>>> TLDR shouldBe *PD
>>>>>>
>>>>>> Eduardo Alonso
>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>
>>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>
>>>>>> :
>>>>>>
>>>>>>> Hi Nandan:
>>>>>>>
>>>>>>> So, your system must provide these queries:
>>>>>>>
>>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
>>>>>>>
>>>>>>>
>>>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>>>> (Indeed, just for equality clauses)
>>>>>>>
>>>>>>> video_by_actor;
>>>>>>> video_by_producer;
>>>>>>> video_by_music;
>>>>>>> video_by_actor_and_producer;
>>>>>>> video_by_actor_and_music;
>>>>>>>
>>>>>>> For queries number 6 you need a search engine.
>>>>>>>
>>>>>>> Solr
>>>>>>> ElasticSearch
>>>>>>> cassandra-lucene-index
>>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>>> SASI
>>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>>
>>>>>>> I think, just for your query,  the easiest way to get it is to build
>>>>>>> a SASI index.
>>>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your
>>>>>>> basic query (only one dimension), SASI indexes will work for you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Eduardo Alonso
>>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>>
>>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>>>>
>>>>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>>>>> which I have to store my data into Cassandra and then have to implement
>>>>>>>> partial search capability.
>>>>>>>> If we need to search based on full search  primary key, then it
>>>>>>>> really best and easy to work with Cassandra , but in case of flexible
>>>>>>>> search , I am getting confused.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>>>
>>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>>>>> In that case I have to create distributed tables such as:-
>>>>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>>>>> tables.
>>>>>>>>>
>>>>>>>>> Is it like this ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>>> much simpler
>>>>>>>>>>
>>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <
>>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hi,
>>>>>>>>>> >
>>>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>>>> which we have different types of users as well as different user
>>>>>>>>>> functionality.
>>>>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>>>>> different fields.
>>>>>>>>>> >
>>>>>>>>>> > Query patterns are as below:-
>>>>>>>>>> > 1) Select video by actor.
>>>>>>>>>> > 2) select video by producer.
>>>>>>>>>> > 3) select video by music.
>>>>>>>>>> > 4) select video by actor and producer.
>>>>>>>>>> > 5) select video by actor and music.
>>>>>>>>>> >
>>>>>>>>>> > Note: - In short, We want to establish an advanced search
>>>>>>>>>> module by which we can search by anyway and get the desired results.
>>>>>>>>>> >
>>>>>>>>>> > During a search , we need partial search also such that if any
>>>>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>>>>> videos whose
>>>>>>>>>> >  title contains "Harry" at any location.
>>>>>>>>>> >
>>>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>>>> tables. Otherwise,
>>>>>>>>>> > is there any others way by which we can implement this search
>>>>>>>>>> module effectively.
>>>>>>>>>> >
>>>>>>>>>> > Please suggest.
>>>>>>>>>> >
>>>>>>>>>> > Best regards,
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Using Cassandra collections:

CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor list<text>,
    producer list<text>,
    release_date timestamp,
    description text,
    music text
    -- etc...
);

When using collections you need to take care of their length. Collections are
designed to store only a small amount of data
<http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>.
5 to 10 actors per movie is fine.
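
As a rough sketch of how the collection columns above could be written and
queried by a single element (the index name and sample values are made up
for illustration, and the elided "etc..." columns are left out):

```cql
-- Hypothetical table mirroring the columns listed above.
CREATE TABLE IF NOT EXISTS videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor list<text>,
    producer list<text>,
    release_date timestamp,
    description text,
    music text
);

-- A secondary index on the list values is what makes CONTAINS usable
-- without ALLOW FILTERING (index name is an assumption).
CREATE INDEX IF NOT EXISTS videos_actor_idx ON videos (actor);

-- Sample row with placeholder names.
INSERT INTO videos (videoid, title, actor, producer, release_date)
VALUES (uuid(), 'Example Title', ['Actor One', 'Actor Two'],
        ['Producer One'], '2017-06-12');

-- Matches rows whose actor list has an element equal to 'Actor One'.
SELECT videoid, title FROM videos WHERE actor CONTAINS 'Actor One';
```

Note that CONTAINS compares whole list elements, so this only covers the
equality-style lookups; partial matching still needs one of the search
options discussed in this thread.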


Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:

> So In short we have to go with one single table as videos and put primary
> key as videoid uuid.
> But then how can we able to handle multiple actor name and producer name.
> ?
>
> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <eduardoalonso@stratio.com>
> wrote:
>
>> Yes, you are right.
>>
>> Table denormalization is useful just when you have unique primary keys,
>> not your case.
>> Denormalized tables are only different in its primary key, every
>> denormalized table contains all the data (it just change how it is
>> structured). So, if you need to index it, do it with just one table (the
>> one you showed us with videoid as the primary key is ok).
>>
>> Solr, Elastic and cassandra-lucene-index are all based on Lucene and all
>> of them fulfill all your needs.
>>
>> Solr (in DSE) and cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index> are very well
>> integrated with cassandra using its secondary index interface. If you
>> choose elastic search you will need to code the integration (write mutex,
>> both cluster synchronization (imagine something written in cassandra but
>> failed to write in elastic))
>>
>> I know i am not the most suitable to recommend you to use our product
>> cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>> source, just take a look.
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>
>>> Hi Eduardo,
>>>
>>> And As we are trying to build an advanced search functionality in which
>>> we can able to do partial search based on actor, producer, director, etc.
>>> columns.
>>> So if we do denormalization of tables then we have to create tables such
>>> as below :-
>>> video_by_actor
>>> video_by_producer
>>> video_by_director
>>> video_by_date
>>> etc..
>>> By using denormalized, Cassandra only allows us to do equality search,
>>> but for implementing Partial search we need to implement solr on all above
>>> tables.
>>>
>>> This is my thinking, but I think this will be not correct way to
>>> implement Apache Solr on all tables.
>>>
>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi298@gmail.com>
>>> wrote:
>>>
>>>> Hi Eduardo,
>>>>
>>>> As you mentioned queries 1-6 ,
>>>> In this condition, we have to proceed with a table like as below :-
>>>> create table videos (
>>>> videoid uuid primary key,
>>>> title text,
>>>> actor text,
>>>> producer text,
>>>> release_date timestamp,
>>>> description text,
>>>> music text,
>>>> etc...
>>>> );
>>>> This table stores video data keyed by the PK videoid, and the uuid
>>>> guarantees uniqueness.
>>>> But as we know, one movie has multiple actors, multiple producers, and
>>>> multiple music contributors, so how can we store all of these? The only
>>>> option left is to use collection-type columns.
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>> eduardoalonso@stratio.com> wrote:
>>>>
>>>>> TLDR shouldBe *PD
>>>>>
>>>>> Eduardo Alonso
>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>>>> <https://twitter.com/StratioBD>*
>>>>>
>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>>>>
>>>>>> Hi Nandan:
>>>>>>
>>>>>> So, your system must provide these queries:
>>>>>>
>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
>>>>>>
>>>>>>
>>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>>> (Indeed, just for equality clauses)
>>>>>>
>>>>>> video_by_actor;
>>>>>> video_by_producer;
>>>>>> video_by_music;
>>>>>> video_by_actor_and_producer;
>>>>>> video_by_actor_and_music;
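
For illustration only, two of the denormalized tables named above might look
roughly like this. The column choices and key layout are assumptions, not
something specified in this thread:

```cql
-- Hypothetical layout: one partition per actor, one row per video.
CREATE TABLE IF NOT EXISTS video_by_actor (
    actor text,
    videoid uuid,
    title text,
    PRIMARY KEY ((actor), videoid)
);

-- Hypothetical layout: composite partition key for the two-field query.
CREATE TABLE IF NOT EXISTS video_by_actor_and_producer (
    actor text,
    producer text,
    videoid uuid,
    title text,
    PRIMARY KEY ((actor, producer), videoid)
);

-- Query 1: equality on the partition key.
SELECT videoid, title FROM video_by_actor WHERE actor = 'Actor One';

-- Query 4: equality on both partition key columns.
SELECT videoid, title FROM video_by_actor_and_producer
WHERE actor = 'Actor One' AND producer = 'Producer One';
```

With this layout the application writes one row per actor (and per
actor/producer pair) for every video, which is why these tables only serve
exact-match lookups.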
>>>>>>
>>>>>> For queries number 6 you need a search engine.
>>>>>>
>>>>>> Solr
>>>>>> ElasticSearch
>>>>>> cassandra-lucene-index
>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>> SASI
>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>
>>>>>> I think, just for your query,  the easiest way to get it is to build
>>>>>> a SASI index.
>>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>>>>> query (only one dimension), SASI indexes will work for you.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Eduardo Alonso
>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>
>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>>>
>>>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>>>> which I have to store my data into Cassandra and then have to implement
>>>>>>> partial search capability.
>>>>>>> If we need to search based on full search  primary key, then it
>>>>>>> really best and easy to work with Cassandra , but in case of flexible
>>>>>>> search , I am getting confused.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>
>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>>
>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>>>> In that case I have to create distributed tables such as:-
>>>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>>>> tables.
>>>>>>>>
>>>>>>>> Is it like this ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>> much simpler
>>>>>>>>>
>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <
>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>> >
>>>>>>>>> > Hi,
>>>>>>>>> >
>>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>>> which we have different types of users as well as different user
>>>>>>>>> functionality.
>>>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>>>> different fields.
>>>>>>>>> >
>>>>>>>>> > Query patterns are as below:-
>>>>>>>>> > 1) Select video by actor.
>>>>>>>>> > 2) select video by producer.
>>>>>>>>> > 3) select video by music.
>>>>>>>>> > 4) select video by actor and producer.
>>>>>>>>> > 5) select video by actor and music.
>>>>>>>>> >
>>>>>>>>> > Note: - In short, We want to establish an advanced search module
>>>>>>>>> by which we can search by anyway and get the desired results.
>>>>>>>>> >
>>>>>>>>> > During a search , we need partial search also such that if any
>>>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>>>> videos whose
>>>>>>>>> >  title contains "Harry" at any location.
>>>>>>>>> >
>>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>>> tables. Otherwise,
>>>>>>>>> > is there any others way by which we can implement this search
>>>>>>>>> module effectively.
>>>>>>>>> >
>>>>>>>>> > Please suggest.
>>>>>>>>> >
>>>>>>>>> > Best regards,
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Using Cassandra collections:

CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor list<text>,
    producer list<text>,
    release_date timestamp,
    description text,
    music text
    -- ... other columns as needed
);

When using collections, you need to watch their size. Collections are
designed to store only a small amount of data
<http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>.
5 to 10 actors per movie is fine.
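
As a rough sketch of how the list columns above could be written and
queried: the index name, the actor name and the uuid below are made up for
illustration, and a regular secondary index on a collection is assumed to
be acceptable for your data volume (it has its own performance trade-offs,
so please test it on your cluster).

```cql
-- Append one actor to the list without rewriting the whole collection.
-- The uuid is a hypothetical row id, for illustration only.
UPDATE videos SET actor = actor + ['Emma Watson']
WHERE videoid = 123e4567-e89b-12d3-a456-426655440000;

-- A secondary index on the collection column indexes its values,
-- which is what allows CONTAINS in the WHERE clause.
CREATE INDEX videos_actor_idx ON videos (actor);

-- Exact match on one element of the list (equality only, no partial match).
SELECT videoid, title FROM videos WHERE actor CONTAINS 'Emma Watson';
```

Note that CONTAINS here matches a whole list element, so it covers "video by
actor" but not the partial title search.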


Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 11:54 GMT+02:00 @Nandan@ <na...@gmail.com>:

> So In short we have to go with one single table as videos and put primary
> key as videoid uuid.
> But then how can we able to handle multiple actor name and producer name.
> ?
>
> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <eduardoalonso@stratio.com
> > wrote:
>
>> Yes, you are right.
>>
>> Table denormalization is useful just when you have unique primary keys,
>> not your case.
>> Denormalized tables are only different in its primary key, every
>> denormalized table contains all the data (it just change how it is
>> structured). So, if you need to index it, do it with just one table (the
>> one you showed us with videoid as the primary key is ok).
>>
>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and all
>> of them fulfill all your needs.
>>
>> Solr (in DSE) and cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index> are very well
>> integrated with cassandra using its secondary index interface. If you
>> choose elastic search you will need to code the integration (write mutex,
>> both cluster synchronization (imagine something written in cassandra but
>> failed to write in elastic))
>>
>> I know i am not the most suitable to recommend you to use our product
>> cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>> source, just take a look.
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>
>>> Hi Eduardo,
>>>
>>> And As we are trying to build an advanced search functionality in which
>>> we can able to do partial search based on actor, producer, director, etc.
>>> columns.
>>> So if we do denormalization of tables then we have to create tables such
>>> as below :-
>>> video_by_actor
>>> video_by_producer
>>> video_by_director
>>> video_by_date
>>> etc..
>>> By using denormalized, Cassandra only allows us to do equality search,
>>> but for implementing Partial search we need to implement solr on all above
>>> tables.
>>>
>>> This is my thinking, but I think this will be not correct way to
>>> implement Apache Solr on all tables.
>>>
>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi298@gmail.co
>>> m> wrote:
>>>
>>>> Hi Edurado,
>>>>
>>>> As you mentioned queries 1-6 ,
>>>> In this condition, we have to proceed with a table like as below :-
>>>> create table videos (
>>>> videoid uuid primary key,
>>>> title text,
>>>> actor text,
>>>> producer text,
>>>> release_date timestamp,
>>>> description text,
>>>> music text,
>>>> etc...
>>>> );
>>>> This table will store video data keyed on videoid, and the uuid
>>>> guarantees uniqueness.
>>>> But as we know, one movie has multiple actors, multiple producers,
>>>> and multiple music credits, so how can we store all of these? The only
>>>> option left is to use collection-type columns.
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>> eduardoalonso@stratio.com> wrote:
>>>>
>>>>> TLDR shouldBe *PD
>>>>>
>>>>> Eduardo Alonso
>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>>>> <https://twitter.com/StratioBD>*
>>>>>
>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>>>>
>>>>>> Hi Nandan:
>>>>>>
>>>>>> So, your system must provide these queries:
>>>>>>
>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>>>>
>>>>>>
>>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>>> (Indeed, just for equality clauses)
>>>>>>
>>>>>> video_by_actor;
>>>>>> video_by_producer;
>>>>>> video_by_music;
>>>>>> video_by_actor_and_producer;
>>>>>> video_by_actor_and_music;
>>>>>>
>>>>>> For queries number 6 you need a search engine.
>>>>>>
>>>>>> Solr
>>>>>> ElasticSearch
>>>>>> cassandra-lucene-index
>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>> SASI
>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>
>>>>>> I think, just for your query,  the easiest way to get it is to build
>>>>>> a SASI index.
>>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>>>>> query (only one dimension), SASI indexes will work for you.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Eduardo Alonso
>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com
>>>>>>  // *@stratiobd <https://twitter.com/StratioBD>*
>>>>>>
>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>>>
>>>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>>>> which I have to store my data into Cassandra and then have to implement
>>>>>>> partial search capability.
>>>>>>> If we need to search based on full search  primary key, then it
>>>>>>> really best and easy to work with Cassandra , but in case of flexible
>>>>>>> search , I am getting confused.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>
>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>>
>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>>>> In that case I have to create distributed tables such as:-
>>>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>>>> tables.
>>>>>>>>
>>>>>>>> Is it like this ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>> much simpler
>>>>>>>>>
>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <
>>>>>>>>> nandanpriyadarshi298@gmail.com> wrote:
>>>>>>>>> >
>>>>>>>>> > Hi,
>>>>>>>>> >
>>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>>> which we have different types of users as well as different user
>>>>>>>>> functionality.
>>>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>>>> different fields.
>>>>>>>>> >
>>>>>>>>> > Query patterns are as below:-
>>>>>>>>> > 1) Select video by actor.
>>>>>>>>> > 2) select video by producer.
>>>>>>>>> > 3) select video by music.
>>>>>>>>> > 4) select video by actor and producer.
>>>>>>>>> > 5) select video by actor and music.
>>>>>>>>> >
>>>>>>>>> > Note: - In short, We want to establish an advanced search module
>>>>>>>>> by which we can search by anyway and get the desired results.
>>>>>>>>> >
>>>>>>>>> > During a search , we need partial search also such that if any
>>>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>>>> videos whose
>>>>>>>>> >  title contains "Harry" at any location.
>>>>>>>>> >
>>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>>> tables. Otherwise,
>>>>>>>>> > is there any others way by which we can implement this search
>>>>>>>>> module effectively.
>>>>>>>>> >
>>>>>>>>> > Please suggest.
>>>>>>>>> >
>>>>>>>>> > Best regards,
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
So, in short, we have to go with a single videos table, with videoid (uuid)
as the primary key.
But then how can we handle multiple actor names and producer names?

On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <ed...@stratio.com>
wrote:

> Yes, you are right.
>
> Table denormalization is useful just when you have unique primary keys,
> not your case.
> Denormalized tables are only different in its primary key, every
> denormalized table contains all the data (it just change how it is
> structured). So, if you need to index it, do it with just one table (the
> one you showed us with videoid as the primary key is ok).
>
> Solr, Elastic and cassandra-lucene-index are both based on Lucene and all
> of them fulfill all your needs.
>
> Solr (in DSE) and cassandra-lucene-index
> <https://github.com/stratio/cassandra-lucene-index> are very well
> integrated with cassandra using its secondary index interface. If you
> choose elastic search you will need to code the integration (write mutex,
> both cluster synchronization (imagine something written in cassandra but
> failed to write in elastic))
>
> I know i am not the most suitable to recommend you to use our product
> cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>
> but it is open source, just take a look.
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:
>
>> Hi Eduardo,
>>
>> And As we are trying to build an advanced search functionality in which
>> we can able to do partial search based on actor, producer, director, etc.
>> columns.
>> So if we do denormalization of tables then we have to create tables such
>> as below :-
>> video_by_actor
>> video_by_producer
>> video_by_director
>> video_by_date
>> etc..
>> By using denormalized, Cassandra only allows us to do equality search,
>> but for implementing Partial search we need to implement solr on all above
>> tables.
>>
>> This is my thinking, but I think this will be not correct way to
>> implement Apache Solr on all tables.
>>
>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi298@gmail.com
>> > wrote:
>>
>>> Hi Edurado,
>>>
>>> As you mentioned queries 1-6 ,
>>> In this condition, we have to proceed with a table like as below :-
>>> create table videos (
>>> videoid uuid primary key,
>>> title text,
>>> actor text,
>>> producer text,
>>> release_date timestamp,
>>> description text,
>>> music text,
>>> etc...
>>> );
>>> This table will store video data keyed on videoid, and the uuid
>>> guarantees uniqueness.
>>> But as we know, one movie has multiple actors, multiple producers,
>>> and multiple music credits, so how can we store all of these? The only
>>> option left is to use collection-type columns.
>>>
>>>
>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>> eduardoalonso@stratio.com> wrote:
>>>
>>>> TLDR shouldBe *PD
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>>>
>>>>> Hi Nandan:
>>>>>
>>>>> So, your system must provide these queries:
>>>>>
>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>>>
>>>>>
>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>> (Indeed, just for equality clauses)
>>>>>
>>>>> video_by_actor;
>>>>> video_by_producer;
>>>>> video_by_music;
>>>>> video_by_actor_and_producer;
>>>>> video_by_actor_and_music;
>>>>>
>>>>> For queries number 6 you need a search engine.
>>>>>
>>>>> Solr
>>>>> ElasticSearch
>>>>> cassandra-lucene-index
>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>> SASI
>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>
>>>>> I think, just for your query,  the easiest way to get it is to build a
>>>>> SASI index.
>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>>>> query (only one dimension), SASI indexes will work for you.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Eduardo Alonso
>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>> 28224 Pozuelo de Alarcón, Madrid
>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>>>> <https://twitter.com/StratioBD>*
>>>>>
>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>>
>>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>>> which I have to store my data into Cassandra and then have to implement
>>>>>> partial search capability.
>>>>>> If we need to search based on full search  primary key, then it
>>>>>> really best and easy to work with Cassandra , but in case of flexible
>>>>>> search , I am getting confused.
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>
>>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>>
>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>>> In that case I have to create distributed tables such as:-
>>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>>> tables.
>>>>>>>
>>>>>>> Is it like this ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>>
>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>> much simpler
>>>>>>>>
>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <nandanpriyadarshi298@gmail.co
>>>>>>>> m> wrote:
>>>>>>>> >
>>>>>>>> > Hi,
>>>>>>>> >
>>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>>> which we have different types of users as well as different user
>>>>>>>> functionality.
>>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>>> different fields.
>>>>>>>> >
>>>>>>>> > Query patterns are as below:-
>>>>>>>> > 1) Select video by actor.
>>>>>>>> > 2) select video by producer.
>>>>>>>> > 3) select video by music.
>>>>>>>> > 4) select video by actor and producer.
>>>>>>>> > 5) select video by actor and music.
>>>>>>>> >
>>>>>>>> > Note: - In short, We want to establish an advanced search module
>>>>>>>> by which we can search by anyway and get the desired results.
>>>>>>>> >
>>>>>>>> > During a search , we need partial search also such that if any
>>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>>> videos whose
>>>>>>>> >  title contains "Harry" at any location.
>>>>>>>> >
>>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>> tables. Otherwise,
>>>>>>>> > is there any others way by which we can implement this search
>>>>>>>> module effectively.
>>>>>>>> >
>>>>>>>> > Please suggest.
>>>>>>>> >
>>>>>>>> > Best regards,
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Yes, you are right.

Table denormalization is only useful when you have unique primary keys,
which is not your case.
Denormalized tables differ only in their primary key; every denormalized
table contains all the data (only the way it is structured changes). So, if
you need to index it, do it on just one table (the one you showed us, with
videoid as the primary key, is fine).
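
For the partial title search on that single table, a minimal sketch with a
SASI index could look like the following. It assumes Apache Cassandra 3.4 or
later (where SASI is available) and the videos table from this thread; the
index name is made up for illustration.

```cql
-- SASI index in CONTAINS mode so that LIKE '%...%' is allowed on title.
-- The non-tokenizing analyzer is used here only to get
-- case-insensitive matching.
CREATE CUSTOM INDEX videos_title_idx ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};

-- Query 6 from the list: all videos whose title contains "Harry".
SELECT videoid, title FROM videos WHERE title LIKE '%Harry%';
```

This covers the one-dimension case only; combining several search fields in
one query is where the Lucene-based options become more attractive.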

Solr, Elastic and cassandra-lucene-index are all based on Lucene, and all
of them fulfill your needs.

Solr (in DSE) and cassandra-lucene-index
<https://github.com/stratio/cassandra-lucene-index> are very well
integrated with Cassandra through its secondary index interface. If you
choose Elasticsearch, you will need to code the integration yourself (write
mutex, synchronization between both clusters; imagine something written to
Cassandra that failed to be written to Elastic).

I know I am not the most suitable person to recommend our own product,
cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>,
but it is open source, so just take a look.

Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:

> Hi Eduardo,
>
> And As we are trying to build an advanced search functionality in which we
> can able to do partial search based on actor, producer, director, etc.
> columns.
> So if we do denormalization of tables then we have to create tables such
> as below :-
> video_by_actor
> video_by_producer
> video_by_director
> video_by_date
> etc..
> By using denormalized, Cassandra only allows us to do equality search, but
> for implementing Partial search we need to implement solr on all above
> tables.
>
> This is my thinking, but I think this will be not correct way to implement
> Apache Solr on all tables.
>
> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <na...@gmail.com>
> wrote:
>
>> Hi Edurado,
>>
>> As you mentioned queries 1-6 ,
>> In this condition, we have to proceed with a table like as below :-
>> create table videos (
>> videoid uuid primary key,
>> title text,
>> actor text,
>> producer text,
>> release_date timestamp,
>> description text,
>> music text,
>> etc...
>> );
>> This table will store video data keyed on videoid, and the uuid
>> guarantees uniqueness.
>> But as we know, one movie has multiple actors, multiple producers,
>> and multiple music credits, so how can we store all of these? The only
>> option left is to use collection-type columns.
>>
>>
>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>> eduardoalonso@stratio.com> wrote:
>>
>>> TLDR shouldBe *PD
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>>
>>>> Hi Nandan:
>>>>
>>>> So, your system must provide these queries:
>>>>
>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>>
>>>>
>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>> tables just the way your mentioned but without solr, just cassandra
>>>> (Indeed, just for equality clauses)
>>>>
>>>> video_by_actor;
>>>> video_by_producer;
>>>> video_by_music;
>>>> video_by_actor_and_producer;
>>>> video_by_actor_and_music;
>>>>
>>>> For queries number 6 you need a search engine.
>>>>
>>>> SOL
>>>> ElasticSearch
>>>> cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>> SASI
>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>
>>>> I think, just for your query,  the easiest way to get it is to build a
>>>> SASI index.
>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>>> query (only one dimension), SASI indexes will work for you.
>>>>
>>>>
>>>>
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>
>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>> which I have to store my data into Cassandra and then have to implement
>>>>> partial search capability.
>>>>> If we need to search based on full search  primary key, then it really
>>>>> best and easy to work with Cassandra , but in case of flexible search , I
>>>>> am getting confused.
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>
>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>
>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>> In that case I have to create distributed tables such as:-
>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>> tables.
>>>>>>
>>>>>> Is it like this ?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>
>>>>>>> Why not elasticsearch for this use case? It will make your life much
>>>>>>> simpler
>>>>>>>
>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>> which we have different types of users as well as different user
>>>>>>> functionality.
>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>> different fields.
>>>>>>> >
>>>>>>> > Query patterns are as below:-
>>>>>>> > 1) Select video by actor.
>>>>>>> > 2) select video by producer.
>>>>>>> > 3) select video by music.
>>>>>>> > 4) select video by actor and producer.
>>>>>>> > 5) select video by actor and music.
>>>>>>> >
>>>>>>> > Note: - In short, We want to establish an advanced search module
>>>>>>> by which we can search by anyway and get the desired results.
>>>>>>> >
>>>>>>> > During a search , we need partial search also such that if any
>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>> videos whose
>>>>>>> >  title contains "Harry" at any location.
>>>>>>> >
>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>> tables. Otherwise,
>>>>>>> > is there any others way by which we can implement this search
>>>>>>> module effectively.
>>>>>>> >
>>>>>>> > Please suggest.
>>>>>>> >
>>>>>>> > Best regards,
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Yes, you are right.

Table denormalization is useful only when you have unique primary keys,
which is not your case.
Denormalized tables differ only in their primary key; every denormalized
table contains all the data (only the way it is structured changes). So, if
you need to index it, do it with just one table (the one you showed us, with
videoid as the primary key, is fine).
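
As a minimal sketch of what indexing only that one table could look like,
assuming Cassandra 3.4 or later and the videos table quoted below. SASI is
used here purely as an illustration because it ships with Cassandra; the
index name videos_title_idx is hypothetical, and the default matching is
case sensitive:

```cql
-- Assumption: the videos table with videoid as primary key already exists.
-- A SASI index in CONTAINS mode allows substring matching on one column.
CREATE CUSTOM INDEX videos_title_idx ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = { 'mode': 'CONTAINS' };

-- Partial search: every video whose title contains "Harry" at any position.
SELECT videoid, title FROM videos WHERE title LIKE '%Harry%';
```

The same idea applies to the Lucene-based options: one index on the base
table, rather than one index per denormalized table.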

Solr, Elasticsearch and cassandra-lucene-index are all based on Lucene, and
all of them fulfill your needs.

Solr (in DSE) and cassandra-lucene-index
<https://github.com/stratio/cassandra-lucene-index> are very well
integrated with Cassandra through its secondary index interface. If you
choose Elasticsearch you will need to code the integration yourself (write
mutex, synchronization between both clusters; imagine something written to
Cassandra that failed to write to Elasticsearch).

I know I am not the most suitable person to recommend our product,
cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>,
but it is open source, so just take a look.

Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 11:18 GMT+02:00 @Nandan@ <na...@gmail.com>:

> Hi Eduardo,
>
> And As we are trying to build an advanced search functionality in which we
> can able to do partial search based on actor, producer, director, etc.
> columns.
> So if we do denormalization of tables then we have to create tables such
> as below :-
> video_by_actor
> video_by_producer
> video_by_director
> video_by_date
> etc..
> By using denormalized, Cassandra only allows us to do equality search, but
> for implementing Partial search we need to implement solr on all above
> tables.
>
> This is my thinking, but I think this will be not correct way to implement
> Apache Solr on all tables.
>
> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <na...@gmail.com>
> wrote:
>
>> Hi Edurado,
>>
>> As you mentioned queries 1-6 ,
>> In this condition, we have to proceed with a table like as below :-
>> create table videos (
>> videoid uuid primary key,
>> title text,
>> actor text,
>> producer text,
>> release_date timestamp,
>> description text,
>> music text,
>> etc...
>> );
>> This table will help to store video datas based on PK videoid and will
>> give uniqeness due to uuid.
>> But as we know , in one movie there are multiple actor, multiple
>> producer, multiple music worked, So how can we store all these.. Only one
>> option will left as to use collection type columns.
>>
>>
>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>> eduardoalonso@stratio.com> wrote:
>>
>>> TLDR shouldBe *PD
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>>
>>>> Hi Nandan:
>>>>
>>>> So, your system must provide these queries:
>>>>
>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>>
>>>>
>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>> tables just the way your mentioned but without solr, just cassandra
>>>> (Indeed, just for equality clauses)
>>>>
>>>> video_by_actor;
>>>> video_by_producer;
>>>> video_by_music;
>>>> video_by_actor_and_producer;
>>>> video_by_actor_and_music;
>>>>
>>>> For queries number 6 you need a search engine.
>>>>
>>>> SOL
>>>> ElasticSearch
>>>> cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>> SASI
>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>
>>>> I think, just for your query,  the easiest way to get it is to build a
>>>> SASI index.
>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>>> query (only one dimension), SASI indexes will work for you.
>>>>
>>>>
>>>>
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>>
>>>>> But Condition is , I am working with Apache Cassandra Database in
>>>>> which I have to store my data into Cassandra and then have to implement
>>>>> partial search capability.
>>>>> If we need to search based on full search  primary key, then it really
>>>>> best and easy to work with Cassandra , but in case of flexible search , I
>>>>> am getting confused.
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>
>>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>>
>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>>> In that case I have to create distributed tables such as:-
>>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>>> tables.
>>>>>>
>>>>>> Is it like this ?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>>
>>>>>>> Why not elasticsearch for this use case? It will make your life much
>>>>>>> simpler
>>>>>>>
>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > Currently, I am working on data modeling for Video Company in
>>>>>>> which we have different types of users as well as different user
>>>>>>> functionality.
>>>>>>> > But currently, my concern is about Search video module based on
>>>>>>> different fields.
>>>>>>> >
>>>>>>> > Query patterns are as below:-
>>>>>>> > 1) Select video by actor.
>>>>>>> > 2) select video by producer.
>>>>>>> > 3) select video by music.
>>>>>>> > 4) select video by actor and producer.
>>>>>>> > 5) select video by actor and music.
>>>>>>> >
>>>>>>> > Note: - In short, We want to establish an advanced search module
>>>>>>> by which we can search by anyway and get the desired results.
>>>>>>> >
>>>>>>> > During a search , we need partial search also such that if any
>>>>>>> user can search "Harry" title, then we are able to give them result as all
>>>>>>> videos whose
>>>>>>> >  title contains "Harry" at any location.
>>>>>>> >
>>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>> tables. Otherwise,
>>>>>>> > is there any others way by which we can implement this search
>>>>>>> module effectively.
>>>>>>> >
>>>>>>> > Please suggest.
>>>>>>> >
>>>>>>> > Best regards,
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Hi Eduardo,

We are trying to build an advanced search functionality in which we can do
partial search based on the actor, producer, director, etc. columns.
So if we denormalize the tables, we have to create tables such as the
following:
video_by_actor
video_by_producer
video_by_director
video_by_date
etc.
With denormalized tables, Cassandra only allows us to do equality search, so
to implement partial search we would need to implement Solr on all of the
above tables.

This is my thinking, but I do not think implementing Apache Solr on all
tables is the correct way.
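
To make the equality-only limitation concrete, here is a rough sketch of one
such denormalized table. The column layout is only an assumption for
illustration (actor as partition key, videoid as clustering column), not a
final schema:

```cql
-- Hypothetical layout: one partition per actor, one row per video.
CREATE TABLE video_by_actor (
    actor   text,
    videoid uuid,
    title   text,
    PRIMARY KEY ((actor), videoid)
);

-- Works: equality on the full partition key.
SELECT videoid, title FROM video_by_actor WHERE actor = 'Actor A';

-- A partial match on actor or title is not possible on this table
-- without an additional index or an external search engine.
```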

On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <na...@gmail.com>
wrote:

> Hi Edurado,
>
> As you mentioned queries 1-6 ,
> In this condition, we have to proceed with a table like as below :-
> create table videos (
> videoid uuid primary key,
> title text,
> actor text,
> producer text,
> release_date timestamp,
> description text,
> music text,
> etc...
> );
> This table will help to store video datas based on PK videoid and will
> give uniqeness due to uuid.
> But as we know , in one movie there are multiple actor, multiple producer,
> multiple music worked, So how can we store all these.. Only one option will
> left as to use collection type columns.
>
>
> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <eduardoalonso@stratio.com
> > wrote:
>
>> TLDR shouldBe *PD
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>>
>>> Hi Nandan:
>>>
>>> So, your system must provide these queries:
>>>
>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>> 3 - SELECT video FROM ... WHERE music = '...';
>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>
>>>
>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>> tables just the way your mentioned but without solr, just cassandra
>>> (Indeed, just for equality clauses)
>>>
>>> video_by_actor;
>>> video_by_producer;
>>> video_by_music;
>>> video_by_actor_and_producer;
>>> video_by_actor_and_music;
>>>
>>> For queries number 6 you need a search engine.
>>>
>>> SOL
>>> ElasticSearch
>>> cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index>
>>> SASI
>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>
>>> I think, just for your query,  the easiest way to get it is to build a
>>> SASI index.
>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>> query (only one dimension), SASI indexes will work for you.
>>>
>>>
>>>
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>>
>>>> But Condition is , I am working with Apache Cassandra Database in which
>>>> I have to store my data into Cassandra and then have to implement partial
>>>> search capability.
>>>> If we need to search based on full search  primary key, then it really
>>>> best and easy to work with Cassandra , but in case of flexible search , I
>>>> am getting confused.
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <oskar.kjellin@gmail.com
>>>> > wrote:
>>>>
>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>
>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>>> In that case I have to create distributed tables such as:-
>>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>>> 2) After creating Tables , will have to configure solr core on all
>>>>> tables.
>>>>>
>>>>> Is it like this ?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>> oskar.kjellin@gmail.com> wrote:
>>>>>
>>>>>> Why not elasticsearch for this use case? It will make your life much
>>>>>> simpler
>>>>>>
>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> > Currently, I am working on data modeling for Video Company in which
>>>>>> we have different types of users as well as different user functionality.
>>>>>> > But currently, my concern is about Search video module based on
>>>>>> different fields.
>>>>>> >
>>>>>> > Query patterns are as below:-
>>>>>> > 1) Select video by actor.
>>>>>> > 2) select video by producer.
>>>>>> > 3) select video by music.
>>>>>> > 4) select video by actor and producer.
>>>>>> > 5) select video by actor and music.
>>>>>> >
>>>>>> > Note: - In short, We want to establish an advanced search module by
>>>>>> which we can search by anyway and get the desired results.
>>>>>> >
>>>>>> > During a search , we need partial search also such that if any user
>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>> videos whose
>>>>>> >  title contains "Harry" at any location.
>>>>>> >
>>>>>> > As per my ideas, I have to create separate tables such as
>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>> tables. Otherwise,
>>>>>> > is there any others way by which we can implement this search
>>>>>> module effectively.
>>>>>> >
>>>>>> > Please suggest.
>>>>>> >
>>>>>> > Best regards,
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Hi Eduardo,

As you mentioned queries 1-6, in this case we have to proceed with a table
like the one below:
create table videos (
videoid uuid primary key,
title text,
actor text,
producer text,
release_date timestamp,
description text,
music text
-- etc. (further columns omitted)
);
This table will store video data keyed by the primary key videoid, and the
uuid guarantees uniqueness.
But as we know, one movie has multiple actors, multiple producers, and
multiple music credits, so how can we store all of these? The only option
left is to use collection type columns.
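
A rough sketch of what I mean by collection columns, assuming set<text> for
the multi-valued fields. The plural column names, the index name, and the
sample values are only placeholders:

```cql
-- Same videos table, but with sets for the multi-valued fields.
CREATE TABLE videos (
    videoid      uuid PRIMARY KEY,
    title        text,
    actors       set<text>,
    producers    set<text>,
    music        set<text>,
    release_date timestamp,
    description  text
);

-- One row per video; every actor goes into the same set.
INSERT INTO videos (videoid, title, actors)
VALUES (uuid(), 'Harry Potter', {'Actor A', 'Actor B'});

-- A secondary index on the set allows CONTAINS queries on its values.
CREATE INDEX videos_actors_idx ON videos (actors);

SELECT videoid, title FROM videos WHERE actors CONTAINS 'Actor A';
```

This still gives only exact matches on a whole actor name, so partial search
would need something more.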


On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <ed...@stratio.com>
wrote:

> TLDR shouldBe *PD
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:
>
>> Hi Nandan:
>>
>> So, your system must provide these queries:
>>
>> 1 - SELECT video FROM ... WHERE actor = '...';
>> 2 - SELECT video FROM ... WHERE producer = '...';
>> 3 - SELECT video FROM ... WHERE music = '...';
>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>
>>
>> For queries 1-5 you can get them with just cassandra, denormalizing
>> tables just the way your mentioned but without solr, just cassandra
>> (Indeed, just for equality clauses)
>>
>> video_by_actor;
>> video_by_producer;
>> video_by_music;
>> video_by_actor_and_producer;
>> video_by_actor_and_music;
>>
>> For queries number 6 you need a search engine.
>>
>> SOL
>> ElasticSearch
>> cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index>
>> SASI
>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>
>> I think, just for your query,  the easiest way to get it is to build a
>> SASI index.
>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>> query (only one dimension), SASI indexes will work for you.
>>
>>
>>
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>>
>>> But Condition is , I am working with Apache Cassandra Database in which
>>> I have to store my data into Cassandra and then have to implement partial
>>> search capability.
>>> If we need to search based on full search  primary key, then it really
>>> best and easy to work with Cassandra , but in case of flexible search , I
>>> am getting confused.
>>>
>>>
>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <os...@gmail.com>
>>> wrote:
>>>
>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>> elasticsearch as a completely separate service and write there as well.
>>>>
>>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>>> wrote:
>>>>
>>>> Do you mean to use Elastic Search with Cassandra?
>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>> In that case I have to create distributed tables such as:-
>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>> 2) After creating Tables , will have to configure solr core on all
>>>> tables.
>>>>
>>>> Is it like this ?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <oskar.kjellin@gmail.com
>>>> > wrote:
>>>>
>>>>> Why not elasticsearch for this use case? It will make your life much
>>>>> simpler
>>>>>
>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > Currently, I am working on data modeling for Video Company in which
>>>>> we have different types of users as well as different user functionality.
>>>>> > But currently, my concern is about Search video module based on
>>>>> different fields.
>>>>> >
>>>>> > Query patterns are as below:-
>>>>> > 1) Select video by actor.
>>>>> > 2) select video by producer.
>>>>> > 3) select video by music.
>>>>> > 4) select video by actor and producer.
>>>>> > 5) select video by actor and music.
>>>>> >
>>>>> > Note: - In short, We want to establish an advanced search module by
>>>>> which we can search by anyway and get the desired results.
>>>>> >
>>>>> > During a search , we need partial search also such that if any user
>>>>> can search "Harry" title, then we are able to give them result as all
>>>>> videos whose
>>>>> >  title contains "Harry" at any location.
>>>>> >
>>>>> > As per my ideas, I have to create separate tables such as
>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>> tables. Otherwise,
>>>>> > is there any others way by which we can implement this search module
>>>>> effectively.
>>>>> >
>>>>> > Please suggest.
>>>>> >
>>>>> > Best regards,
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Correction: where I wrote "TLDR" I meant "PD" (i.e. "P.S.").

Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 10:58 GMT+02:00 Eduardo Alonso <ed...@stratio.com>:

> Hi Nandan:
>
> So, your system must provide these queries:
>
> 1 - SELECT video FROM ... WHERE actor = '...';
> 2 - SELECT video FROM ... WHERE producer = '...';
> 3 - SELECT video FROM ... WHERE music = '...';
> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
> 6 - SELECT video WHERE title CONTAINS 'Harry';
>
>
> For queries 1-5 you can get them with just cassandra, denormalizing tables
> just the way your mentioned but without solr, just cassandra (Indeed, just
> for equality clauses)
>
> video_by_actor;
> video_by_producer;
> video_by_music;
> video_by_actor_and_producer;
> video_by_actor_and_music;
>
> For queries number 6 you need a search engine.
>
> Solr
> ElasticSearch
> cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>
> SASI
> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>
> I think, just for your query,  the easiest way to get it is to build a
> SASI index.
> TLDR: I work for stratio in cassandra-lucene-index but for your basic
> query (only one dimension), SASI indexes will work for you.
>
>
>
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:
>
>> But Condition is , I am working with Apache Cassandra Database in which I
>> have to store my data into Cassandra and then have to implement partial
>> search capability.
>> If we need to search based on full search  primary key, then it really
>> best and easy to work with Cassandra , but in case of flexible search , I
>> am getting confused.
>>
>>
>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <os...@gmail.com>
>> wrote:
>>
>>> I haven't run solr with Cassandra myself. I just meant to run
>>> elasticsearch as a completely separate service and write there as well.
>>>
>>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>>> wrote:
>>>
>>> Do you mean to use Elastic Search with Cassandra?
>>> Even I am thinking to use Apache Solr With Cassandra.
>>> In that case I have to create distributed tables such as:-
>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>> 2) After creating Tables , will have to configure solr core on all
>>> tables.
>>>
>>> Is it like this ?
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com>
>>> wrote:
>>>
>>>> Why not elasticsearch for this use case? It will make your life much
>>>> simpler
>>>>
>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > Currently, I am working on data modeling for Video Company in which
>>>> we have different types of users as well as different user functionality.
>>>> > But currently, my concern is about Search video module based on
>>>> different fields.
>>>> >
>>>> > Query patterns are as below:-
>>>> > 1) Select video by actor.
>>>> > 2) select video by producer.
>>>> > 3) select video by music.
>>>> > 4) select video by actor and producer.
>>>> > 5) select video by actor and music.
>>>> >
>>>> > Note: - In short, We want to establish an advanced search module by
>>>> which we can search by anyway and get the desired results.
>>>> >
>>>> > During a search , we need partial search also such that if any user
>>>> can search "Harry" title, then we are able to give them result as all
>>>> videos whose
>>>> >  title contains "Harry" at any location.
>>>> >
>>>> > As per my ideas, I have to create separate tables such as
>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>> tables. Otherwise,
>>>> > is there any others way by which we can implement this search module
>>>> effectively.
>>>> >
>>>> > Please suggest.
>>>> >
>>>> > Best regards,
>>>>
>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Eduardo Alonso <ed...@stratio.com>.
Hi Nandan:

So, your system must provide these queries:

1 - SELECT video FROM ... WHERE actor = '...';
2 - SELECT video FROM ... WHERE producer = '...';
3 - SELECT video FROM ... WHERE music = '...';
4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';


Queries 1-5 can be served by Cassandra alone: denormalize into one table
per query, just the way you mentioned, but without Solr. This works because
those queries only use equality clauses.

video_by_actor;
video_by_producer;
video_by_music;
video_by_actor_and_producer;
video_by_actor_and_music;
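
As a minimal sketch of what two of these tables could look like (the column
names video_id, title, actor and producer are assumptions for illustration;
no schema is given in this thread):

```cql
-- Hypothetical schema: one table per query, partitioned by the searched field.
CREATE TABLE video_by_actor (
    actor    text,
    video_id uuid,
    title    text,
    PRIMARY KEY ((actor), video_id)
);

-- Query 4: both equality columns together form the partition key.
CREATE TABLE video_by_actor_and_producer (
    actor    text,
    producer text,
    video_id uuid,
    title    text,
    PRIMARY KEY ((actor, producer), video_id)
);

-- Queries 1 and 4 then become single-partition reads.
SELECT video_id, title FROM video_by_actor WHERE actor = 'Some Actor';
SELECT video_id, title FROM video_by_actor_and_producer
 WHERE actor = 'Some Actor' AND producer = 'Some Producer';
```

With this layout the application has to write each video into every table
itself, once per actor (or per actor/producer pair) it should be found under.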

For query number 6 you need a search engine or a search-capable index:

Solr
ElasticSearch
cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>
SASI
<http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>

I think that, just for your query, the easiest way to get it is to build a
SASI index.
PS: I work for Stratio on cassandra-lucene-index, but for your basic query
(only one dimension), SASI indexes will work for you.
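
A SASI index for query 6 might look like the sketch below. It assumes
Cassandra 3.4 or later and a hypothetical table named videos with a text
column title and a video_id column. Note that SASI substring matching is
queried with LIKE rather than CONTAINS.

```cql
-- Hypothetical index on videos.title; CONTAINS mode allows '%...%' patterns.
CREATE CUSTOM INDEX videos_title_idx ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};

-- Matches titles containing "Harry" at any position.
SELECT video_id, title FROM videos WHERE title LIKE '%Harry%';
```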




Eduardo Alonso
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

2017-06-12 9:50 GMT+02:00 @Nandan@ <na...@gmail.com>:

> But Condition is , I am working with Apache Cassandra Database in which I
> have to store my data into Cassandra and then have to implement partial
> search capability.
> If we need to search based on full search  primary key, then it really
> best and easy to work with Cassandra , but in case of flexible search , I
> am getting confused.
>
>
> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <os...@gmail.com>
> wrote:
>
>> I haven't run solr with Cassandra myself. I just meant to run
>> elasticsearch as a completely separate service and write there as well.
>>
>> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com>
>> wrote:
>>
>> Do you mean to use Elastic Search with Cassandra?
>> Even I am thinking to use Apache Solr With Cassandra.
>> In that case I have to create distributed tables such as:-
>> 1) video_by_title, video_by_actor, video_by_year  etc..
>> 2) After creating Tables , will have to configure solr core on all
>> tables.
>>
>> Is it like this ?
>>
>>
>>
>>
>>
>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com>
>> wrote:
>>
>>> Why not elasticsearch for this use case? It will make your life much
>>> simpler
>>>
>>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Currently, I am working on data modeling for Video Company in which we
>>> have different types of users as well as different user functionality.
>>> > But currently, my concern is about Search video module based on
>>> different fields.
>>> >
>>> > Query patterns are as below:-
>>> > 1) Select video by actor.
>>> > 2) select video by producer.
>>> > 3) select video by music.
>>> > 4) select video by actor and producer.
>>> > 5) select video by actor and music.
>>> >
>>> > Note: - In short, We want to establish an advanced search module by
>>> which we can search by anyway and get the desired results.
>>> >
>>> > During a search , we need partial search also such that if any user
>>> can search "Harry" title, then we are able to give them result as all
>>> videos whose
>>> >  title contains "Harry" at any location.
>>> >
>>> > As per my ideas, I have to create separate tables such as
>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>> tables. Otherwise,
>>> > is there any others way by which we can implement this search module
>>> effectively.
>>> >
>>> > Please suggest.
>>> >
>>> > Best regards,
>>>
>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
But the condition is that I am working with Apache Cassandra: I have to
store my data in Cassandra and then implement partial search on top of it.
If we only need to search by the full primary key, Cassandra is the best
and easiest choice, but when it comes to flexible search I am getting
confused.


On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <os...@gmail.com>
wrote:

> I haven't run solr with Cassandra myself. I just meant to run
> elasticsearch as a completely separate service and write there as well.
>
> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com> wrote:
>
> Do you mean to use Elastic Search with Cassandra?
> Even I am thinking to use Apache Solr With Cassandra.
> In that case I have to create distributed tables such as:-
> 1) video_by_title, video_by_actor, video_by_year  etc..
> 2) After creating Tables , will have to configure solr core on all tables.
>
> Is it like this ?
>
>
>
>
>
> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com>
> wrote:
>
>> Why not elasticsearch for this use case? It will make your life much
>> simpler
>>
>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > Currently, I am working on data modeling for Video Company in which we
>> have different types of users as well as different user functionality.
>> > But currently, my concern is about Search video module based on
>> different fields.
>> >
>> > Query patterns are as below:-
>> > 1) Select video by actor.
>> > 2) select video by producer.
>> > 3) select video by music.
>> > 4) select video by actor and producer.
>> > 5) select video by actor and music.
>> >
>> > Note: - In short, We want to establish an advanced search module by
>> which we can search by anyway and get the desired results.
>> >
>> > During a search , we need partial search also such that if any user can
>> search "Harry" title, then we are able to give them result as all videos
>> whose
>> >  title contains "Harry" at any location.
>> >
>> > As per my ideas, I have to create separate tables such as
>> video_by_actor, video_by_producer etc.. and implement solr query on all
>> tables. Otherwise,
>> > is there any others way by which we can implement this search module
>> effectively.
>> >
>> > Please suggest.
>> >
>> > Best regards,
>>
>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Oskar Kjellin <os...@gmail.com>.
I haven't run solr with Cassandra myself. I just meant to run elasticsearch as a completely separate service and write there as well. 

> On 12 Jun 2017, at 09:45, @Nandan@ <na...@gmail.com> wrote:
> 
> Do you mean to use Elastic Search with Cassandra?
> Even I am thinking to use Apache Solr With Cassandra. 
> In that case I have to create distributed tables such as:-
> 1) video_by_title, video_by_actor, video_by_year  etc..
> 2) After creating Tables , will have to configure solr core on all tables. 
> 
> Is it like this ?
> 
> 
> 
>  
> 
>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com> wrote:
>> Why not elasticsearch for this use case? It will make your life much simpler
>> 
>> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > Currently, I am working on data modeling for Video Company in which we have different types of users as well as different user functionality.
>> > But currently, my concern is about Search video module based on different fields.
>> >
>> > Query patterns are as below:-
>> > 1) Select video by actor.
>> > 2) select video by producer.
>> > 3) select video by music.
>> > 4) select video by actor and producer.
>> > 5) select video by actor and music.
>> >
>> > Note: - In short, We want to establish an advanced search module by which we can search by anyway and get the desired results.
>> >
>> > During a search , we need partial search also such that if any user can search "Harry" title, then we are able to give them result as all videos whose
>> >  title contains "Harry" at any location.
>> >
>> > As per my ideas, I have to create separate tables such as video_by_actor, video_by_producer etc.. and implement solr query on all tables. Otherwise,
>> > is there any others way by which we can implement this search module effectively.
>> >
>> > Please suggest.
>> >
>> > Best regards,
> 

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Do you mean using Elasticsearch together with Cassandra?
I am also thinking of using Apache Solr with Cassandra.
In that case I would have to:
1) create separate tables such as video_by_title, video_by_actor,
video_by_year, etc.;
2) after creating the tables, configure a Solr core on each of them.

Is that how it works?





On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <os...@gmail.com>
wrote:

> Why not elasticsearch for this use case? It will make your life much
> simpler
>
> > On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > Currently, I am working on data modeling for Video Company in which we
> have different types of users as well as different user functionality.
> > But currently, my concern is about Search video module based on
> different fields.
> >
> > Query patterns are as below:-
> > 1) Select video by actor.
> > 2) select video by producer.
> > 3) select video by music.
> > 4) select video by actor and producer.
> > 5) select video by actor and music.
> >
> > Note: - In short, We want to establish an advanced search module by
> which we can search by anyway and get the desired results.
> >
> > During a search , we need partial search also such that if any user can
> search "Harry" title, then we are able to give them result as all videos
> whose
> >  title contains "Harry" at any location.
> >
> > As per my ideas, I have to create separate tables such as
> video_by_actor, video_by_producer etc.. and implement solr query on all
> tables. Otherwise,
> > is there any others way by which we can implement this search module
> effectively.
> >
> > Please suggest.
> >
> > Best regards,
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Oskar Kjellin <os...@gmail.com>.
Why not elasticsearch for this use case? It will make your life much simpler 

> On 12 Jun 2017, at 04:40, @Nandan@ <na...@gmail.com> wrote:
> 
> Hi, 
> 
> Currently, I am working on data modeling for Video Company in which we have different types of users as well as different user functionality. 
> But currently, my concern is about Search video module based on different fields. 
> 
> Query patterns are as below:-
> 1) Select video by actor.
> 2) select video by producer.
> 3) select video by music.
> 4) select video by actor and producer. 
> 5) select video by actor and music. 
> 
> Note: - In short, We want to establish an advanced search module by which we can search by anyway and get the desired results. 
> 
> During a search , we need partial search also such that if any user can search "Harry" title, then we are able to give them result as all videos whose
>  title contains "Harry" at any location. 
> 
> As per my ideas, I have to create separate tables such as video_by_actor, video_by_producer etc.. and implement solr query on all tables. Otherwise,
> is there any others way by which we can implement this search module effectively. 
> 
> Please suggest.
> 
> Best regards,

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Hi Michael,
MVs are also a good option when we select by equality, but here the
requirement is to develop a model for advanced partial search.
Also, with MVs, suppose we have 2 DCs with 3 nodes in each DC: then the MVs
will replicate the data 6*6 times, which would be another problem.
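
For reference, the materialized-view approach for query 1 would look roughly
like the sketch below. The base table videos and its columns are hypothetical;
no schema has been posted in this thread.

```cql
-- Hypothetical base table, keyed by video ID.
CREATE TABLE videos (
    video_id uuid PRIMARY KEY,
    title    text,
    actor    text,
    producer text
);

-- View keyed by actor; every primary key column must be restricted
-- with IS NOT NULL in the view definition.
CREATE MATERIALIZED VIEW video_by_actor_mv AS
    SELECT video_id, title, actor FROM videos
    WHERE actor IS NOT NULL AND video_id IS NOT NULL
    PRIMARY KEY (actor, video_id);
```

In Cassandra 3.0 a view's primary key can add only one column that is not
part of the base table's primary key, so with this base table the
two-column queries 4 and 5 would not map directly onto a view.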


On Mon, Jun 12, 2017 at 11:08 PM, Michael Mior <mm...@uwaterloo.ca> wrote:

> For queries 1-5 this seems like a potentially good use case for
> materialized views. Create one table with the videos stored by ID and the
> materialized views for each of the queries.
>
> --
> Michael Mior
> mmior@apache.org
>
>
> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>
>> Hi,
>>
>> Currently, I am working on data modeling for Video Company in which we
>> have different types of users as well as different user functionality.
>> But currently, my concern is about Search video module based on different
>> fields.
>>
>> Query patterns are as below:-
>> 1) Select video by actor.
>> 2) select video by producer.
>> 3) select video by music.
>> 4) select video by actor and producer.
>> 5) select video by actor and music.
>>
>> Note: - In short, We want to establish an advanced search module by which
>> we can search by anyway and get the desired results.
>>
>> During a search , we need partial search also such that if any user can
>> search "Harry" title, then we are able to give them result as all videos
>> whose
>>  title contains "Harry" at any location.
>>
>> As per my ideas, I have to create separate tables such as video_by_actor,
>> video_by_producer etc.. and implement solr query on all tables. Otherwise,
>> is there any others way by which we can implement this search module
>> effectively.
>>
>> Please suggest.
>>
>> Best regards,
>>
>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Yes, I am not planning to go with MVs. I am trying to implement this myself.
Maybe I will get some ideas from using cassandra-stress for data generation
and so on.
Thanks, Jonathan.

On Tue, Jun 13, 2017 at 10:44 AM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> Unless you're willing to put in a lot of time fixing bugs, I'd recommend
> avoiding 3.0's materialized views and manage them yourself.
>
> On Mon, Jun 12, 2017 at 6:11 PM @Nandan@ <na...@gmail.com>
> wrote:
>
>> Correct, Our first concern is to store huge READ and WRITE, for that
>> Cassandra is our First and Best Choice. But according to Use Case, we need
>> to implement Advance search like Partial text, Phrase search etc.. So we
>> are thinking the best way, that how to implement data model.
>>
>>
>> On Tue, Jun 13, 2017 at 3:35 AM, Oskar Kjellin <os...@gmail.com>
>> wrote:
>>
>>> Agree, I meant as Jonathan said to use C* for primary key and as a
>>> primary storage and ES as an indexed version of what you have in cassandra.
>>>
>>> 2017-06-12 19:19 GMT+02:00 DuyHai Doan <do...@gmail.com>:
>>>
>>>> Sorry, I misread some reply I had the impression that people recommend
>>>> ES as primary datastore
>>>>
>>>> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>>> wrote:
>>>>
>>>>> Nobody is promoting ES as a primary datastore in this thread.  Every
>>>>> mention of it is to accompany C*.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For all those promoting ES as a PRIMARY datastore, please read this
>>>>>> before:
>>>>>>
>>>>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>>>>>>
>>>>>> There are a lot of warning before recommending ES as a datastore.
>>>>>>
>>>>>> The answer from Pilato, ES official evangelist:
>>>>>>
>>>>>>
>>>>>>    - You absolutely care about your data and you want to be able to
>>>>>>    reindex in all cases. You need for that a datastore. A datastore can be a
>>>>>>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>>>>>>    are confident with. About how to inject data in it, you may want to read:
>>>>>>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> For queries 1-5 this seems like a potentially good use case for
>>>>>>> materialized views. Create one table with the videos stored by ID and the
>>>>>>> materialized views for each of the queries.
>>>>>>>
>>>>>>> --
>>>>>>> Michael Mior
>>>>>>> mmior@apache.org
>>>>>>>
>>>>>>>
>>>>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>
>>>>>>> :
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Currently, I am working on data modeling for Video Company in which
>>>>>>>> we have different types of users as well as different user functionality.
>>>>>>>> But currently, my concern is about Search video module based on
>>>>>>>> different fields.
>>>>>>>>
>>>>>>>> Query patterns are as below:-
>>>>>>>> 1) Select video by actor.
>>>>>>>> 2) select video by producer.
>>>>>>>> 3) select video by music.
>>>>>>>> 4) select video by actor and producer.
>>>>>>>> 5) select video by actor and music.
>>>>>>>>
>>>>>>>> Note: - In short, We want to establish an advanced search module by
>>>>>>>> which we can search by anyway and get the desired results.
>>>>>>>>
>>>>>>>> During a search , we need partial search also such that if any user
>>>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>>>> videos whose
>>>>>>>>  title contains "Harry" at any location.
>>>>>>>>
>>>>>>>> As per my ideas, I have to create separate tables such as
>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>> tables. Otherwise,
>>>>>>>> is there any others way by which we can implement this search
>>>>>>>> module effectively.
>>>>>>>>
>>>>>>>> Please suggest.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>

Re: Reg:- Cassandra Data modelling for Search

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Unless you're willing to put in a lot of time fixing bugs, I'd recommend
avoiding 3.0's materialized views and managing them yourself.
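
A minimal sketch of what managing the views yourself could look like, assuming one base table keyed by video ID plus one hand-maintained lookup table per query pattern. Plain Python dictionaries stand in for Cassandra tables here, and the class, method, and field names (VideoStore, upsert, actors, producer) are hypothetical, not taken from the thread.

```python
from collections import defaultdict


class VideoStore:
    """In-memory stand-in for a base table plus hand-maintained query tables."""

    def __init__(self):
        self.videos = {}                            # base table: video_id -> row
        self.by_actor = defaultdict(set)            # actor -> video ids
        self.by_producer = defaultdict(set)         # producer -> video ids
        self.by_actor_producer = defaultdict(set)   # (actor, producer) -> video ids

    def _index(self, row, add):
        """Add the row to, or remove it from, every query table."""
        vid = row["video_id"]
        op = set.add if add else set.discard
        for actor in row["actors"]:
            op(self.by_actor[actor], vid)
            op(self.by_actor_producer[(actor, row["producer"])], vid)
        op(self.by_producer[row["producer"]], vid)

    def upsert(self, row):
        """Write the base row and keep all query tables in step with it."""
        old = self.videos.get(row["video_id"])
        if old is not None:
            self._index(old, add=False)   # drop stale entries first
        self.videos[row["video_id"]] = row
        self._index(row, add=True)

    def videos_by_actor(self, actor):
        return sorted(self.by_actor.get(actor, ()))

    def videos_by_actor_and_producer(self, actor, producer):
        return sorted(self.by_actor_producer.get((actor, producer), ()))
```

In a real cluster each dictionary would be its own table and the writes in upsert would need to be grouped or retried so the tables do not drift apart; the sketch only shows the bookkeeping the application takes on.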

On Mon, Jun 12, 2017 at 6:11 PM @Nandan@ <na...@gmail.com>
wrote:

> Correct, Our first concern is to store huge READ and WRITE, for that
> Cassandra is our First and Best Choice. But according to Use Case, we need
> to implement Advance search like Partial text, Phrase search etc.. So we
> are thinking the best way, that how to implement data model.
>
>
> On Tue, Jun 13, 2017 at 3:35 AM, Oskar Kjellin <os...@gmail.com>
> wrote:
>
>> Agree, I meant as Jonathan said to use C* for primary key and as a
>> primary storage and ES as an indexed version of what you have in cassandra.
>>
>> 2017-06-12 19:19 GMT+02:00 DuyHai Doan <do...@gmail.com>:
>>
>>> Sorry, I misread some reply I had the impression that people recommend
>>> ES as primary datastore
>>>
>>> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>> wrote:
>>>
>>>> Nobody is promoting ES as a primary datastore in this thread.  Every
>>>> mention of it is to accompany C*.
>>>>
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com>
>>>> wrote:
>>>>
>>>>> For all those promoting ES as a PRIMARY datastore, please read this
>>>>> before:
>>>>>
>>>>>
>>>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>>>>>
>>>>> There are a lot of warning before recommending ES as a datastore.
>>>>>
>>>>> The answer from Pilato, ES official evangelist:
>>>>>
>>>>>
>>>>>    - You absolutely care about your data and you want to be able to
>>>>>    reindex in all cases. You need for that a datastore. A datastore can be a
>>>>>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>>>>>    are confident with. About how to inject data in it, you may want to read:
>>>>>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>>>> wrote:
>>>>>
>>>>>> For queries 1-5 this seems like a potentially good use case for
>>>>>> materialized views. Create one table with the videos stored by ID and the
>>>>>> materialized views for each of the queries.
>>>>>>
>>>>>> --
>>>>>> Michael Mior
>>>>>> mmior@apache.org
>>>>>>
>>>>>>
>>>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Currently, I am working on data modeling for Video Company in which
>>>>>>> we have different types of users as well as different user functionality.
>>>>>>> But currently, my concern is about Search video module based on
>>>>>>> different fields.
>>>>>>>
>>>>>>> Query patterns are as below:-
>>>>>>> 1) Select video by actor.
>>>>>>> 2) select video by producer.
>>>>>>> 3) select video by music.
>>>>>>> 4) select video by actor and producer.
>>>>>>> 5) select video by actor and music.
>>>>>>>
>>>>>>> Note: - In short, We want to establish an advanced search module by
>>>>>>> which we can search by anyway and get the desired results.
>>>>>>>
>>>>>>> During a search , we need partial search also such that if any user
>>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>>> videos whose
>>>>>>>  title contains "Harry" at any location.
>>>>>>>
>>>>>>> As per my ideas, I have to create separate tables such as
>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>> tables. Otherwise,
>>>>>>> is there any others way by which we can implement this search module
>>>>>>> effectively.
>>>>>>>
>>>>>>> Please suggest.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by "@Nandan@" <na...@gmail.com>.
Correct. Our first concern is handling a huge volume of reads and writes, and
for that Cassandra is our first and best choice. But our use case also needs
advanced search such as partial text and phrase search, so we are working out
the best way to implement the data model.
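
To make the partial-text requirement concrete, here is a minimal sketch of a trigram index that finds titles containing a fragment such as "Harry" at any position. It is an in-memory illustration only; the names trigrams and TitleIndex are hypothetical, and a search engine would normally provide this through its own analyzers.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of lowercase 3-character substrings of text."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


class TitleIndex:
    """Maps each trigram to the ids of the titles that contain it."""

    def __init__(self):
        self.titles = {}                    # video_id -> title
        self.postings = defaultdict(set)    # trigram -> video ids

    def add(self, video_id, title):
        self.titles[video_id] = title
        for gram in trigrams(title):
            self.postings[gram].add(video_id)

    def search(self, fragment):
        grams = trigrams(fragment)
        if not grams:
            # Fragments shorter than 3 characters: fall back to a full scan.
            candidates = set(self.titles)
        else:
            # A matching title must contain every trigram of the fragment.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Verify candidates, since shared trigrams do not guarantee a match.
        needle = fragment.lower()
        return sorted(v for v in candidates if needle in self.titles[v].lower())
```

The trigram lookup narrows the candidate set and the final substring check removes false positives, which is the usual trade-off for this kind of index: more storage per title in exchange for avoiding a scan of every row.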


On Tue, Jun 13, 2017 at 3:35 AM, Oskar Kjellin <os...@gmail.com>
wrote:

> Agree, I meant as Jonathan said to use C* for primary key and as a primary
> storage and ES as an indexed version of what you have in cassandra.
>
> 2017-06-12 19:19 GMT+02:00 DuyHai Doan <do...@gmail.com>:
>
>> Sorry, I misread some reply I had the impression that people recommend ES
>> as primary datastore
>>
>> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <jo...@jonhaddad.com>
>> wrote:
>>
>>> Nobody is promoting ES as a primary datastore in this thread.  Every
>>> mention of it is to accompany C*.
>>>
>>>
>>>
>>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com>
>>> wrote:
>>>
>>>> For all those promoting ES as a PRIMARY datastore, please read this
>>>> before:
>>>>
>>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>>>>
>>>> There are a lot of warning before recommending ES as a datastore.
>>>>
>>>> The answer from Pilato, ES official evangelist:
>>>>
>>>>
>>>>    - You absolutely care about your data and you want to be able to
>>>>    reindex in all cases. You need for that a datastore. A datastore can be a
>>>>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>>>>    are confident with. About how to inject data in it, you may want to read:
>>>>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>>> wrote:
>>>>
>>>>> For queries 1-5 this seems like a potentially good use case for
>>>>> materialized views. Create one table with the videos stored by ID and the
>>>>> materialized views for each of the queries.
>>>>>
>>>>> --
>>>>> Michael Mior
>>>>> mmior@apache.org
>>>>>
>>>>>
>>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Currently, I am working on data modeling for Video Company in which
>>>>>> we have different types of users as well as different user functionality.
>>>>>> But currently, my concern is about Search video module based on
>>>>>> different fields.
>>>>>>
>>>>>> Query patterns are as below:-
>>>>>> 1) Select video by actor.
>>>>>> 2) select video by producer.
>>>>>> 3) select video by music.
>>>>>> 4) select video by actor and producer.
>>>>>> 5) select video by actor and music.
>>>>>>
>>>>>> Note: - In short, We want to establish an advanced search module by
>>>>>> which we can search by anyway and get the desired results.
>>>>>>
>>>>>> During a search , we need partial search also such that if any user
>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>> videos whose
>>>>>>  title contains "Harry" at any location.
>>>>>>
>>>>>> As per my ideas, I have to create separate tables such as
>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>> tables. Otherwise,
>>>>>> is there any others way by which we can implement this search module
>>>>>> effectively.
>>>>>>
>>>>>> Please suggest.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>
>>>>>
>>>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Oskar Kjellin <os...@gmail.com>.
Agreed. I meant, as Jonathan said, to use C* for primary-key lookups and as
the primary storage, and ES as an indexed version of what you have in Cassandra.
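
A minimal sketch of that split, assuming the application writes to the primary store first and then updates a search index that can always be rebuilt from it. Two small in-memory classes stand in for Cassandra and Elasticsearch; PrimaryStore, SearchIndex, save_video, and reindex are hypothetical names, not real client APIs.

```python
class PrimaryStore:
    """Stand-in for Cassandra: authoritative rows keyed by video id."""

    def __init__(self):
        self.rows = {}

    def put(self, video_id, doc):
        self.rows[video_id] = dict(doc)

    def get(self, video_id):
        return self.rows.get(video_id)

    def scan(self):
        return list(self.rows.items())


class SearchIndex:
    """Stand-in for Elasticsearch: a derived, disposable copy used for search."""

    def __init__(self):
        self.docs = {}

    def index(self, video_id, doc):
        self.docs[video_id] = dict(doc)

    def clear(self):
        self.docs.clear()

    def search(self, **filters):
        """Return ids whose fields match every filter (list fields match by membership)."""
        def matches(doc):
            for field, wanted in filters.items():
                value = doc.get(field)
                if isinstance(value, list):
                    if wanted not in value:
                        return False
                elif value != wanted:
                    return False
            return True
        return sorted(vid for vid, doc in self.docs.items() if matches(doc))


def save_video(primary, index, video_id, doc):
    """Write to the source of truth first, then update the derived index."""
    primary.put(video_id, doc)
    index.index(video_id, doc)


def reindex(primary, index):
    """Rebuild the search index from scratch using only the primary store."""
    index.clear()
    for video_id, doc in primary.scan():
        index.index(video_id, doc)
```

Search returns only ids, which are then fetched from the primary store by key; if the index is lost or its mapping changes, reindex restores it without any data living only in the search layer.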

2017-06-12 19:19 GMT+02:00 DuyHai Doan <do...@gmail.com>:

> Sorry, I misread some reply I had the impression that people recommend ES
> as primary datastore
>
> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <jo...@jonhaddad.com>
> wrote:
>
>> Nobody is promoting ES as a primary datastore in this thread.  Every
>> mention of it is to accompany C*.
>>
>>
>>
>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com>
>> wrote:
>>
>>> For all those promoting ES as a PRIMARY datastore, please read this
>>> before:
>>>
>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>>>
>>> There are a lot of warning before recommending ES as a datastore.
>>>
>>> The answer from Pilato, ES official evangelist:
>>>
>>>
>>>    - You absolutely care about your data and you want to be able to
>>>    reindex in all cases. You need for that a datastore. A datastore can be a
>>>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>>>    are confident with. About how to inject data in it, you may want to read:
>>>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>>>
>>>
>>>
>>>
>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>> wrote:
>>>
>>>> For queries 1-5 this seems like a potentially good use case for
>>>> materialized views. Create one table with the videos stored by ID and the
>>>> materialized views for each of the queries.
>>>>
>>>> --
>>>> Michael Mior
>>>> mmior@apache.org
>>>>
>>>>
>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> Currently, I am working on data modeling for Video Company in which we
>>>>> have different types of users as well as different user functionality.
>>>>> But currently, my concern is about Search video module based on
>>>>> different fields.
>>>>>
>>>>> Query patterns are as below:-
>>>>> 1) Select video by actor.
>>>>> 2) select video by producer.
>>>>> 3) select video by music.
>>>>> 4) select video by actor and producer.
>>>>> 5) select video by actor and music.
>>>>>
>>>>> Note: - In short, We want to establish an advanced search module by
>>>>> which we can search by anyway and get the desired results.
>>>>>
>>>>> During a search , we need partial search also such that if any user
>>>>> can search "Harry" title, then we are able to give them result as all
>>>>> videos whose
>>>>>  title contains "Harry" at any location.
>>>>>
>>>>> As per my ideas, I have to create separate tables such as
>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>> tables. Otherwise,
>>>>> is there any others way by which we can implement this search module
>>>>> effectively.
>>>>>
>>>>> Please suggest.
>>>>>
>>>>> Best regards,
>>>>>
>>>>
>>>>
>>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by DuyHai Doan <do...@gmail.com>.
Sorry, I misread some replies. I had the impression that people were
recommending ES as the primary datastore.

On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> Nobody is promoting ES as a primary datastore in this thread.  Every
> mention of it is to accompany C*.
>
>
>
> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com> wrote:
>
>> For all those promoting ES as a PRIMARY datastore, please read this
>> before:
>>
>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>>
>> There are a lot of warning before recommending ES as a datastore.
>>
>> The answer from Pilato, ES official evangelist:
>>
>>
>>    - You absolutely care about your data and you want to be able to
>>    reindex in all cases. You need for that a datastore. A datastore can be a
>>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>>    are confident with. About how to inject data in it, you may want to read:
>>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>>
>>
>>
>>
>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca> wrote:
>>
>>> For queries 1-5 this seems like a potentially good use case for
>>> materialized views. Create one table with the videos stored by ID and the
>>> materialized views for each of the queries.
>>>
>>> --
>>> Michael Mior
>>> mmior@apache.org
>>>
>>>
>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> Currently, I am working on data modeling for Video Company in which we
>>>> have different types of users as well as different user functionality.
>>>> But currently, my concern is about Search video module based on
>>>> different fields.
>>>>
>>>> Query patterns are as below:-
>>>> 1) Select video by actor.
>>>> 2) select video by producer.
>>>> 3) select video by music.
>>>> 4) select video by actor and producer.
>>>> 5) select video by actor and music.
>>>>
>>>> Note: - In short, We want to establish an advanced search module by
>>>> which we can search by anyway and get the desired results.
>>>>
>>>> During a search , we need partial search also such that if any user can
>>>> search "Harry" title, then we are able to give them result as all videos
>>>> whose
>>>>  title contains "Harry" at any location.
>>>>
>>>> As per my ideas, I have to create separate tables such as
>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>> tables. Otherwise,
>>>> is there any others way by which we can implement this search module
>>>> effectively.
>>>>
>>>> Please suggest.
>>>>
>>>> Best regards,
>>>>
>>>
>>>
>>

Re: Reg:- Cassandra Data modelling for Search

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Nobody is promoting ES as a primary datastore in this thread.  Every
mention of it is to accompany C*.



On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <do...@gmail.com> wrote:

> For all those promoting ES as a PRIMARY datastore, please read this before:
>
> https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13
>
> There are a lot of warning before recommending ES as a datastore.
>
> The answer from Pilato, ES official evangelist:
>
>
>    - You absolutely care about your data and you want to be able to
>    reindex in all cases. You need for that a datastore. A datastore can be a
>    filesystem where you store JSON, HDFS, and/or a database you prefer and you
>    are confident with. About how to inject data in it, you may want to read:
>    http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/
>
>
>
>
> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca> wrote:
>
>> For queries 1-5 this seems like a potentially good use case for
>> materialized views. Create one table with the videos stored by ID and the
>> materialized views for each of the queries.
>>
>> --
>> Michael Mior
>> mmior@apache.org
>>
>>
>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>>
>>> Hi,
>>>
>>> Currently, I am working on data modeling for Video Company in which we
>>> have different types of users as well as different user functionality.
>>> But currently, my concern is about Search video module based on
>>> different fields.
>>>
>>> Query patterns are as below:-
>>> 1) Select video by actor.
>>> 2) select video by producer.
>>> 3) select video by music.
>>> 4) select video by actor and producer.
>>> 5) select video by actor and music.
>>>
>>> Note: - In short, We want to establish an advanced search module by
>>> which we can search by anyway and get the desired results.
>>>
>>> During a search , we need partial search also such that if any user can
>>> search "Harry" title, then we are able to give them result as all videos
>>> whose
>>>  title contains "Harry" at any location.
>>>
>>> As per my ideas, I have to create separate tables such as
>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>> tables. Otherwise,
>>> is there any others way by which we can implement this search module
>>> effectively.
>>>
>>> Please suggest.
>>>
>>> Best regards,
>>>
>>
>>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by DuyHai Doan <do...@gmail.com>.
For all those promoting ES as a PRIMARY datastore, please read this first:

https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13

There are a lot of warnings to consider before recommending ES as a datastore.

The answer from Pilato, ES official evangelist:


   - You absolutely care about your data and you want to be able to reindex
   in all cases. You need for that a datastore. A datastore can be a
   filesystem where you store JSON, HDFS, and/or a database you prefer and you
   are confident with. About how to inject data in it, you may want to read:
   http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/




On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca> wrote:

> For queries 1-5 this seems like a potentially good use case for
> materialized views. Create one table with the videos stored by ID and the
> materialized views for each of the queries.
>
> --
> Michael Mior
> mmior@apache.org
>
>
> 2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:
>
>> Hi,
>>
>> Currently, I am working on data modeling for Video Company in which we
>> have different types of users as well as different user functionality.
>> But currently, my concern is about Search video module based on different
>> fields.
>>
>> Query patterns are as below:-
>> 1) Select video by actor.
>> 2) select video by producer.
>> 3) select video by music.
>> 4) select video by actor and producer.
>> 5) select video by actor and music.
>>
>> Note: - In short, We want to establish an advanced search module by which
>> we can search by anyway and get the desired results.
>>
>> During a search , we need partial search also such that if any user can
>> search "Harry" title, then we are able to give them result as all videos
>> whose
>>  title contains "Harry" at any location.
>>
>> As per my ideas, I have to create separate tables such as video_by_actor,
>> video_by_producer etc.. and implement solr query on all tables. Otherwise,
>> is there any others way by which we can implement this search module
>> effectively.
>>
>> Please suggest.
>>
>> Best regards,
>>
>
>

Re: Reg:- Cassandra Data modelling for Search

Posted by Michael Mior <mm...@uwaterloo.ca>.
For queries 1-5 this seems like a potentially good use case for
materialized views. Create one table with the videos stored by ID and the
materialized views for each of the queries.

--
Michael Mior
mmior@apache.org


2017-06-11 22:40 GMT-04:00 @Nandan@ <na...@gmail.com>:

> Hi,
>
> Currently, I am working on data modeling for Video Company in which we
> have different types of users as well as different user functionality.
> But currently, my concern is about Search video module based on different
> fields.
>
> Query patterns are as below:-
> 1) Select video by actor.
> 2) select video by producer.
> 3) select video by music.
> 4) select video by actor and producer.
> 5) select video by actor and music.
>
> Note: - In short, We want to establish an advanced search module by which
> we can search by anyway and get the desired results.
>
> During a search , we need partial search also such that if any user can
> search "Harry" title, then we are able to give them result as all videos
> whose
>  title contains "Harry" at any location.
>
> As per my ideas, I have to create separate tables such as video_by_actor,
> video_by_producer etc.. and implement solr query on all tables. Otherwise,
> is there any others way by which we can implement this search module
> effectively.
>
> Please suggest.
>
> Best regards,
>
