Posted to user@cassandra.apache.org by Oleg Dulin <ol...@gmail.com> on 2012/10/06 15:57:33 UTC

Re: Text searches and free form queries

So, what I ended up doing is this --

As I write my records into the main CF, I tokenize some fields that I 
want to search on using Lucene and write an index into a separate CF, 
such that my columns are a composite of:

luceneToken:record key

I can then search my records by doing a slice for each Lucene token in 
the search query and then intersecting the resulting sets of record 
keys. It works pretty fast.
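
The scheme above can be sketched roughly as follows. This is an in-memory illustration only, not the poster's actual code: the index CF is modelled as a single sorted list of (token, record key) pairs, a simple regex stands in for the Lucene analyzer, and all names (tokenize, IndexRow, index_record, search) are hypothetical.

```python
import bisect
import re


def tokenize(text):
    """Stand-in for a Lucene analyzer (assumption): lowercase, split on non-alphanumerics."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class IndexRow:
    """Models one wide index row whose column names are (token, record_key) composites."""

    def __init__(self):
        self.columns = []  # kept sorted, like column names within a row

    def add(self, token, record_key):
        col = (token, record_key)
        i = bisect.bisect_left(self.columns, col)
        if i == len(self.columns) or self.columns[i] != col:
            self.columns.insert(i, col)

    def slice(self, token):
        """Return the record keys of every column whose first component is `token`."""
        start = bisect.bisect_left(self.columns, (token,))
        keys = set()
        for tok, key in self.columns[start:]:
            if tok != token:
                break  # past the end of this token's contiguous range
            keys.add(key)
        return keys


def index_record(row, record_key, fields):
    """Write one index column per distinct token found in the searchable fields."""
    for text in fields:
        for token in tokenize(text):
            row.add(token, record_key)


def search(row, query):
    """Slice once per query token, then intersect the resulting key sets."""
    result = None
    for token in tokenize(query):
        keys = row.slice(token)
        result = keys if result is None else result & keys
        if not result:
            break  # no record can match every token
    return result or set()
```

Because the columns are sorted by token first, each token's record keys sit in one contiguous range, which is what makes a single slice per token sufficient.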

Regards,
Oleg

On 2012-09-05 01:28:44 +0000, aaron morton said:

> AFAIK, if you want to keep it inside Cassandra, the options are DSE, rolling 
> your own from scratch, or starting with https://github.com/tjake/Solandra . 
> 
> Outside of Cassandra I've heard of people using Elastic Search or Solr 
> which I *think* is now faster at updating the index. 
> 
> Hope that helps. 
> 
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 4/09/2012, at 3:00 AM, Andrey V. Panov <pa...@gmail.com> wrote:
> Someone did search with Lucene, but for very fresh data they built the 
> search index in memory so the data became available for search without 
> delay.
> 
> On 3 September 2012 22:25, Oleg Dulin <ol...@gmail.com> wrote:
> Dear Distinguished Colleagues:


-- 
Regards,
Oleg Dulin
NYC Java Big Data Engineer
http://www.olegdulin.com/

Re: Text searches and free form queries

Posted by Oleg Dulin <ol...@gmail.com>.

>> 
>> It works pretty fast.
> Cool.
> Just keep an eye out for how big the lucene token row gets.
> Cheers
> 
> 

Indeed, it may get out of hand, but for now we are ok -- for the 
foreseeable future I would say.

Should it get larger, I can split it up into rows -- i.e. all tokens 
that start with "a", all tokens that start with "b", etc.
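
The split described above could look something like this. It is a sketch under assumptions: the "idx" prefix, the "_" catch-all bucket, and the function name index_row_key are all made up for illustration.

```python
def index_row_key(token, prefix="idx"):
    """Pick the index row for a token from its first character (hypothetical naming)."""
    first = token[0].lower() if token else "_"
    # Anything that is not a letter or digit goes to one catch-all bucket.
    bucket = first if first.isalnum() else "_"
    return f"{prefix}:{bucket}"
```

A lookup for a given token then only has to slice the one row returned by index_row_key(token). Note that first-letter buckets tend to be uneven in size, so bucketing on a hash of the token may be worth considering if a single letter's row still grows too large.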

Re: Text searches and free form queries

Posted by aaron morton <aa...@thelastpickle.com>.

>  It works pretty fast.
Cool. 

Just keep an eye out for how big the lucene token row gets. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
