You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Mead Lai <la...@gmail.com> on 2011/10/11 09:11:48 UTC

Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Hello all,
*Background:
*There are *ONE MILLION* data in a table, and this table has 100 columns
inside.
The application need to search the data in EVERY column with one 'keyword'.
so, I try it in a clumsy way, using a database view, then search the view.
Just like the following SQL:
*=Step1*: create a view.

CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
as
SELECT
mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
FROM MyTable mv

*=Step2*: search the view with LIKE '%keyword%'

SELECT *
FROM V_MY_VIEW wcv
WHERE wcv.content LIKE '%keyword%'

Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos ONE
MILLION rows are tooo huge.

*Lucene way:*
    So, I use the Lucene to store these ONE MILLION data,
code:document.add(new Field("content", content, Store.YES,
Index.ANALYZED));//variable content, is the strings which jointed from the
100 columns
     The problem is that: if some keyword is not a word or a term, the
search will return nothing.
Usually, the keyword would be a person's name or some jargon, like
'catstiger.amber','amin.ahmad','fund-in-trust'.
and 'catstiger.amber' can't be split into a term to save in the index store
    Cos, the Index.ANALYZED would fail to recognition the keyword as a term
, so there is no such index at all.
So, In short, is there any "Query" in Lucene can search the term, which is
similar as "SQL-LIKE"?

    This SQL-function will meet the purpose:

SELECT * FROM luceneDB ldb
WHERE ldb.content *LIKE* '%keyword%'

Thank you very much.

Regards,
Mead

Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Posted by Ian Lea <ia...@gmail.com>.
See the FAQ entry
http://wiki.apache.org/lucene-java/LuceneFAQ#What_wildcard_search_support_is_available_from_Lucene.3F

There is also org.apache.lucene.search.regex.RegexQuery in lucene contrib.

Be aware that wildcard queries can be slow, particularly if you enable
leading wild cards.


I don't understand your example, but it sounds like you might be
better off looking into analyzers, so that e.g. catstiger.amber is
indexed in a way that lets you search for it. Probably more work up
front, but potentially much better performance.


--
Ian.


On Tue, Oct 11, 2011 at 8:11 AM, Mead Lai <la...@gmail.com> wrote:
> Hello all,
> *Background:
> *There are *ONE MILLION* data in a table, and this table has 100 columns
> inside.
> The application need to search the data in EVERY column with one 'keyword'.
> so, I try it in a clumsy way, using a database view, then search the view.
> Just like the following SQL:
> *=Step1*: create a view.
>
> CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
> as
> SELECT
> mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
> FROM MyTable mv
>
> *=Step2*: search the view with LIKE '%keyword%'
>
> SELECT *
> FROM V_MY_VIEW wcv
> WHERE wcv.content LIKE '%keyword%'
>
> Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos ONE
> MILLION rows are tooo huge.
>
> *Lucene way:*
>    So, I use the Lucene to store these ONE MILLION data,
> code:document.add(new Field("content", content, Store.YES,
> Index.ANALYZED));//variable content, is the strings which jointed from the
> 100 columns
>     The problem is that: if some keyword is not a word or a term, the
> search will return nothing.
> Usually, the keyword would be a person's name or some jargon, like
> 'catstiger.amber','amin.ahmad','fund-in-trust'.
> and 'catstiger.amber' can't be split into a term to save in the index store
>    Cos, the Index.ANALYZED would fail to recognition the keyword as a term
> , so there is no such index at all.
> So, In short, is there any "Query" in Lucene can search the term, which is
> similar as "SQL-LIKE"?
>
>    This SQL-function will meet the purpose:
>
> SELECT * FROM luceneDB ldb
> WHERE ldb.content *LIKE* '%keyword%'
>
> Thank you very much.
>
> Regards,
> Mead
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Posted by Sujit Pal <su...@comcast.net>.
Hi Mead,

You may want to check out the permuterm index idea.
http://www-nlp.stanford.edu/IR-book/html/htmledition/permuterm-indexes-1.html 

Basically you write a custom filter that takes a term and generates all
word permutations off it. On the query side, you convert your query so
its always a prefix query by rotating the characters so * is always at
the end and match against the permuterm indexed field.

I have a simple (and currently incomplete) working implementation (works
with queries such as *keyword, keyword*, key*rd, *keyword*, but only a
single * and no ? unlike the Wildcard query. But because its always a
prefix query internally, it does not have the performance penalty of
leading * in WildcardQuery. Maybe it will give you some ideas...
http://sujitpal.blogspot.com/2011/10/lucene-wildcard-query-and-permuterm.html 

-sujit

On Thu, 2011-10-13 at 10:10 +0800, Mead Lai wrote:
> Thank you very much,
> With your helps, that, finally, I use "WildcardQuery" to find right result:
>     BooleanQuery resultQuery = new BooleanQuery();
>     resultQuery.add(WildcardQuery(new Term("content", "*keyword*"));
>     TopDocs topDocs = searcher.search(resultQuery,*1000*);
> 
> But there is also a problem puzzle me, the result only can get 1000 items,
> which is not enough.
> I want to have entire/whole items, which match that condition(*keyword*).
> 
> OR, may I put a date condtion to query,
> e.g: select * from table where start_date *>=* 2011-10-12
> 
> 
> Regards,
> Mead
> 
> 
> On Tue, Oct 11, 2011 at 11:39 PM, Chris Lu <ch...@gmail.com> wrote:
> 
> > You need to analyze the search keyword with the same analyzer that's
> > applied
> > on the "content" field.
> >
> > --
> > Chris Lu
> > -------------------------
> > Instant Scalable Full-Text Search On Any Database/Application
> > site: http://www.dbsight.net
> > demo: http://search.dbsight.com
> > Lucene Database Search in 3 minutes:
> >
> > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
> >
> > On Tue, Oct 11, 2011 at 12:11 AM, Mead Lai <la...@gmail.com> wrote:
> >
> > > Hello all,
> > > *Background:
> > > *There are *ONE MILLION* data in a table, and this table has 100 columns
> > > inside.
> > > The application need to search the data in EVERY column with one
> > 'keyword'.
> > > so, I try it in a clumsy way, using a database view, then search the
> > view.
> > > Just like the following SQL:
> > > *=Step1*: create a view.
> > >
> > > CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
> > > as
> > > SELECT
> > >
> > >
> > mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
> > > FROM MyTable mv
> > >
> > > *=Step2*: search the view with LIKE '%keyword%'
> > >
> > > SELECT *
> > > FROM V_MY_VIEW wcv
> > > WHERE wcv.content LIKE '%keyword%'
> > >
> > > Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos
> > ONE
> > > MILLION rows are tooo huge.
> > >
> > > *Lucene way:*
> > >    So, I use the Lucene to store these ONE MILLION data,
> > > code:document.add(new Field("content", content, Store.YES,
> > > Index.ANALYZED));//variable content, is the strings which jointed from
> > the
> > > 100 columns
> > >     The problem is that: if some keyword is not a word or a term, the
> > > search will return nothing.
> > > Usually, the keyword would be a person's name or some jargon, like
> > > 'catstiger.amber','amin.ahmad','fund-in-trust'.
> > > and 'catstiger.amber' can't be split into a term to save in the index
> > store
> > >    Cos, the Index.ANALYZED would fail to recognition the keyword as a
> > term
> > > , so there is no such index at all.
> > > So, In short, is there any "Query" in Lucene can search the term, which
> > is
> > > similar as "SQL-LIKE"?
> > >
> > >    This SQL-function will meet the purpose:
> > >
> > > SELECT * FROM luceneDB ldb
> > > WHERE ldb.content *LIKE* '%keyword%'
> >  >
> > > Thank you very much.
> > >
> > > Regards,
> > > Mead
> > >
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Posted by Mead Lai <la...@gmail.com>.
While, I found "Range Searches", it would useful for this circumstance.
Thank you.

Regards,
Mead


On Thu, Oct 13, 2011 at 10:10 AM, Mead Lai <la...@gmail.com> wrote:

> Thank you very much,
> With your helps, that, finally, I use "WildcardQuery" to find right result:
>     BooleanQuery resultQuery = new BooleanQuery();
>     resultQuery.add(WildcardQuery(new Term("content", "*keyword*"));
>     TopDocs topDocs = searcher.search(resultQuery,*1000*);
>
> But there is also a problem puzzle me, the result only can get 1000 items,
> which is not enough.
> I want to have entire/whole items, which match that condition(*keyword*).
>
> OR, may I put a date condtion to query,
> e.g: select * from table where start_date *>=* 2011-10-12
>
>
> Regards,
> Mead
>
>
>   On Tue, Oct 11, 2011 at 11:39 PM, Chris Lu <ch...@gmail.com> wrote:
>
>> You need to analyze the search keyword with the same analyzer that's
>> applied
>> on the "content" field.
>>
>> --
>> Chris Lu
>> -------------------------
>> Instant Scalable Full-Text Search On Any Database/Application
>> site: http://www.dbsight.net
>> demo: http://search.dbsight.com
>> Lucene Database Search in 3 minutes:
>>
>> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>>
>> On Tue, Oct 11, 2011 at 12:11 AM, Mead Lai <la...@gmail.com> wrote:
>>
>> > Hello all,
>> > *Background:
>> > *There are *ONE MILLION* data in a table, and this table has 100 columns
>> > inside.
>> > The application need to search the data in EVERY column with one
>> 'keyword'.
>> > so, I try it in a clumsy way, using a database view, then search the
>> view.
>> > Just like the following SQL:
>> > *=Step1*: create a view.
>> >
>> > CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
>> > as
>> > SELECT
>> >
>> >
>> mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
>> > FROM MyTable mv
>> >
>> > *=Step2*: search the view with LIKE '%keyword%'
>> >
>> > SELECT *
>> > FROM V_MY_VIEW wcv
>> > WHERE wcv.content LIKE '%keyword%'
>> >
>> > Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos
>> ONE
>> > MILLION rows are tooo huge.
>> >
>> > *Lucene way:*
>> >    So, I use the Lucene to store these ONE MILLION data,
>> > code:document.add(new Field("content", content, Store.YES,
>> > Index.ANALYZED));//variable content, is the strings which jointed from
>> the
>> > 100 columns
>> >     The problem is that: if some keyword is not a word or a term, the
>> > search will return nothing.
>> > Usually, the keyword would be a person's name or some jargon, like
>> > 'catstiger.amber','amin.ahmad','fund-in-trust'.
>> > and 'catstiger.amber' can't be split into a term to save in the index
>> store
>> >    Cos, the Index.ANALYZED would fail to recognition the keyword as a
>> term
>> > , so there is no such index at all.
>> > So, In short, is there any "Query" in Lucene can search the term, which
>> is
>> > similar as "SQL-LIKE"?
>> >
>> >    This SQL-function will meet the purpose:
>> >
>> > SELECT * FROM luceneDB ldb
>> > WHERE ldb.content *LIKE* '%keyword%'
>>  >
>> > Thank you very much.
>> >
>> > Regards,
>> > Mead
>> >
>>
>
>

Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Posted by Mead Lai <la...@gmail.com>.
Thank you very much,
With your helps, that, finally, I use "WildcardQuery" to find right result:
    BooleanQuery resultQuery = new BooleanQuery();
    resultQuery.add(WildcardQuery(new Term("content", "*keyword*"));
    TopDocs topDocs = searcher.search(resultQuery,*1000*);

But there is also a problem puzzle me, the result only can get 1000 items,
which is not enough.
I want to have entire/whole items, which match that condition(*keyword*).

OR, may I put a date condtion to query,
e.g: select * from table where start_date *>=* 2011-10-12


Regards,
Mead


On Tue, Oct 11, 2011 at 11:39 PM, Chris Lu <ch...@gmail.com> wrote:

> You need to analyze the search keyword with the same analyzer that's
> applied
> on the "content" field.
>
> --
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes:
>
> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>
> On Tue, Oct 11, 2011 at 12:11 AM, Mead Lai <la...@gmail.com> wrote:
>
> > Hello all,
> > *Background:
> > *There are *ONE MILLION* data in a table, and this table has 100 columns
> > inside.
> > The application need to search the data in EVERY column with one
> 'keyword'.
> > so, I try it in a clumsy way, using a database view, then search the
> view.
> > Just like the following SQL:
> > *=Step1*: create a view.
> >
> > CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
> > as
> > SELECT
> >
> >
> mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
> > FROM MyTable mv
> >
> > *=Step2*: search the view with LIKE '%keyword%'
> >
> > SELECT *
> > FROM V_MY_VIEW wcv
> > WHERE wcv.content LIKE '%keyword%'
> >
> > Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos
> ONE
> > MILLION rows are tooo huge.
> >
> > *Lucene way:*
> >    So, I use the Lucene to store these ONE MILLION data,
> > code:document.add(new Field("content", content, Store.YES,
> > Index.ANALYZED));//variable content, is the strings which jointed from
> the
> > 100 columns
> >     The problem is that: if some keyword is not a word or a term, the
> > search will return nothing.
> > Usually, the keyword would be a person's name or some jargon, like
> > 'catstiger.amber','amin.ahmad','fund-in-trust'.
> > and 'catstiger.amber' can't be split into a term to save in the index
> store
> >    Cos, the Index.ANALYZED would fail to recognition the keyword as a
> term
> > , so there is no such index at all.
> > So, In short, is there any "Query" in Lucene can search the term, which
> is
> > similar as "SQL-LIKE"?
> >
> >    This SQL-function will meet the purpose:
> >
> > SELECT * FROM luceneDB ldb
> > WHERE ldb.content *LIKE* '%keyword%'
>  >
> > Thank you very much.
> >
> > Regards,
> > Mead
> >
>

Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

Posted by Chris Lu <ch...@gmail.com>.
You need to analyze the search keyword with the same analyzer that's applied
on the "content" field.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On Tue, Oct 11, 2011 at 12:11 AM, Mead Lai <la...@gmail.com> wrote:

> Hello all,
> *Background:
> *There are *ONE MILLION* data in a table, and this table has 100 columns
> inside.
> The application need to search the data in EVERY column with one 'keyword'.
> so, I try it in a clumsy way, using a database view, then search the view.
> Just like the following SQL:
> *=Step1*: create a view.
>
> CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
> as
> SELECT
>
> mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
> FROM MyTable mv
>
> *=Step2*: search the view with LIKE '%keyword%'
>
> SELECT *
> FROM V_MY_VIEW wcv
> WHERE wcv.content LIKE '%keyword%'
>
> Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos ONE
> MILLION rows are tooo huge.
>
> *Lucene way:*
>    So, I use the Lucene to store these ONE MILLION data,
> code:document.add(new Field("content", content, Store.YES,
> Index.ANALYZED));//variable content, is the strings which jointed from the
> 100 columns
>     The problem is that: if some keyword is not a word or a term, the
> search will return nothing.
> Usually, the keyword would be a person's name or some jargon, like
> 'catstiger.amber','amin.ahmad','fund-in-trust'.
> and 'catstiger.amber' can't be split into a term to save in the index store
>    Cos, the Index.ANALYZED would fail to recognition the keyword as a term
> , so there is no such index at all.
> So, In short, is there any "Query" in Lucene can search the term, which is
> similar as "SQL-LIKE"?
>
>    This SQL-function will meet the purpose:
>
> SELECT * FROM luceneDB ldb
> WHERE ldb.content *LIKE* '%keyword%'
>
> Thank you very much.
>
> Regards,
> Mead
>