You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Ashley Rajaratnam <as...@avalonserv.com> on 2006/01/20 16:08:30 UTC

Beginner Querying multiple fields pls help

Hi,

 

Please forgive me if this comes across as being naïve however Ive bashed my
head against it for a while and can’t come up with a solution.

 

Overview:

 

I have the following basic document structure:

…

Document doc = new Document();

doc.Add(Field.Text("itemtitle", iteminf.itemtitle));

doc.Add(Field.Text("itemSubTitle", iteminf.itemSubTitle));

doc.Add(Field.Keyword("CategoryID", iteminf.CategoryID));  -- THIS ID (1
-30000) is passed in here as a string

writer.AddDocument(doc);

…

 

What I would like to do is search the title and subtitle fields with what a
user types in so I build up a search query string 

 

        Qstr = (query + “ “);

        Qstr += (“itemtitle:” + “(“ + query + “)” + “ “);

        Qstr += (“itemSubTitle:” + “(“ + query + “)” + “ “);

 

Pass it to QueryParse get the Query 

Query q = QueryParser.Parse(sbquerys.ToString(), SEARCH_KEY_ITEMDESC, new
StandardAnalyzer()); 

 

Execute the query searcher.Search(query); No Problem get the results etc.

 

Now the problem what I would like to do is restricted the results coming
back to only certain CategoryIDs so I changed my Qstr dynamically such like

 

                       string[] leafcats =
GET_ARRAY_OF_RELEVANT_CATIDS_AS_STRING_ARRAY();

                       sbquerys += “CategoryID:” + “(“;

                       for(int i=0; i< leafcats.Length; i++)

                       {

                              sbquerys += leafcats[i]+ “ “;

                       }

                       sbquerys += “)”;

 

Qstr += sbquerys

 

Pass it to QueryParse get the Query 

Query q = QueryParser.Parse(Qstr SEARCH_KEY_ITEMDESC, new
StandardAnalyzer());

 

Now not surprisingly the Parse Function calls takes upward of about 10
seconds (or throws a toomanybool clauses exception) I think this is due to
the fact that Im passing it with so many CategoryID (between 1 -30000)  

 

So I tried building up the Query using the API 

 

                              BooleanQuery bq = new  BooleanQuery();

                              

                              Term t = new Term(“itemdesc”, query);   <<<
Doesn’t work because query is something user has typed in and thus doesn’t
match properly

                              bq.Add(new TermQuery(t), false, false);

                              t = new Term(“itemSubTitle”, query);  

                              bq.Add(new TermQuery(t), false, false);

                              t = new Term(“itemtitle”, query);     

                              bq.Add(new TermQuery(t), false, false);

 

                       if (categoryid != null)

                       {            

                              string[] leafcats =
(string[])leafcategoryids.ToArray(typeof(string));

 

                                      t = new Term(SEARCH_KEY_CATEGORYID,
string.Join(" OR ",leafcats));  <<<< OR DOESN’T WORK

 

                                      bq.Add(new TermQuery(t), true, false);
<< NOTE TRUE

                              }

 

 

But this doesn’t seem to execute the OR clause in the new Term value which
is not surprising because it’s a TermQuery which is supposed to be 1 value.
So now Im a bit lost as to where to take this perhaps QueryFilter would do
the trick but Im lost on the syntax and just don’t feel that is the right
approach.

 

 

Anyhow any help you can provide would be greatly appreciate

 

-A

 

 


RE: Beginner Querying multiple fields pls help

Posted by Ashley Rajaratnam <as...@avalonserv.com>.
Hi Joshi

Thanks for the reply!

I had already done that before but failed to put it in the code in the
original post

	if (BooleanQuery.GetMaxClauseCount() < MAX_CLAUSE_COUNT)
			BooleanQuery.SetMaxClauseCount(MAX_CLAUSE_COUNT);

Im using Lucene 1.9 that fixes the problem if you are using the API to build
the BooleanQuery manually but the QueryParser.Parse still throws.

Anyhow unfortunately I don't think that is the answer basically I would like
to query 

Itemtitle with what a user types in and
itemsubtitle with what a user types in
restricted by certain values for CategoryID 


Let me know 
Cheers A

-----Original Message-----
From: Hemant Joshi [mailto:hemantmjoshi@gmail.com] 
Sent: Friday, January 20, 2006 4:25 PM
To: java-user@lucene.apache.org
Subject: Re: Beginner Querying multiple fields pls help

You have to set bq.setMaxClauseCount value as the default number of 
clauses BooleanQuery supports is 1024.
I am guessing you have categoryIDs between 1-30000 which means more than 
1024 clauses.
-Hemant


      setMaxClauseCount


Ashley Rajaratnam wrote:

>Hi,
>
> 
>
>Please forgive me if this comes across as being naïve however Ive bashed my
>head against it for a while and can’t come up with a solution.
>
> 
>
>Overview:
>
> 
>
>I have the following basic document structure:
>
>…
>
>Document doc = new Document();
>
>doc.Add(Field.Text("itemtitle", iteminf.itemtitle));
>
>doc.Add(Field.Text("itemSubTitle", iteminf.itemSubTitle));
>
>doc.Add(Field.Keyword("CategoryID", iteminf.CategoryID));  -- THIS ID (1
>-30000) is passed in here as a string
>
>writer.AddDocument(doc);
>
>…
>
> 
>
>What I would like to do is search the title and subtitle fields with what a
>user types in so I build up a search query string 
>
> 
>
>        Qstr = (query + “ “);
>
>        Qstr += (“itemtitle:” + “(“ + query + “)” + “ “);
>
>        Qstr += (“itemSubTitle:” + “(“ + query + “)” + “ “);
>
> 
>
>Pass it to QueryParse get the Query 
>
>Query q = QueryParser.Parse(sbquerys.ToString(), SEARCH_KEY_ITEMDESC, new
>StandardAnalyzer()); 
>
> 
>
>Execute the query searcher.Search(query); No Problem get the results etc.
>
> 
>
>Now the problem what I would like to do is restricted the results coming
>back to only certain CategoryIDs so I changed my Qstr dynamically such like
>
> 
>
>                       string[] leafcats =
>GET_ARRAY_OF_RELEVANT_CATIDS_AS_STRING_ARRAY();
>
>                       sbquerys += “CategoryID:” + “(“;
>
>                       for(int i=0; i< leafcats.Length; i++)
>
>                       {
>
>                              sbquerys += leafcats[i]+ “ “;
>
>                       }
>
>                       sbquerys += “)”;
>
> 
>
>Qstr += sbquerys
>
> 
>
>Pass it to QueryParse get the Query 
>
>Query q = QueryParser.Parse(Qstr SEARCH_KEY_ITEMDESC, new
>StandardAnalyzer());
>
> 
>
>Now not surprisingly the Parse Function calls takes upward of about 10
>seconds (or throws a toomanybool clauses exception) I think this is due to
>the fact that Im passing it with so many CategoryID (between 1 -30000)  
>
> 
>
>So I tried building up the Query using the API 
>
> 
>
>                              BooleanQuery bq = new  BooleanQuery();
>
>                              
>
>                              Term t = new Term(“itemdesc”, query);   <<<
>Doesn’t work because query is something user has typed in and thus doesn’t
>match properly
>
>                              bq.Add(new TermQuery(t), false, false);
>
>                              t = new Term(“itemSubTitle”, query);  
>
>                              bq.Add(new TermQuery(t), false, false);
>
>                              t = new Term(“itemtitle”, query);     
>
>                              bq.Add(new TermQuery(t), false, false);
>
> 
>
>                       if (categoryid != null)
>
>                       {            
>
>                              string[] leafcats =
>(string[])leafcategoryids.ToArray(typeof(string));
>
> 
>
>                                      t = new Term(SEARCH_KEY_CATEGORYID,
>string.Join(" OR ",leafcats));  <<<< OR DOESN’T WORK
>
> 
>
>                                      bq.Add(new TermQuery(t), true,
false);
><< NOTE TRUE
>
>                              }
>
> 
>
> 
>
>But this doesn’t seem to execute the OR clause in the new Term value which
>is not surprising because it’s a TermQuery which is supposed to be 1 value.
>So now Im a bit lost as to where to take this perhaps QueryFilter would do
>the trick but Im lost on the syntax and just don’t feel that is the right
>approach.
>
> 
>
> 
>
>Anyhow any help you can provide would be greatly appreciate
>
> 
>
>-A
>
> 
>
> 
>
>
>  
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Beginner Querying multiple fields pls help

Posted by Hemant Joshi <he...@gmail.com>.
You have to set bq.setMaxClauseCount value as the default number of 
clauses BooleanQuery supports is 1024.
I am guessing you have categoryIDs between 1-30000 which means more than 
1024 clauses.
-Hemant


      setMaxClauseCount


Ashley Rajaratnam wrote:

>Hi,
>
> 
>
>Please forgive me if this comes across as being naïve however Ive bashed my
>head against it for a while and can’t come up with a solution.
>
> 
>
>Overview:
>
> 
>
>I have the following basic document structure:
>
>…
>
>Document doc = new Document();
>
>doc.Add(Field.Text("itemtitle", iteminf.itemtitle));
>
>doc.Add(Field.Text("itemSubTitle", iteminf.itemSubTitle));
>
>doc.Add(Field.Keyword("CategoryID", iteminf.CategoryID));  -- THIS ID (1
>-30000) is passed in here as a string
>
>writer.AddDocument(doc);
>
>…
>
> 
>
>What I would like to do is search the title and subtitle fields with what a
>user types in so I build up a search query string 
>
> 
>
>        Qstr = (query + “ “);
>
>        Qstr += (“itemtitle:” + “(“ + query + “)” + “ “);
>
>        Qstr += (“itemSubTitle:” + “(“ + query + “)” + “ “);
>
> 
>
>Pass it to QueryParse get the Query 
>
>Query q = QueryParser.Parse(sbquerys.ToString(), SEARCH_KEY_ITEMDESC, new
>StandardAnalyzer()); 
>
> 
>
>Execute the query searcher.Search(query); No Problem get the results etc.
>
> 
>
>Now the problem what I would like to do is restricted the results coming
>back to only certain CategoryIDs so I changed my Qstr dynamically such like
>
> 
>
>                       string[] leafcats =
>GET_ARRAY_OF_RELEVANT_CATIDS_AS_STRING_ARRAY();
>
>                       sbquerys += “CategoryID:” + “(“;
>
>                       for(int i=0; i< leafcats.Length; i++)
>
>                       {
>
>                              sbquerys += leafcats[i]+ “ “;
>
>                       }
>
>                       sbquerys += “)”;
>
> 
>
>Qstr += sbquerys
>
> 
>
>Pass it to QueryParse get the Query 
>
>Query q = QueryParser.Parse(Qstr SEARCH_KEY_ITEMDESC, new
>StandardAnalyzer());
>
> 
>
>Now not surprisingly the Parse Function calls takes upward of about 10
>seconds (or throws a toomanybool clauses exception) I think this is due to
>the fact that Im passing it with so many CategoryID (between 1 -30000)  
>
> 
>
>So I tried building up the Query using the API 
>
> 
>
>                              BooleanQuery bq = new  BooleanQuery();
>
>                              
>
>                              Term t = new Term(“itemdesc”, query);   <<<
>Doesn’t work because query is something user has typed in and thus doesn’t
>match properly
>
>                              bq.Add(new TermQuery(t), false, false);
>
>                              t = new Term(“itemSubTitle”, query);  
>
>                              bq.Add(new TermQuery(t), false, false);
>
>                              t = new Term(“itemtitle”, query);     
>
>                              bq.Add(new TermQuery(t), false, false);
>
> 
>
>                       if (categoryid != null)
>
>                       {            
>
>                              string[] leafcats =
>(string[])leafcategoryids.ToArray(typeof(string));
>
> 
>
>                                      t = new Term(SEARCH_KEY_CATEGORYID,
>string.Join(" OR ",leafcats));  <<<< OR DOESN’T WORK
>
> 
>
>                                      bq.Add(new TermQuery(t), true, false);
><< NOTE TRUE
>
>                              }
>
> 
>
> 
>
>But this doesn’t seem to execute the OR clause in the new Term value which
>is not surprising because it’s a TermQuery which is supposed to be 1 value.
>So now Im a bit lost as to where to take this perhaps QueryFilter would do
>the trick but Im lost on the syntax and just don’t feel that is the right
>approach.
>
> 
>
> 
>
>Anyhow any help you can provide would be greatly appreciate
>
> 
>
>-A
>
> 
>
> 
>
>
>  
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Beginner Querying multiple fields pls help

Posted by Chris Hostetter <ho...@fucit.org>.
1) instead of building up a string, and giving it to QueryParser, i would
suggest you look at the MultiFieldQueryParser

2) it sounds like what you are interested regarding the category field, is
not to restrict your search to a particular handfull of categoryIds, but
to a large set of categories that have something in common.  i'm not sure
what it is that these categories have in common, but perhaps you could
make a field for it and put that in each document.   (ie:
doc.add(Field.Keyword("in_leaf_cat","1")))

if not, if you really do just want to restrict to all categories with an
id between 1 and 30000, i suggest you use a RangeFilter.  it should work
for you perfectly.

if there isn't some common property of hte categories you are interested
in that you can specify, nor are the category ids all truely sequential,
and you still need some way to specify an arbitrarily large set of
category ids to restrict your results to, a Filter is still probably a
better way to go then just building up a mongo BooleanQuery.  search the
mailing list archives for discussion about a "SetFilter" ... it's not
something anyone has written an commited, but it is a fairly simpel
concept to impliment given some of the hints from previous messages.




-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org