Posted to dev@lucene.apache.org by hg...@cswebmail.com on 2004/04/05 16:50:04 UTC

Search Expansion - more

On Sun, 4 Apr 2004 13:42:45 -0400, Erik Hatcher wrote:
> You could, perhaps, take an easier way out and run text through an
> Analyzer as you build up your query, without using QueryParser.  Look,
> again, at my AnalysisDemo code in the java.net article.... just pull
> what you need from there to process a TokenStream out of an Analyzer.
>
> 	Erik
>

Erik,

I managed to process the TokenStream further, but it
does not let me search for "host defense"
as a phrase.


Here is what I've done:
(the input to the function is the string array myquery,
which contains e.g. myquery[1]="term1",
myquery[2]="host defense")


	BooleanQuery query = new BooleanQuery();

	//for each term to add:
	for (int j=0; j<myquery.length; j++){
		stream = analyzer.tokenStream("contents", new StringReader(myquery[j]));
		String str = "";
		while (true){
			Token token = stream.next();
			if (token == null) break;
			str = str + token.termText() + " ";
		}
		query.add(new TermQuery(new Term("subject", str.trim())), false, false);
	}

With this code I tried to reassemble the single tokens
that are probably coming out of the analyzer, such as
"host" and "defense", back into "host defense", but it
doesn't find "host defense" for me.

Holger :-(

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


Re: Search Expansion - more

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
I would really love to spend some time on this thread, but it is hard 
for me to carve out the time to create my own example.  If you 
could come up with a self-contained example (use RAMDirectory, 
hard-code strings for indexing a single document, and then your query 
code) I will be able to look at it more thoroughly.  There are too 
many variables in your equation for me to make sense of it via piecemeal 
e-mail code chunks.

You're building TermQuery instances, and a TermQuery is all about a 
single term, not a phrase.  So I'm not clear on how you are even 
attempting a PhraseQuery with what you've shown.

	Erik
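
To illustrate the point about single terms versus phrases, here is a
minimal, self-contained Java sketch.  It is a toy model and not Lucene
code: the class ToyPositionalIndex, its methods, and the whitespace
tokenizer standing in for an Analyzer are all hypothetical.  It assumes
the index stores "host" and "defense" as separate terms with positions,
so a single-term lookup for the joined string "host defense" finds
nothing, while a position-aware phrase check does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy single-document positional index; a hypothetical model, not Lucene's API. */
final class ToyPositionalIndex {
    // term -> positions at which the term occurs in the document
    private final Map<String, List<Integer>> postings = new HashMap<>();

    ToyPositionalIndex(String text) {
        // Stand-in for an Analyzer (assumption): lowercase, split on whitespace.
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], k -> new ArrayList<>()).add(pos);
        }
    }

    /** Models a single-term lookup: the whole string must equal one indexed term. */
    boolean matchesTerm(String term) {
        return postings.containsKey(term);
    }

    /** Models a phrase lookup: the terms must occur at consecutive positions. */
    boolean matchesPhrase(String... terms) {
        if (terms.length == 0) {
            return false;
        }
        List<Integer> starts = postings.getOrDefault(terms[0], Collections.<Integer>emptyList());
        for (int start : starts) {
            boolean allFound = true;
            for (int i = 1; i < terms.length && allFound; i++) {
                List<Integer> positions =
                        postings.getOrDefault(terms[i], Collections.<Integer>emptyList());
                allFound = positions.contains(start + i);
            }
            if (allFound) {
                return true;
            }
        }
        return false;
    }
}
```

For a document such as "Innate host defense mechanisms",
matchesTerm("host defense") is false because no single indexed term
contains a space, while matchesPhrase("host", "defense") is true.  In
Lucene itself the position-aware role is played by PhraseQuery, which
Erik mentions above.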




Re: Search Expansion - more

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Apr 5, 2004, at 10:50 AM, hgadm@cswebmail.com wrote:
> Here is what I've done:
> (input to the function is by the string array myquery
> which contains e.g. myquery[1]="term1",
> myquery[2]="host defense")
>
>
> 	BooleanQuery query = new BooleanQuery();
>
> 	//for each term to add:
> 	for (int j=0; j<myquery.length; j++){
> 		stream = analyzer.tokenStream("contents", new StringReader(myquery[j]));
> 		String str = "";
> 		while (true){
> 			Token token = stream.next();
> 			if (token == null) break;
> 			str = str + token.termText() + " ";
> 		}
> 		query.add(new TermQuery(new Term("subject", str.trim())), false, false);
> 	}
>

This doesn't make sense to me.  Why are you appending a space and 
making a single TermQuery for each myquery entry?  (I don't understand 
why you have an array myquery either, but if you can build a simple 
standalone example - please keep it succinct - maybe I'll understand 
better).  I think you want something more like this:

stream = analyzer.tokenStream("contents", new StringReader(myquery[j]));
while (true){
   Token token = stream.next();
   if (token == null) break;
   query.add(new TermQuery(new Term("subject", token.termText())), false, false);
}
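
The loop above adds one clause per token, and the two false arguments
mark each clause as neither required nor prohibited.  As a rough sketch
of what that implies, assuming a boolean query made only of optional
clauses matches any document that contains at least one of the terms,
the toy Java below models the behaviour without Lucene.  The class
OptionalClauseSketch and its whitespace tokenizer are hypothetical
stand-ins, not part of any library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model of a boolean query built from optional per-token term clauses. */
final class OptionalClauseSketch {

    /** Stand-in for an Analyzer (assumption): lowercase and split on whitespace. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        String trimmed = text.toLowerCase(Locale.ROOT).trim();
        if (trimmed.isEmpty()) {
            return tokens;
        }
        for (String token : trimmed.split("\\s+")) {
            tokens.add(token);
        }
        return tokens;
    }

    /** True if the document contains at least one token of the query text. */
    static boolean matchesAnyTerm(String docText, String queryText) {
        Set<String> docTerms = new HashSet<>(tokenize(docText));
        for (String queryTerm : tokenize(queryText)) {
            if (docTerms.contains(queryTerm)) {
                return true; // one matching optional clause is enough
            }
        }
        return false;
    }
}
```

Under this model, the query text "host defense" matches a document
containing "host defense mechanisms", but it also matches one
containing only "web host pricing".  Per-token optional clauses are
therefore broader than a phrase match, which is worth keeping in mind
if the goal is to find the exact phrase.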

