Posted to user@lucenenet.apache.org by Sudhanya Chatterjee <su...@persistent.co.in> on 2008/04/07 11:02:42 UTC

lucene.net query

Hi,

 

In our application we need to implement a "starts with" kind of search.

The search fields are tokenized, and we index them with the whitespace
tokenizer.

We have tried building such queries with WildcardQuery, with "*" at the
end. That works fine for single-word queries.

For multi-word queries we need the following behavior:

Search key: "abc def"

Results should list: "abc def"
                     "abc defg"
                     "abc defgh", etc.

That is, the results should contain every entry in which def* appears
immediately after abc.

If we create the query directly in the following way:

Query q = new WildcardQuery(new Term(field, "abc def" + "*"));

it does not work. How should the query be built to achieve this?

On the same fields we also need approximate search, for which we are using
fuzzy query.

The search fields must stay tokenized, because a few of the other search
types we perform require it.

Thanks,

Sudhanya


DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

RE: lucene.net query

Posted by Digy <di...@gmail.com>.
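Adding an UN_TOKENIZED field to your index can solve the problem. With the
whitespace analyzer, "abc def" is indexed as two separate terms ("abc" and
"def"), so a wildcard pattern that contains a space never matches any single
term. An untokenized field keeps the whole value as one term.
For example: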
<PRE>

      // Requires: using Lucene.Net.Analysis; using Lucene.Net.Documents;
      //           using Lucene.Net.Index;    using Lucene.Net.Search;

      // Create the index (true = create a new index, overwriting any existing one)
      IndexWriter wr = new IndexWriter("IndexXXX", new WhitespaceAnalyzer(), true);

      // field1 is tokenized (terms "abc", "def");
      // field2 keeps the whole value "abc def" as a single term
      Document doc = new Document();
      doc.Add(new Field("field1", "abc def", Field.Store.YES, Field.Index.TOKENIZED));
      doc.Add(new Field("field2", "abc def", Field.Store.YES, Field.Index.UN_TOKENIZED));
      wr.AddDocument(doc);

      doc = new Document();
      doc.Add(new Field("field1", "abc defg", Field.Store.YES, Field.Index.TOKENIZED));
      doc.Add(new Field("field2", "abc defg", Field.Store.YES, Field.Index.UN_TOKENIZED));
      wr.AddDocument(doc);

      wr.Close();


      // Search
      IndexSearcher searcher = new IndexSearcher("IndexXXX");

      // Tokenized field: no single term contains a space, so nothing matches
      Query query = new WildcardQuery(new Term("field1", "abc def" + "*"));
      Hits hits = searcher.Search(query);
      int resCount = hits.Length(); // <-- returns 0 results

      // Untokenized field: the whole value is one term, so both documents match
      query = new WildcardQuery(new Term("field2", "abc def" + "*"));
      hits = searcher.Search(query);
      resCount = hits.Length(); // <-- returns 2 results
      searcher.Close();

</PRE>
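
To make the difference between the two fields concrete, here is a rough sketch in plain C# that only mimics the behavior and makes no Lucene.Net calls. Everything in it (StartsWithSketch, Tokenize, TermMatches, PhrasePrefixMatches) is a hypothetical helper written for illustration, not part of any library. It assumes a whitespace tokenizer that emits one term per word and a wildcard that is tested against one term at a time.

```csharp
using System;
using System.Linq;

// Illustration only: plain C#, no Lucene.Net. All names below are hypothetical.
public static class StartsWithSketch
{
    // Mimics a whitespace tokenizer: one term per whitespace-separated word.
    public static string[] Tokenize(string text) =>
        text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

    // Trailing-"*" wildcard test against ONE term at a time.
    public static bool TermMatches(string term, string pattern) =>
        pattern.EndsWith("*", StringComparison.Ordinal)
            ? term.StartsWith(pattern.Substring(0, pattern.Length - 1), StringComparison.Ordinal)
            : term == pattern;

    // Positional match over tokens: every query word except the last must equal
    // consecutive tokens, and the last query word only needs to be a prefix.
    public static bool PhrasePrefixMatches(string[] tokens, string query)
    {
        string[] q = Tokenize(query);
        if (q.Length == 0) return false;
        for (int start = 0; start + q.Length <= tokens.Length; start++)
        {
            bool ok = true;
            for (int i = 0; i < q.Length && ok; i++)
            {
                string tok = tokens[start + i];
                ok = i < q.Length - 1
                    ? tok == q[i]
                    : tok.StartsWith(q[i], StringComparison.Ordinal);
            }
            if (ok) return true;
        }
        return false;
    }

    public static void Main()
    {
        string[] docs = { "abc def", "abc defg" };
        const string pattern = "abc def*";

        // Like field1: the terms are "abc", "def", "defg"; none contains a space.
        int tokenizedHits = docs.Count(d => Tokenize(d).Any(t => TermMatches(t, pattern)));
        // Like field2: the whole value is a single term.
        int untokenizedHits = docs.Count(d => TermMatches(d, pattern));
        // Positional prefix match evaluated on the tokenized terms.
        int phraseHits = docs.Count(d => PhrasePrefixMatches(Tokenize(d), "abc def"));

        Console.WriteLine($"{tokenizedHits} {untokenizedHits} {phraseHits}");
    }
}
```

Run as a console program, this prints 0 2 2: the wildcard finds nothing on the tokenized terms, finds both values when each is kept as one term, and the positional check shows the "def* immediately after abc" behavior is still expressible over tokens. Lucene does have a MultiPhraseQuery class aimed at that kind of positional match on a tokenized field, but whether it suits this index is not something this thread establishes.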


DIGY
