Posted to java-user@lucene.apache.org by Honey George <ho...@yahoo.com> on 2004/09/09 13:41:14 UTC

Case sensitiveness and wildcard searches

Hi,
 I noticed a behavior with wildcard searches that I would
like to clarify.

According to the JGuru FAQ entry at
http://www.jguru.com/faq/view.jsp?EID=538312
the Analyzer is not used for wildcard queries.
In my case I have a document which contains the word
IMPORTANT. I use PorterStemFilter + StandardAnalyzer
for indexing & searching. I get the document if
I search for IM*. But if the analyzer is not used,
then what converts the search term to lowercase?

My code looks like this.

---
QueryParser qp = new QueryParser("title", new MyAnalyzer());
Query q = qp.parse(text);  // text is the query string, e.g. "IM*"
---


Though I pass the text in uppercase (IM*), when I
print the Query object I can see it in lowercase,
something like  (title:im*)

I am using lucene-1.3-final. Can someone explain this?

Thanks & regards,
   George

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Case sensitiveness and wildcard searches

Posted by René Hackl <re...@gmx.de>.
George,

The QueryParser does toLowerCase() on WildcardQueries by default. Hence
you'd need to follow Daniel's advice to use

 QueryParser's setLowercaseWildcardTerms(false) 

if you wanted IM* to stay IM*

Cheers,
René
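
The effect René describes can be sketched without Lucene. What follows is a toy model in plain Java, not Lucene's implementation: the class, its methods, and the term list are hypothetical, and only the boolean flag echoes setLowercaseWildcardTerms. It assumes the index holds lowercased terms (as after StandardAnalyzer) and that the query ends in a single trailing *.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class WildcardCaseSketch {

    // Terms as a lowercasing analyzer would have written them to the index.
    static final List<String> INDEXED_TERMS =
            Arrays.asList("important", "image", "index");

    // Hypothetical stand-in for the parser: the wildcard term bypasses the
    // analyzer, so this optional step is the only case handling it gets.
    static String parseWildcard(String text, boolean lowercaseWildcardTerms) {
        return lowercaseWildcardTerms ? text.toLowerCase(Locale.ROOT) : text;
    }

    // Case-sensitive prefix match of a trailing-* query against the index.
    static List<String> search(String text, boolean lowercaseWildcardTerms) {
        String term = parseWildcard(text, lowercaseWildcardTerms);
        String prefix = term.substring(0, term.length() - 1); // drop the '*'
        List<String> matches = new ArrayList<>();
        for (String indexed : INDEXED_TERMS) {
            if (indexed.startsWith(prefix)) {
                matches.add(indexed);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(search("IM*", true));  // [important, image]
        System.out.println(search("IM*", false)); // []
    }
}
```

With the flag on, IM* becomes im* and finds the lowercased terms, which matches what George saw in the printed query. With it off, the uppercase prefix finds nothing in a lowercased index.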




Re: Case sensitiveness and wildcard searches

Posted by Honey George <ho...@yahoo.com>.
Thanks for the links, René.
 The mails do not exactly cover my case, because the
StandardAnalyzer which I use does lowercase the
input. So it is the same scenario as the FAQ entry.

-George

 --- "René_Hackl" <re...@gmx.de> wrote: 
> Hi George,
> 
> I'm not sure about v1.3, but you may want to take a
> look at
> 
> http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=9342
> 
> or
> 
> http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgId=1806371
> 
> cheers,
> René


Re: Case sensitiveness and wildcard searches

Posted by René Hackl <re...@gmx.de>.
Hi George,

I'm not sure about v1.3, but you may want to take a look at

http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=9342

or

http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgId=1806371

cheers,
René



Re: (n00b) Meaning of Hits.id (int)

Posted by Peter Pimley <pp...@semantico.com>.
Oh, it's that simple. :)
Thanks for that!

Peter


Morus Walter wrote:

>It's Lucene's internal id, or document number, which allows you to access
>the document and its stored fields.
>
>See 
>IndexSearcher.doc(int i)
>or
>IndexReader.document(int n)
>
>The docs just don't name the parameter 'id'.


Re: (n00b) Meaning of Hits.id (int)

Posted by Morus Walter <mo...@tanto.de>.
Peter Pimley writes:
> 
> My documents are not stored in their original form by Lucene, but in a 
> separate database.  My Lucene docs do however store the primary key, so 
> that I can fetch the original version from the database to show the user 
> (does that sound sane?)
> 
yes.

> I see that the 'Hits' class has an id (int) method, which sounds 
> interesting.  The javadoc says "Returns the id for the nth document in 
> this set.".  However, I can't find any mention anywhere else about 
> Document ids.  Could anybody explain what this is?
> 
It's Lucene's internal id, or document number, which allows you to access
the document and its stored fields.

See 
IndexSearcher.doc(int i)
or
IndexReader.document(int n)

The docs just don't name the parameter 'id'.

Morus
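
The relationship Morus describes can be pictured with a small model. This is a plain-Java sketch, not Lucene code: DocIdSketch, its methods, and the "pk" field name are hypothetical, and the assumption that the internal id is a list position is a simplification. The document(int) method only plays the role of IndexSearcher.doc(int i) / IndexReader.document(int n).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DocIdSketch {

    // Toy index: a document's internal id is simply its position in this list.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Store a document's fields and return the internal id assigned to it.
    int add(Map<String, String> storedFields) {
        docs.add(storedFields);
        return docs.size() - 1;
    }

    // Look up the stored fields by internal id.
    Map<String, String> document(int id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        DocIdSketch index = new DocIdSketch();
        Map<String, String> fields = new HashMap<>();
        fields.put("pk", "article-1042"); // hypothetical database primary key
        int id = index.add(fields);
        // The internal id leads to the stored fields, which hold the key.
        System.out.println(id + " -> " + index.document(id).get("pk")); // 0 -> article-1042
    }
}
```

The internal number is only a handle into the index and can change when the index is rewritten, so a primary key kept as a stored field, as in Peter's setup, remains the dependable link back to the database.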



(n00b) Meaning of Hits.id (int)

Posted by Peter Pimley <pp...@semantico.com>.
Hello everyone.

I'm in the process of writing "my first lucene app", and I've got to the 
bit where I get my search results back (very exciting! ;).

My documents are not stored in their original form by Lucene, but in a 
separate database.  My Lucene docs do however store the primary key, so 
that I can fetch the original version from the database to show the user 
(does that sound sane?)

I see that the 'Hits' class has an id (int) method, which sounds 
interesting.  The javadoc says "Returns the id for the nth document in 
this set.".  However, I can't find any mention anywhere else about 
Document ids.  Could anybody explain what this is?

Many Thanks in Advance,
Peter Pimley, Semantico

