Posted to java-user@lucene.apache.org by Terry Steichen <te...@net-frame.com> on 2002/09/18 21:49:04 UTC

Peculiar matching problem

I'm using 1.2RC5 with the StandardAnalyzer (using the default stop words). In the course of my development I've discovered that when field contents contain a dash ("-") that is significant, I can't search them properly. So, as part of the indexing process, I simply change the dashes to underscores.
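For illustration, the dash-to-underscore step could look something like the sketch below. It is plain Java with no Lucene calls, and the class and method names (CategoryCodes, dashesToUnderscores) are hypothetical, made up for this example rather than taken from the actual indexing code.

```java
// Hypothetical helper: the names are illustrative, not from the real indexer.
final class CategoryCodes {

    private CategoryCodes() {
        // static utility, not meant to be instantiated
    }

    // Replaces every dash in the raw field text with an underscore.
    // Assumes the caller passes a non-null string.
    static String dashesToUnderscores(String raw) {
        return raw.replace('-', '_');
    }
}
```

The result of CategoryCodes.dashesToUnderscores(text) is what would then be handed to the indexer as the contents of the 'cat' field, so "ap-this-story" is stored as "ap_this_story".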

This seems to work just fine, except when the text to be indexed (in a field called 'cat') is something like "ap_this_story". Then it fails.

I can get hits just fine with a 'cat:ap*' query, but not with 'cat:ap_*' (or 'cat:ap-*', for that matter).

There are many other codes that use underscores, such as "zz_codes" and "e_sources", and these work just fine. The failure seems to occur only when the first two characters are "ap" (though of course there may be other cases I've not yet discovered).

I've looked through the stop words to see if there's some match there, but it doesn't look like there is.

I'd appreciate any thoughts anyone might share on what might be going on here.

Regards,

Terry