Posted to java-user@lucene.apache.org by SBS <jt...@uow.edu.au> on 2011/08/23 04:36:16 UTC

Searching behaviour with content containing decimal points

I have content such as "E71.0" and when I enter a search query of "E71" I
would like it to match that document.  At the moment though it only matches
that document if I enter "E71*" or "E71.0".

What's the trick to getting such a query to match this document?  I am using
StandardAnalyzer and QueryParser at the moment in Lucene Java 3.2.

Thanks,

-sbs

--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-behaviour-with-content-containing-decimal-points-tp3276878p3276878.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Searching behaviour with content containing decimal points

Posted by SBS <jt...@uow.edu.au>.
> I have content such as "E71.0" and when I enter a search query of "E71" I
> would like it to match that document.  At the moment though it only matches
> that document if I enter "E71*" or "E71.0".
>
> What's the trick to getting such a query to match this document?  I am using
> StandardAnalyzer and QueryParser at the moment in Lucene Java 3.2.

I have included the previous post as I realise not everyone accesses this
content via the web.

-sbs
 




Re: Searching behaviour with content containing decimal points

Posted by Erick Erickson <er...@gmail.com>.
You have to do some normalizing here, and I don't think there's
anything available out of the box, so you'll probably have to roll
your own filter that does the normalization for this field.

Be a little cautious, though. Your example, while fine in itself, may not
generalize. Your normalization rule might be "remove all non-alphanumeric
characters and drop trailing zeros". But does that still make sense when
applied to, say, part numbers? Under that rule 123000 and 1230 both
normalize to 123, so they would match each other. Is that what your user
base expects?
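
To make that rule and its side effect concrete, here is a minimal plain-Java
sketch of the normalization step only. The class name NormalizeSketch and the
method normalize are hypothetical names chosen for illustration; none of this
is Lucene API. In a real setup, logic along these lines would presumably run
once per token inside the custom filter, at both index time and query time.

```java
class NormalizeSketch {
    // Hypothetical helper: applies the example rule to a single token.
    static String normalize(String token) {
        // Step 1: remove every character that is not an ASCII letter or digit.
        String stripped = token.replaceAll("[^A-Za-z0-9]", "");
        // Step 2: drop trailing zeros.
        return stripped.replaceAll("0+$", "");
    }

    public static void main(String[] args) {
        System.out.println(normalize("E71.0"));  // E71
        System.out.println(normalize("E71"));    // E71
        // The side effect: two different part numbers collapse to one term.
        System.out.println(normalize("123000")); // 123
        System.out.println(normalize("1230"));   // 123
    }
}
```

The first two calls show why "E71" would then match "E71.0"; the last two show
the over-matching that the rule can cause on other kinds of data.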

Anyway, you'll probably be making your own custom Analyzer
for this by chaining together, say, WhitespaceTokenizer with
your custom Filter.

Best
Erick

On Thu, Aug 25, 2011 at 8:03 PM, Josh Rehman <jo...@joshrehman.com> wrote:
> Actually I have this issue too. I've played around with various analyzers,
> and I would expect the WhitespaceAnalyzer to work (at least) but it does
> not.
>
> On Thu, Aug 25, 2011 at 4:58 PM, SBS <jt...@uow.edu.au> wrote:
>
>> Can anyone help me with this?  Do you require further information?  This
>> has
>> become a serious issue for us.
>>
>> Thanks,
>>
>> -sbs
>>
>



Re: Searching behaviour with content containing decimal points

Posted by Josh Rehman <jo...@joshrehman.com>.
Actually I have this issue too. I've played around with various analyzers,
and I would expect the WhitespaceAnalyzer to work (at least) but it does
not.

On Thu, Aug 25, 2011 at 4:58 PM, SBS <jt...@uow.edu.au> wrote:

> Can anyone help me with this?  Do you require further information?  This
> has
> become a serious issue for us.
>
> Thanks,
>
> -sbs
>

Re: Searching behaviour with content containing decimal points

Posted by SBS <jt...@uow.edu.au>.
Can anyone help me with this?  Do you require further information?  This has
become a serious issue for us.

Thanks,

-sbs
