You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Paul Taylor <pa...@fastmail.fm> on 2009/07/29 12:03:31 UTC

How do you Parse a query to convert numbers to strings

I think this is a common problem, but don't know the correct solution.

Users were doing queries on a numeric field such as qdur:[73 TO 117] and 
expecting to find all the values within but this fails because lucene 
treats the numbers as strings and just does alphabetical search. So I 
now index the field using
NumberTools.longToString()  so allows equivalent search  i.e. 
qdur:[00000000000021 TO 00000000000039] which works correctly.

Trouble user types query into a free text field and cannot expect user 
to do the conversion, so I need to convert qdur:[73 TO 117] to 
qdur:[00000000000021 TO 00000000000039] before doing the lucene search, 
Im trying to do this with regular expressions but not convinced its safe 
for all variations, is there a better way (Im using Lucene 2.9)?


Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How do you Parse a query to convert numbers to strings

Posted by Paul Taylor <pa...@fastmail.fm>.
Paul Taylor wrote:
> I think this is a common problem, but don't know the correct solution.
>
> Users were doing queries on a numeric field such as qdur:[73 TO 117] 
> and expecting to find all the values within but this fails because 
> lucene treats the numbers as strings and just does alphabetical 
> search. So I now index the field using
> NumberTools.longToString()  so allows equivalent search  i.e. 
> qdur:[00000000000021 TO 00000000000039] which works correctly.
>
> Trouble user types query into a free text field and cannot expect user 
> to do the conversion, so I need to convert qdur:[73 TO 117] to 
> qdur:[00000000000021 TO 00000000000039] before doing the lucene 
> search, Im trying to do this with regular expressions but not 
> convinced its safe for all variations, is there a better way (Im using 
> Lucene 2.9)?
>
>
> Paul
>
And this is my code:

public class TrackMangler implements QueryMangler{

    private Pattern matchDur;
    private Pattern matchDurrRange;
   
    public TrackMangler()
    {
        matchDur         = Pattern.compile("(dur:)([0-9]+)");
        matchDurRange    = Pattern.compile("(dur:\\[)([0-9]+)( TO 
)([0-9]+)(\\])");
       
    }

 private String convertDurToOrderable(String query)
    {
        Matcher m =  matchDur.matcher(query);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
             int duration = Integer.parseInt(m.group(2));
             m.appendReplacement(sb, m.group(1) + 
NumberTools.longToString(duration));
        }
        m.appendTail(sb);
        query =  sb.toString();

        m  =  matchDurRange.matcher(query);      
        sb = new StringBuffer();
        while (m.find()) {
             int firstTrackNo = Integer.parseInt(m.group(2));
             int secondTrackNo = Integer.parseInt(m.group(4));
             m.appendReplacement(sb, m.group(1)
                     + NumberTools.longToString(firstTrackNo)
                     + m.group(3)
                     + NumberTools.longToString(secondTrackNo)
                     + m.group(5));
        }
        m.appendTail(sb);
        return sb.toString();
    }

}

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org