Posted to dev@lucene.apache.org by Matthew Denner <ma...@codemonkeyconsultancy.net> on 2005/02/07 17:56:30 UTC
Refactoring suggestion for query parsing and creation
Hi,
I have been working with Lucene for about a month now and I am really
impressed by the work of the people involved. I did, however, come across
something that I thought might be better refactored: extending the QueryParser
to add your own handling of various Query implementations. So I had a go at
introducing a QueryFactory interface that classes can implement to provide
construction of these Query implementations, and then an instance can be
passed to a QueryParser instance for it to use. I have a patch that provides
this if the developers of Lucene are interested but, because it is quite a
dramatic change (it removes a lot of deprecated methods, which I was very
worried about), I would prefer someone to take a look at it and see if they
think it is worthwhile.
The reason I made this patch is that I wanted to deal with integer ranges for
a particular field in my application and, like I said, extending QueryParser
felt wrong to me (and took me almost a week to actually find!). With the
patch I write:
    QueryParser parser =
        new QueryParser(
            "description",
            new StandardAnalyzer(),
            new SpecialQueryFactory(new QueryFactoryImpl()));
And my specialised QueryFactory implementation gets used during parsing. The
implementation of QueryFactoryImpl was created by moving those methods that
created Query instances in the original QueryParser into a separate class.
I also created a MultiFieldQueryFactory implementation that takes the methods
from MultiFieldQueryParser (effectively making it now redundant).
Personally I prefer the idea of composition over inheritance, in this
circumstance, but I can understand why other people would not want this.
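
To make the composition idea concrete, here is a minimal, self-contained sketch of how a decorating factory could handle integer ranges for one field. It does not use the real Lucene classes or the patch itself: the QueryFactory interface, its method names, QueryFactoryImpl, and IntegerRangeQueryFactory below are hypothetical stand-ins, queries are modelled as plain strings, and the zero-padding width of 10 (for non-negative integers only) is an assumption chosen for illustration.

```java
/** Hypothetical stand-in for the proposed interface; queries are plain strings here. */
interface QueryFactory {
    String getFieldQuery(String field, String queryText);
    String getRangeQuery(String field, String lower, String upper, boolean inclusive);
}

/** Default behaviour, standing in for what the parser would otherwise build itself. */
class QueryFactoryImpl implements QueryFactory {
    public String getFieldQuery(String field, String queryText) {
        return field + ":" + queryText;
    }

    public String getRangeQuery(String field, String lower, String upper, boolean inclusive) {
        String open = inclusive ? "[" : "{";
        String close = inclusive ? "]" : "}";
        return field + ":" + open + lower + " TO " + upper + close;
    }
}

/** Decorator: zero-pads range bounds for one integer field so they compare correctly as text. */
class IntegerRangeQueryFactory implements QueryFactory {
    private final String intField;
    private final QueryFactory delegate;

    IntegerRangeQueryFactory(String intField, QueryFactory delegate) {
        this.intField = intField;
        this.delegate = delegate;
    }

    public String getFieldQuery(String field, String queryText) {
        // Nothing special to do for plain field queries.
        return delegate.getFieldQuery(field, queryText);
    }

    public String getRangeQuery(String field, String lower, String upper, boolean inclusive) {
        if (intField.equals(field)) {
            // Assumption: bounds are non-negative integers that fit in 10 digits.
            return delegate.getRangeQuery(field, pad(lower), pad(upper), inclusive);
        }
        return delegate.getRangeQuery(field, lower, upper, inclusive);
    }

    private static String pad(String value) {
        return String.format("%010d", Integer.parseInt(value));
    }
}

class IntegerRangeDemo {
    public static void main(String[] args) {
        QueryFactory factory = new IntegerRangeQueryFactory("price", new QueryFactoryImpl());
        // prints price:[0000000005 TO 0000000120]
        System.out.println(factory.getRangeQuery("price", "5", "120", true));
        // prints title:{a TO m}
        System.out.println(factory.getRangeQuery("title", "a", "m", false));
    }
}
```

The parser only ever sees the QueryFactory interface, so the special handling lives entirely in the decorator and no QueryParser subclass is needed.
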
Anyway, if someone would like to see the patch (made against the HEAD of the
Lucene CVS code) I can provide it; or you can tell me where to go! Either
way, Lucene has still made searching so much easier and I thank you.
Matt
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
Re: Refactoring suggestion for query parsing and creation
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
The MockObjects dependency is fine with me - it is a nice way to unit
test for sure. The important thing is that it is Apache Software
Licensed, which MO is.
Erik
On Feb 7, 2005, at 3:33 PM, Matthew Denner wrote:
> On Monday 07 February 2005 19:19, Daniel Naber wrote:
>> On Monday 07 February 2005 17:56, Matthew Denner wrote:
>>> QueryParser parser =
>>> new QueryParser(
>>> "description",
>>> new StandardAnalyzer(),
>>> new SpecialQueryFactory(new QueryFactoryImpl()));
>>
>> This sounds interesting, could you create a bug report (see "Lucene
>> Bugs" on the homepage) and then attach your patch?
>
> I will do this; I just need to finish off the unit-tests and work the
> changes into the QueryParser.jj file (I did them directly on the Java
> source). Would anyone mind if I added the MockObjects
> (http://www.mockobjects.com/) JAR as a dependency as it makes testing
> the LowerCaseQueryFactory and MultiFieldQueryFactory implementations
> slightly easier?
>
> Matt
Re: Refactoring suggestion for query parsing and creation
Posted by Matthew Denner <ma...@codemonkeyconsultancy.net>.
On Monday 07 February 2005 19:19, Daniel Naber wrote:
> On Monday 07 February 2005 17:56, Matthew Denner wrote:
> > QueryParser parser =
> > new QueryParser(
> > "description",
> > new StandardAnalyzer(),
> > new SpecialQueryFactory(new QueryFactoryImpl()));
>
> This sounds interesting, could you create a bug report (see "Lucene Bugs"
> on the homepage) and then attach your patch?
I will do this; I just need to finish off the unit-tests and work the changes
into the QueryParser.jj file (I did them directly on the Java source). Would
anyone mind if I added the MockObjects (http://www.mockobjects.com/) JAR as a
dependency as it makes testing the LowerCaseQueryFactory and
MultiFieldQueryFactory implementations slightly easier?
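
As a rough illustration of why a mock helps when testing a decorating factory, here is a minimal sketch that uses a hand-written recording stub rather than the MockObjects library (whose API is not shown in this thread). Everything in it is an assumption made for illustration: the single-method QueryFactory interface, the getPrefixQuery name, the guess that LowerCaseQueryFactory lower-cases the text before delegating, and the use of plain strings in place of Query objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical, cut-down stand-in for the proposed interface. */
interface QueryFactory {
    String getPrefixQuery(String field, String prefix);
}

/** Hand-written mock: records every call it receives so a test can inspect them. */
class RecordingQueryFactory implements QueryFactory {
    final List<String> calls = new ArrayList<>();

    public String getPrefixQuery(String field, String prefix) {
        calls.add(field + ":" + prefix);
        return field + ":" + prefix + "*";
    }
}

/** Assumed behaviour: lower-case the query text, then hand over to the delegate. */
class LowerCaseQueryFactory implements QueryFactory {
    private final QueryFactory delegate;

    LowerCaseQueryFactory(QueryFactory delegate) {
        this.delegate = delegate;
    }

    public String getPrefixQuery(String field, String prefix) {
        return delegate.getPrefixQuery(field, prefix.toLowerCase(Locale.ROOT));
    }
}

class LowerCaseQueryFactoryDemo {
    public static void main(String[] args) {
        RecordingQueryFactory recorder = new RecordingQueryFactory();
        QueryFactory factory = new LowerCaseQueryFactory(recorder);

        String result = factory.getPrefixQuery("title", "LuCe");

        // The decorator should have passed the lower-cased text to its delegate exactly once.
        System.out.println(result);          // prints title:luce*
        System.out.println(recorder.calls);  // prints [title:luce]
    }
}
```

Because the decorator depends only on the interface, the test never needs a real analyzer or index; it just checks what was forwarded.
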
Matt
Re: Refactoring suggestion for query parsing and creation
Posted by Daniel Naber <da...@t-online.de>.
On Monday 07 February 2005 17:56, Matthew Denner wrote:
> QueryParser parser =
> new QueryParser(
> "description",
> new StandardAnalyzer(),
> new SpecialQueryFactory(new QueryFactoryImpl()));
This sounds interesting, could you create a bug report (see "Lucene Bugs"
on the homepage) and then attach your patch?
Regards
Daniel
--
http://www.danielnaber.de
Re: Refactoring suggestion for query parsing and creation
Posted by Matthew Denner <ma...@codemonkeyconsultancy.net>.
On Monday 07 February 2005 19:17, David Spencer wrote:
> This is interesting and may match what I've wanted - a way to register
> customized query expansion modules w/ the query parser so that you can
> experiment with different ways of expanding the query e.g. I wrote a
> "WordNet" package that's in the sandbox that expands terms by adding
> their synonyms e.g. "big" might expand to "big large^.9 huge^.9".
I took a quick look at the WordNet code in CVS and I'm not quite sure how it
works, so I'm working from your description below!
> So is it safe to assume I could use your code like this (with 2 lines
> changed from your example):
>
> QueryParser parser =
>     new QueryParser(
>         "synonym", // pseudo field?
>         new StandardAnalyzer(),
>         new SpecialQueryFactory(new WordNetQueryFactoryImpl())); // wordnet factory?
It would probably be something like:
    QueryParser parser =
        new QueryParser(
            "defaultSearchField", // default search field (as normal)
            new StandardAnalyzer(),
            new WordNetQueryFactoryImpl("synonym", new QueryFactoryImpl()));
I'm thinking that an instance of your QueryFactory implementation would
effectively decorate another instance and might have your pseudo-field passed
as a parameter. It would then be implemented to catch any pseudo-field (in
this case "synonym") queries and process the text itself.
> Then a query can use the pseudo-field "synonym" any place:
>
> "+synonym:big dog"
>
> expands to:
>
> "+(big large^.9 huge^.9) dog"
Your WordNetQueryFactoryImpl would be something like this:
    public class WordNetQueryFactoryImpl implements QueryFactory {
        private final QueryFactory m_delegate;
        private final String m_pseudoField;

        public WordNetQueryFactoryImpl(String pseudoField, QueryFactory delegate) {
            m_pseudoField = pseudoField;
            m_delegate = delegate;
        }

        // ... a few other interface methods here but cut for clarity ...

        public Query getFieldQuery(
            String field,
            Analyzer analyzer,
            String queryText,
            int slop)
        {
            // If the field is our pseudo-field and the value for it is 'big'
            if (m_pseudoField.equals(field) && "big".equals(queryText)) {
                // Use the delegate to build our query?
                BooleanQuery tmp = new BooleanQuery();
                tmp.add(
                    m_delegate.getFieldQuery(field, analyzer, "big", slop),
                    BooleanClause.Occur.SHOULD);
                tmp.add(
                    m_delegate.getFieldQuery(field, analyzer, "large", slop),
                    BooleanClause.Occur.SHOULD);
                tmp.add(
                    m_delegate.getFieldQuery(field, analyzer, "huge", slop),
                    BooleanClause.Occur.SHOULD);
                return tmp;
            } else {
                // ... otherwise delegate the handling to the other QueryFactory
                return m_delegate.getFieldQuery(field, analyzer, queryText, slop);
            }
        }
    }
Just as an example, I'm sure you'll work out what I mean!
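
The hard-coded "big" case above could be generalised with a synonym lookup. The following is only a self-contained sketch under stated assumptions, not the WordNet package or the patch: queries are modelled as plain strings instead of Lucene Query objects, the QueryFactory interface is cut down to one hypothetical method, SynonymQueryFactory and its targetField parameter (the real field the pseudo-field is rewritten to) are invented for illustration, and the 0.9 boost is taken from the "big large^.9 huge^.9" example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, cut-down stand-in for the proposed interface. */
interface QueryFactory {
    String getFieldQuery(String field, String queryText);
}

/** Default behaviour: a simple field:term query. */
class QueryFactoryImpl implements QueryFactory {
    public String getFieldQuery(String field, String queryText) {
        return field + ":" + queryText;
    }
}

/** Decorator: expands terms on the pseudo-field into the term plus boosted synonyms. */
class SynonymQueryFactory implements QueryFactory {
    private final String pseudoField;
    private final String targetField; // assumption: the real field to search instead
    private final Map<String, List<String>> synonyms;
    private final QueryFactory delegate;

    SynonymQueryFactory(String pseudoField, String targetField,
                        Map<String, List<String>> synonyms, QueryFactory delegate) {
        this.pseudoField = pseudoField;
        this.targetField = targetField;
        this.synonyms = synonyms;
        this.delegate = delegate;
    }

    public String getFieldQuery(String field, String queryText) {
        if (!pseudoField.equals(field)) {
            // Not our pseudo-field: leave the handling to the wrapped factory.
            return delegate.getFieldQuery(field, queryText);
        }
        StringBuilder expanded = new StringBuilder("(");
        expanded.append(delegate.getFieldQuery(targetField, queryText));
        for (String synonym : synonyms.getOrDefault(queryText, Collections.emptyList())) {
            expanded.append(' ')
                    .append(delegate.getFieldQuery(targetField, synonym))
                    .append("^0.9");
        }
        return expanded.append(')').toString();
    }
}

class SynonymDemo {
    public static void main(String[] args) {
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("big", Arrays.asList("large", "huge"));
        QueryFactory factory = new SynonymQueryFactory(
            "synonym", "description", synonyms, new QueryFactoryImpl());

        // prints (description:big description:large^0.9 description:huge^0.9)
        System.out.println(factory.getFieldQuery("synonym", "big"));
        // prints title:dog
        System.out.println(factory.getFieldQuery("title", "dog"));
    }
}
```
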
> If so cool, I've wanted an extensible query parser for a while..
I don't think it quite fits an "extensible query parser", but I think it might
get you somewhere.
Matt
Re: Refactoring suggestion for query parsing and creation
Posted by David Spencer <da...@tropo.com>.
This is interesting and may match what I've wanted - a way to register
customized query expansion modules w/ the query parser so that you can
experiment with different ways of expanding the query e.g. I wrote a
"WordNet" package that's in the sandbox that expands terms by adding
their synonyms e.g. "big" might expand to "big large^.9 huge^.9".
So is it safe to assume I could use your code like this (with 2 lines
changed from your example):
    QueryParser parser =
        new QueryParser(
            "synonym", // pseudo field?
            new StandardAnalyzer(),
            new SpecialQueryFactory(new WordNetQueryFactoryImpl())); // wordnet factory?
Then a query can use the pseudo-field "synonym" any place:
"+synonym:big dog"
expands to:
"+(big large^.9 huge^.9) dog"
If so cool, I've wanted an extensible query parser for a while..
thx,
Dave
PS
When I wrote up my ideas on this a while ago I thought the pseudo-field
should look different from the normal fields, e.g. it should have 2
colons, not 1, but it's not a huge issue.