Posted to java-user@lucene.apache.org by Charlie <ch...@gmail.com> on 2006/07/24 04:31:10 UTC

Span Query NLE

Would anyone give me a hint regarding the natural language expression
of the following span query?

------------if creating queries programmatically (it is in the Lucene src)

    SpanTermQuery t1 = new SpanTermQuery(new Term("field","six"));
    SpanTermQuery t2 = new SpanTermQuery(new Term("field","hundred"));
    SpanNearQuery tt1 = new SpanNearQuery(new SpanQuery[] {t1, t2}, 0,true);

    SpanTermQuery t3 = new SpanTermQuery(new Term("field","seven"));
    SpanTermQuery t4 = new SpanTermQuery(new Term("field","hundred"));
    SpanNearQuery tt2 = new SpanNearQuery(new SpanQuery[] {t3, t4}, 0,true);
    
    SpanTermQuery t5 = new SpanTermQuery(new Term("field","seven"));
    SpanTermQuery t6 = new SpanTermQuery(new Term("field","six"));

    SpanOrQuery to1 = new SpanOrQuery(new SpanQuery[] {tt1, tt2});
    SpanOrQuery to2 = new SpanOrQuery(new SpanQuery[] {t5, t6});
    
    SpanNearQuery query = new SpanNearQuery(new SpanQuery[] {to1, to2},
                                            100, true);
------------and it becomes:

spanNear([spanOr([spanNear([field:six, field:hundred], 0, true), spanNear([field:seven, field:hundred], 0, true)]), spanOr([field:seven, field:six])], 100, true)


------------what's its equivalence in natural language?

(something we can write in one line and can be parsed by QueryParser)
(if we have default field already)

e.g.

(("six hundred"~0 "seven hundred"~0) AND (seven six))~100

-----------

Thanks,

Charlie



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Span Query NLE

Posted by karl wettin <ka...@gmail.com>.
On Mon, 2006-07-24 at 00:04 -0700, Chris Hostetter wrote:

> > not supported by the QueryParser. 

> I think one of us is misunderstanding the question ... in my mind the
> "natural language expression" for this query...
> 
>    spanNear([spanOr([spanNear([field:six,

> ...is...
> 
>   Either "six" followed by

I think your eyes missed this part from the original post:

> > > and can be parsed by QueryParser

:-)




Re: Span Query NLE

Posted by Paul Elschot <pa...@xs4all.nl>.
On Tuesday 25 July 2006 03:26, Charlie wrote:
...
> 
> can "surround" be nested
> 
>     3w(4n(a?a AND bb?) AND cc+)

Yes, but iirc the "arguments" need to be separated by commas
instead of by AND:
3w( 4n( ... , ...) , ...)
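
To make the comma-separated nesting concrete, here is a minimal Java sketch that only assembles such a string. The class SurroundNotation and its prefix() helper are hypothetical names invented for this illustration and are not part of Lucene; whether the surround parser accepts the resulting string exactly as written is an assumption based on the recollection above, not something verified here.

```java
public class SurroundNotation {

    // Hypothetical helper (not a Lucene API): renders operator(arg1, arg2, ...)
    // with the arguments separated by commas rather than by AND.
    public static String prefix(String operator, String... args) {
        return operator + "(" + String.join(", ", args) + ")";
    }

    public static void main(String[] args) {
        // Inner distance expression from the question, with commas.
        String inner = prefix("4n", "a?a", "bb?");
        // Nest the inner expression as the first argument of the outer one.
        String nested = prefix("3w", inner, "cc+");
        System.out.println(nested);
    }
}
```

The string is built inside-out, so each nested expression is simply one more argument of its enclosing operator.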

Regards,
Paul Elschot



Re: Re[2]: Span Query NLE

Posted by karl wettin <ka...@gmail.com>.
On Mon, 2006-07-24 at 13:44 -0400, Erik Hatcher wrote:
> It does take some time for someone unfamiliar with JavaCC, such as
> myself initially, to implement a custom parser but it can be a huge
> success for a project to have this capability. 

5 cents:

In case anyone is considering writing a new query parser that could be
contributed to Apache, I would recommend using ANTLR instead, as the
grammar can be compiled to most languages that have a Lucene port.





Re[4]: Span Query NLE

Posted by Charlie <ch...@gmail.com>.
Thanks Erik,

The "surround" query parser is certainly interesting to me.

I really wish surround.txt explained things in more detail and gave more
examples. Its test cases in particular would be much more instructive if
they followed the pattern of
org.apache.lucene.queryParser.TestQueryParser and showed
assertQueryEquals() calls.

Anyways, so what's the proper interpretation of the following:

         3w(a?a or bb?, cc+)

And can "surround" queries be nested, like this?

    3w(4n(a?a AND bb?) AND cc+)

And have you seen or written examples that actually use the "surround"
query parser (not the proprietary one you mentioned)? I am currently
looking at the package, but there is not much documentation to read.

-- 
Best regards,
 Charlie                            




Re: Re[2]: Span Query NLE

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
The "surround" query parser in Lucene's contrib area implements a
language for constructing SpanQuery objects. Check out surround.txt in
Subversion:

	<http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/surround/>

I have written a query parser for a client that allows construction
of very sophisticated queries, including the full spectrum of
SpanQuery types, but the language is legacy and not something I'd wish
upon the general public, and the code is proprietary anyway. It does
take some time for someone unfamiliar with JavaCC, as I was
initially, to implement a custom parser, but having this capability can
be a huge success for a project.

	Erik


On Jul 24, 2006, at 11:48 AM, Charlie wrote:

> Thanks to both of you, Karl and Chris.
>
> You both made my intention even clearer.
>
> So now the question is:
>
>    Is there a powerful QueryParser.jj that can process span query syntax?
>
> (The prerequisite is: have we ever defined a span query syntax?)
>
> I would be boasting if I claimed I could write one now. I took only one
> compiler class and never felt good about it :)




Re[2]: Span Query NLE

Posted by Charlie <ch...@gmail.com>.
Thanks to both of you, Karl and Chris.

You both made my intention even clearer.

So now the question is:

   Is there a powerful QueryParser.jj that can process span query syntax?

(The prerequisite is: have we ever defined a span query syntax?)

I would be boasting if I claimed I could write one now. I took only one
compiler class and never felt good about it :)

-- 
Best regards,
 Charlie                            







Re: Span Query NLE

Posted by Chris Hostetter <ho...@fucit.org>.
: > Would anyone give me a hint regarding the natural language expression
: > of the following span query?

: I'm sorry, but not all queries are supported by the QueryParser, spans
: being one of those that are not. See QueryParser.jj to add your syntax.

I think one of us is misunderstanding the question ... in my mind the
"natural language expression" for this query...

   spanNear([spanOr([spanNear([field:six, field:hundred], 0, true),
                     spanNear([field:seven, field:hundred], 0, true)
                    ]),
            spanOr([field:seven, field:six])],
            100, true)

...is...

  Either "six" followed by "hundred" with no gap between them, or "seven"
  followed by "hundred" with no gap between them; followed by either
  "seven" or "six" with a gap of no more than 100 tokens in
  between them.

It's a fairly contrived test case from TestSpansAdvanced if I'm not
mistaken, constructed purely to test some complex combinations.

An example that might make a little more sense is something like...

   spanNear([spanOr([spanNear([field:Erik, field:Hatcher], 0, true),
                     spanNear([field:Otis, field:Gospodnetic], 0, true)
                    ]),
            spanOr([field:Apache, field:Lucene])],
            100, false)

...which I would translate as...

  Either "Erik" followed by "Hatcher" with no gap between them, or "Otis"
  followed by "Gospodnetic" with no gap between them; near either
  "Apache" or "Lucene" with a gap of no more than 100 tokens in
  between them.
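
The translation rule used in both readings above (ordered near becomes "followed by", unordered near becomes "near", slop 0 becomes "no gap", or becomes "either ... or ...") can be sketched mechanically. The following is a rough, self-contained Java illustration only: Node, Term, Or and Near are hypothetical stand-ins defined here, not Lucene's SpanQuery classes, and the generated wording is just one possible phrasing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SpanDescriber {

    // Hypothetical mini-model of a span query tree; NOT the Lucene classes.
    public interface Node {
        String describe();
    }

    public static final class Term implements Node {
        private final String text;
        public Term(String text) { this.text = text; }
        public String describe() { return "\"" + text + "\""; }
    }

    public static final class Or implements Node {
        private final List<Node> clauses;
        public Or(Node... clauses) { this.clauses = Arrays.asList(clauses); }
        public String describe() {
            return "either " + clauses.stream()
                    .map(Node::describe)
                    .collect(Collectors.joining(" or "));
        }
    }

    public static final class Near implements Node {
        private final List<Node> clauses;
        private final int slop;
        private final boolean inOrder;
        public Near(int slop, boolean inOrder, Node... clauses) {
            this.slop = slop;
            this.inOrder = inOrder;
            this.clauses = Arrays.asList(clauses);
        }
        public String describe() {
            // Ordered clauses read as a sequence, unordered ones as proximity.
            String joiner = inOrder ? " followed by " : " near ";
            String gap = slop == 0
                    ? "with no gap between them"
                    : "with a gap of no more than " + slop + " tokens between them";
            return clauses.stream()
                    .map(Node::describe)
                    .collect(Collectors.joining(joiner)) + " " + gap;
        }
    }

    public static void main(String[] args) {
        // Mirrors the shape of the six/seven hundred query from the thread.
        Node sixHundred = new Near(0, true, new Term("six"), new Term("hundred"));
        Node sevenHundred = new Near(0, true, new Term("seven"), new Term("hundred"));
        Node query = new Near(100, true,
                new Or(sixHundred, sevenHundred),
                new Or(new Term("seven"), new Term("six")));
        System.out.println(query.describe());
    }
}
```

The output for nested queries is unpunctuated and therefore more ambiguous than the hand-written versions; a fuller version would need to bracket sub-expressions.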


-Hoss




Re: Span Query NLE

Posted by karl wettin <ka...@gmail.com>.
On Sun, 2006-07-23 at 21:31 -0500, Charlie wrote:
> Would anyone give me a hint regarding the natural language expression
> of the following span query?

> spanNear([spanOr([spanNear([field:six, field:hundred], 0, true),
> spanNear([field:seven, field:hundred], 0, true)]),
> spanOr([field:seven, field:six])], 100, true)

> ------------what's its equivalence in natural language?

I'm sorry, but not all queries are supported by the QueryParser, spans
being one of those that are not. See QueryParser.jj to add your syntax.

