Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/05/23 20:28:48 UTC

[jira] [Created] (LUCENE-3133) Fix QueryParser to handle nested fields

Fix QueryParser to handle nested fields
---------------------------------------

                 Key: LUCENE-3133
                 URL: https://issues.apache.org/jira/browse/LUCENE-3133
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Michael McCandless
             Fix For: 3.2, 4.0


Once we commit LUCENE-2454, we need to make it easy for apps to enable this with QueryParser.

This seems like "schema"-like behavior, i.e. we need to be able to express the join structure of the related fields.

Then, whenever QueryParser produces a query that spans fields requiring a join, would NestedDocumentQuery be used to wrap the clauses on child fields?
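
To make the "schema" idea concrete, here is a minimal, self-contained sketch. It assumes the schema is just a set of field names known to live on child docs, and it only prints a plan showing which clauses would be wrapped. The class NestedFieldSchema, the method plan and the NESTED(...) notation are all hypothetical and are not Lucene API; a real implementation would build actual Query objects instead of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Lucene.
class NestedFieldSchema {
    private final Set<String> childFields;

    NestedFieldSchema(Set<String> childFields) {
        this.childFields = childFields;
    }

    boolean isChildField(String field) {
        return childFields.contains(field);
    }

    // Takes "field:term" clauses (implicitly ANDed) and returns a string
    // showing which clauses a parser could wrap in a join query.
    String plan(List<String> clauses) {
        List<String> parent = new ArrayList<>();
        List<String> child = new ArrayList<>();
        for (String clause : clauses) {
            int colon = clause.indexOf(':');
            // Clauses without a field prefix are treated as parent clauses.
            String field = colon < 0 ? "" : clause.substring(0, colon);
            if (isChildField(field)) {
                child.add(clause);
            } else {
                parent.add(clause);
            }
        }
        List<String> parts = new ArrayList<>(parent);
        if (!child.isEmpty()) {
            // Stand-in for wrapping the child clauses in a join query.
            parts.add("NESTED(" + String.join(" AND ", child) + ")");
        }
        return String.join(" AND ", parts);
    }

    public static void main(String[] args) {
        NestedFieldSchema schema = new NestedFieldSchema(
                new HashSet<>(Arrays.asList("employer", "date")));
        System.out.println(schema.plan(
                Arrays.asList("forename:john", "employer:google", "date:2009")));
        // prints: forename:john AND NESTED(employer:google AND date:2009)
    }
}
```

Note that a static field-to-level mapping like this cannot say which child clauses belong to the same child doc; it simply groups all of them together.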

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3133:
---------------------------------------

    Fix Version/s:     (was: 3.4)
                   3.5

> Fix QueryParser to handle nested fields
> ---------------------------------------
>
>                 Key: LUCENE-3133
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3133
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 3.5, 4.0
>
>
> Once we commit LUCENE-2454, we need to make it easy for apps to enable this with QueryParser.
> It seems like it's a "schema" like behavior, ie we need to be able to express the join structure of the related fields.
> And then whenever QP produces a query that spans fields requiring a join, the NestedDocumentQuery is used to wrap the child fields?



[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038270#comment-13038270 ] 

Mark Harwood commented on LUCENE-3133:
--------------------------------------

So the 2 reasons I can think of why the WITH construct may be needed are:
1) If the field names aren't exclusive to a doc type e.g. "name" or "age" is a field found on both parent and child docs
or
2) If you want to find a parent with two different children (e.g. a resume of someone who has held a position at Google in 2009 and a different position at LinkedIn during 2010).

In both cases the WITH clause is needed to set the context around clauses and so avoid any ambiguity.
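
A small in-memory illustration of case 2 may help. It models a resume's positions as plain maps and compares matching with and without a per-child context. This is only a sketch of the ambiguity, not of how NestedDocumentQuery works; the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; these classes are not part of Lucene.
class WithContextDemo {

    // A child document (one position on a resume) as field -> value.
    static Map<String, String> position(String employer, String date) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("employer", employer);
        doc.put("date", date);
        return doc;
    }

    // No context: each condition may be satisfied by a different child.
    static boolean matchesAcrossChildren(List<Map<String, String>> children,
                                         Map<String, String> conditions) {
        for (Map.Entry<String, String> c : conditions.entrySet()) {
            boolean found = false;
            for (Map<String, String> child : children) {
                if (c.getValue().equals(child.get(c.getKey()))) {
                    found = true;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    // WITH-style context: a single child must satisfy every condition.
    static boolean matchesWithinOneChild(List<Map<String, String>> children,
                                         Map<String, String> conditions) {
        for (Map<String, String> child : children) {
            boolean all = true;
            for (Map.Entry<String, String> c : conditions.entrySet()) {
                if (!c.getValue().equals(child.get(c.getKey()))) {
                    all = false;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Google in 2009, LinkedIn in 2010.
        List<Map<String, String>> resume = Arrays.asList(
                position("google", "2009"), position("linkedin", "2010"));
        // Nobody on this resume worked at Google in 2010.
        Map<String, String> wrongPairing = position("google", "2010");
        System.out.println(matchesAcrossChildren(resume, wrongPairing)); // true (false positive)
        System.out.println(matchesWithinOneChild(resume, wrongPairing)); // false
    }
}
```

Finding the resume from the example then amounts to two scoped checks ANDed together: one child matching google/2009 and another matching linkedin/2010.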



[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Mike Sokolov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038328#comment-13038328 ] 

Mike Sokolov commented on LUCENE-3133:
--------------------------------------

Mightn't you want to be able to do self-joins?  For example, if you want to represent an XML document and your field is "Element", it has any number of Attribute children and any number of Node children, which in turn may be Elements.  I wonder if LUCENE-2454 could be extended to allow a recursive ChildDocumentQuery, i.e. a DescendantDocumentQuery?



[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038235#comment-13038235 ] 

Michael McCandless commented on LUCENE-3133:
--------------------------------------------


I'm confused on why the query parser language would need to be
extended to handle this...

I.e., it seems like, for a given index, the assignment of fields to
parent vs child docs is a global/static decision?  And then any query
that has clauses against mixed parent/child fields should be "wrapped"
by NestedDocumentQuery so that the child field/doc matches are
"translated" to the corresponding parent docs?

Why should each query be free to change this?

E.g. if a user typed that same query, but without WITH, then nothing
would match, right?

I guess this means I'd vote for 2 :)




[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038292#comment-13038292 ] 

Michael McCandless commented on LUCENE-3133:
--------------------------------------------

Duh, never mind on case 2 -- a Boolean AND query would work for that example!



[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038148#comment-13038148 ] 

Mark Harwood commented on LUCENE-3133:
--------------------------------------

LUCENE-2454 already includes extensions for the XML parser syntax, and the standard QueryParser could work the same way with some added syntax. I think I've seen other languages use WITH as a keyword, e.g.
{code:borderStyle=solid}
  forename:john surname:smith WITH(employer:google AND date:2009)
{code}

In this example the WITH keyword is used to mark a clause that relates to a child document.
What is left unsaid here is how parent documents are distinguished from child docs in the index. I guess you could:
1) Extend the WITH syntax to make it part of the query expression, or
2) Make it part of the QueryParser constructor (i.e. tell the query parser what denotes parent docs), or
3) Have a fixed system for tagging parents enforced by Lucene's IndexWriter when calling the addDocuments API.

Option 3 seems too restrictive (it may be desirable, for example, to have multiple levels of hierarchy to roll up to in an index).

The majority of our users of this feature currently use a form-based query builder, which assembles the nested XML syntax behind the scenes, so there is no need for extensions to the standard QueryParser.  I can see that some power users would want this, though.
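
As a rough idea of what the parser-side work might look like, the sketch below splits a query string into its parent part and its WITH(...) groups using a regular expression. The WITH syntax is only a proposal in this thread, the class WithClauseSplitter is hypothetical, and the sketch does not handle nested parentheses inside a WITH group; a real change would go into the QueryParser grammar rather than a pre-processing step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing sketch; not part of Lucene's QueryParser.
class WithClauseSplitter {
    // Matches WITH( ... ) where the body contains no parentheses.
    private static final Pattern WITH =
            Pattern.compile("\\bWITH\\s*\\(([^()]*)\\)");

    final String parentPart;        // clauses outside any WITH group
    final List<String> childParts;  // body of each WITH group, in order

    private WithClauseSplitter(String parentPart, List<String> childParts) {
        this.parentPart = parentPart;
        this.childParts = childParts;
    }

    static WithClauseSplitter split(String query) {
        List<String> children = new ArrayList<>();
        Matcher m = WITH.matcher(query);
        StringBuffer rest = new StringBuffer();
        while (m.find()) {
            children.add(m.group(1).trim());
            // Drop the WITH group from the parent part.
            m.appendReplacement(rest, " ");
        }
        m.appendTail(rest);
        String parent = rest.toString().trim().replaceAll("\\s+", " ");
        return new WithClauseSplitter(parent, children);
    }

    public static void main(String[] args) {
        WithClauseSplitter s = split(
                "forename:john surname:smith WITH(employer:google AND date:2009)");
        System.out.println(s.parentPart);  // forename:john surname:smith
        System.out.println(s.childParts);  // [employer:google AND date:2009]
    }
}
```

Each entry in childParts would then be parsed on its own and wrapped in a join query, while parentPart is parsed as usual.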




[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038286#comment-13038286 ] 

Michael McCandless commented on LUCENE-3133:
--------------------------------------------

Oh I see.  Hmm, is case 1 going to cause problems?  (I.e. if both parent & child docs can come back matching a given field.)  Is there a "normal" use case where you would want to put the same field name on both parent & child docs?  (I had thought normally the field names would be orthogonal.)

Case 2, I agree, needs some special syntax.  In fact, even non-nested docs might want such a query?  E.g. if my docs are cars, and each car has a multi-valued field listing its features ("A/C", "Automatic transmission", ...), and I want to find all cars that have both A/C and Automatic transmission.  Boolean AND query won't work correctly for this; I'd need this same extension as your bullet 2, I think?
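
The follow-up comment in this thread (13038292) retracts the car example, noting that a Boolean AND does work there. A toy inverted index shows why: every value of a multi-valued field is posted against the same doc, so intersecting the two postings lists finds the car. This sketch assumes each feature is indexed as a single untokenized term; the class MultiValuedAndDemo is made up for illustration and is not Lucene code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for one multi-valued field; not Lucene code.
class MultiValuedAndDemo {
    // term -> ids of the docs whose field contains that term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Index one doc; every value of the field is posted under the same id.
    void add(int docId, String... features) {
        for (String feature : features) {
            postings.computeIfAbsent(feature, k -> new TreeSet<>()).add(docId);
        }
    }

    // Boolean AND of two term clauses = intersection of their postings.
    Set<Integer> and(String a, String b) {
        TreeSet<Integer> result =
                new TreeSet<>(postings.getOrDefault(a, new TreeSet<>()));
        result.retainAll(postings.getOrDefault(b, new TreeSet<>()));
        return result;
    }

    public static void main(String[] args) {
        MultiValuedAndDemo index = new MultiValuedAndDemo();
        index.add(0, "A/C", "Automatic transmission");
        index.add(1, "A/C");
        index.add(2, "Automatic transmission");
        System.out.println(index.and("A/C", "Automatic transmission")); // [0]
    }
}
```

The resume case differs because there the two conditions (employer and date) must hold on the same child, which a flat multi-valued field cannot express.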



[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038448#comment-13038448 ] 

Mark Harwood commented on LUCENE-3133:
--------------------------------------

bq.  I wonder if LUCENE-2454 could be extended to allow recursive ChildDocumentQuery 

No need to extend. This can be done today by nesting one NestedDocumentQuery inside another.
The only thing you need to do is set the "ParentsFilter" to roll up results to the appropriate point, e.g. parent/child/grandchild.
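
The roll-up idea can be sketched with plain bit sets. The example assumes, purely for illustration, that each block is indexed with descendants first and their ancestor last, so that rolling up means finding the next marked doc id; the layout actually used by the LUCENE-2454 patch may differ. RollUpDemo and its methods are hypothetical and stand in for a parents filter, not for NestedDocumentQuery itself.

```java
import java.util.BitSet;

// Illustrative only: models a "parents filter" as a BitSet of doc ids.
class RollUpDemo {

    // Returns the first marked doc at or after docId, or -1 if there is none.
    // Assumes descendants are indexed before the doc they roll up to.
    static int rollUp(BitSet parents, int docId) {
        return docId < 0 ? -1 : parents.nextSetBit(docId);
    }

    // Convenience helper to build a filter from doc ids.
    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        // Doc ids: 0,1 attrs; 2 element; 3 attr; 4 element; 5 document;
        //          6 attr; 7 element; 8 document.
        BitSet elements = bits(2, 4, 7);   // middle level
        BitSet documents = bits(5, 8);     // top level

        int element = rollUp(elements, 3);          // attr 3 -> element 4
        int document = rollUp(documents, element);  // element 4 -> document 5
        System.out.println(element + " " + document); // 4 5
    }
}
```

Nesting corresponds to composing the two calls: the inner one rolls grandchild matches up to the middle level and the outer one rolls those up to the top level. Choosing a different filter changes the level at which results are reported.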

bq. is there a "normal" use case where you would want to put same field name on both parent & child docs?

I wouldn't want to rule that possibility out e.g. a person has a name and age and their sons and daughters have names and ages too.


