Posted to solr-user@lucene.apache.org by Yangrui Guo <gu...@gmail.com> on 2016/04/20 03:28:33 UTC

how to restrict phrase to appear in same child document

hello

I have a nested document type in my index. Here's the structure of my
document:

{
id:
{
    car:
    color:
}
{
    driver:
    color:
}
}

However, when I use the query

q={!parent which="content_type:parent"}+(black AND driver)&fq={!parent which="content_type:parent"}+(white AND mercedes)

the results also contain a white driver with a black mercedes. I know I
can put field names before the terms, but that is not always easy to do,
because users might just enter a single string. How can I modify my query
to require that the terms within each pair of parentheses appear in the
same child document, or to boost the documents that meet this criterion?
Thanks
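
The requirement can be stated precisely with a small in-memory model. The sketch below is not Solr code and says nothing about how Solr evaluates the query; it assumes each child document is reduced to a set of lowercase terms, and the function names are hypothetical. A parent matches only if every group of terms is fully contained in one single child.

```python
def child_terms(child):
    """Collect all field values of one child document as lowercase terms."""
    return {str(value).lower() for value in child.values()}


def parent_matches(children, term_groups):
    """True if every term group is fully contained in at least one single child."""
    term_sets = [child_terms(child) for child in children]
    return all(
        any(set(group) <= terms for terms in term_sets)
        for group in term_groups
    )


# The two parenthesized groups from the query above.
groups = [("black", "driver"), ("white", "mercedes")]

# Black driver with a white mercedes: the result that is wanted.
wanted = [
    {"car": "mercedes", "color": "white"},
    {"driver": "driver", "color": "black"},
]

# White driver with a black mercedes: the result that should be excluded.
unwanted = [
    {"car": "mercedes", "color": "black"},
    {"driver": "driver", "color": "white"},
]

print(parent_matches(wanted, groups))    # True
print(parent_matches(unwanted, groups))  # False
```

The second parent contains all four terms, but no single child holds both terms of either group, so it is rejected.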

Re[2]: how to restrict phrase to appear in same child document

Posted by "Alisa Z." <pr...@mail.ru>.
I'm afraid that if the queries are given in such a loose natural-language form, the only way to handle them is to introduce a natural language processing stage that forms the right query (which is actually a working strategy; IBM does so).

If your document structure is fixed (i.e., you know the types of nested documents and exactly which fields they contain), you can try to introduce some basic NLP that detects the entities or nouns, e.g., "driver" and "car" (try the AlchemyLanguage API, http://www.alchemyapi.com/products/demo/alchemylanguage, for this). You will also need a syntactic parser to connect black+driver and white+mercedes correctly.
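
As a rough illustration of the pairing step only, here is a naive sketch that attaches each color word to the token right after it and emits one block-join clause per pair. It is a toy heuristic, not a syntactic parser and not the AlchemyLanguage API; the color list and function names are hypothetical, and the clause syntax is copied from the original question.

```python
COLORS = {"black", "white", "red", "blue"}  # hypothetical, illustrative list


def pair_colors(text):
    """Pair each color word with the token that immediately follows it."""
    tokens = text.lower().replace("?", " ").split()
    pairs = []
    for current, following in zip(tokens, tokens[1:]):
        if current in COLORS and following not in COLORS:
            pairs.append((current, following))
    return pairs


def to_block_join_clauses(pairs, parent_filter="content_type:parent"):
    """Turn each (color, noun) pair into one block-join clause."""
    return [
        '{!parent which="%s"}+(%s AND %s)' % (parent_filter, color, noun)
        for color, noun in pairs
    ]


pairs = pair_colors("which black driver has a white mercedes")
clauses = to_block_join_clauses(pairs)
for clause in clauses:
    print(clause)
```

This only works when the color directly precedes the word it describes; anything like "a mercedes that is white" would need the real parsing stage described above.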



Re: how to restrict phrase to appear in same child document

Posted by Yangrui Guo <gu...@gmail.com>.
Hi, thanks for answering. My problem is that users do not indicate which
field a color belongs to in the query. For example, in "which black driver
has a white mercedes", it is difficult to tell which color belongs to
which field, because there can be thousands of car brands and
professions. Is there any way to achieve the feature I described before?

Re: how to restrict phrase to appear in same child document

Posted by "Alisa Z." <pr...@mail.ru>.
Yangrui,

First, have you indexed your documents with a proper nested document structure [https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-NestedChildDocuments]? From the piece of data you showed, it seems that you indexed it just as it is and it all got flattened.

Then, you'll probably want to introduce a distinguishing "type"/"category"/"path" field into your data, so it would look like this:

{
type:top
id:
{
    type:car_color
    car:
    color:
}
{
    type:driver_color
    driver:
    color:
}
}
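
A minimal sketch of what such a document could look like when built for Solr's JSON update format, which nests children under the `_childDocuments_` key. The `content_type` field comes from the query in the original question; the `id` values, the sample field values, and the helper name are assumptions for illustration only.

```python
import json


def build_parent(doc_id, car, car_color, driver, driver_color):
    """Build one parent document with two typed child documents."""
    return {
        "id": doc_id,
        "content_type": "parent",  # matches which="content_type:parent"
        "type": "top",
        "_childDocuments_": [
            {
                "id": f"{doc_id}-car",  # hypothetical id scheme
                "type": "car_color",
                "car": car,
                "color": car_color,
            },
            {
                "id": f"{doc_id}-driver",  # hypothetical id scheme
                "type": "driver_color",
                "driver": driver,
                "color": driver_color,
            },
        ],
    }


doc = build_parent("1", "mercedes", "white", "driver", "black")
payload = json.dumps([doc], indent=2)
print(payload)
```

The printed payload is what would be sent to the update handler; posting it, and defining these fields in the schema, is left out here.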

