Posted to solr-user@lucene.apache.org by "Alisa Z." <pr...@mail.ru> on 2016/04/22 00:41:18 UTC

Re[2]: how to restrict phrase to appear in same child document

I'm afraid that if the queries come in such a loose natural-language form, the only way to handle them is to add a natural language processing stage that builds the right query (this is a working strategy in practice; IBM does so).

If your document structure is fixed (i.e., you know the types of nested documents and exactly which fields they contain), you can try to introduce some basic NLP that detects the entities or nouns, e.g., "driver" and "car" (try the AlchemyLanguage API, http://www.alchemyapi.com/products/demo/alchemylanguage, for this). You will also need a syntactic parser to connect black+driver and white+mercedes correctly.
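
As a very rough sketch of that idea, the Python below pairs each color word with the next content word and turns every pair into its own block-join clause, so both terms have to match inside one child document. It is only a toy stand-in for a real entity detector and parser. The color and stopword lists, the "type:top" parent filter, and all function names are my assumptions for illustration; the "driver", "car" and "color" fields come from the example structure in this thread. Terms are not escaped for Solr query syntax here.

```python
# Toy stand-in for a real NLP stage: attach each known color word to the
# next content word, then build one block-join clause per (color, noun) pair.
COLORS = {"black", "white", "red", "blue", "green", "silver"}  # assumed vocabulary
STOPWORDS = {"which", "has", "a", "an", "the", "with", "and"}  # assumed stopword list
NOUN_FIELDS = ("driver", "car")  # child fields from the thread's example
PARENT_FILTER = "type:top"       # assumed marker that identifies parent documents


def color_noun_pairs(text):
    """Return (color, noun) pairs, attaching each color to the next content word."""
    words = (w.strip("?.,!") for w in text.lower().split())
    tokens = [w for w in words if w and w not in STOPWORDS]
    pairs = []
    for current, following in zip(tokens, tokens[1:]):
        if current in COLORS and following not in COLORS:
            pairs.append((current, following))
    return pairs


def block_join_clause(color, noun):
    """Build a {!parent} query whose terms must all match in one child document."""
    # The noun may live in either child field, so it is tried in both.
    noun_part = " ".join(f"{field}:{noun}" for field in NOUN_FIELDS)
    return f'{{!parent which="{PARENT_FILTER}"}}+color:{color} +({noun_part})'


def build_params(text):
    """Put the first clause in q and the rest in fq, so every pair is required."""
    clauses = [block_join_clause(c, n) for c, n in color_noun_pairs(text)]
    if not clauses:
        return {"q": text}  # nothing recognized: fall back to the raw text
    return {"q": clauses[0], "fq": clauses[1:]}


if __name__ == "__main__":
    print(build_params("which black driver has a white mercedes"))
```

Because each pair gets its own {!parent} clause, "black" and "driver" are checked against a single child document, and "white" and "mercedes" against another. A real syntactic parser would replace color_noun_pairs for anything less regular than "color followed by noun".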



>Wednesday, 20 April 2016, 15:31 -04:00 from Yangrui Guo <gu...@gmail.com>:
>
>Hi, thanks for answering. My problem is that users do not specify which
>field each color belongs to in the query. For example, in "which black driver
>has a white mercedes", it is difficult to tell which color belongs
>to which field, because there can be thousands of car brands and
>professions. Is there any way to achieve the feature I stated
>before?
>
>On Wednesday, April 20, 2016, Alisa Z. < proloxx@mail.ru > wrote:
>
>>  Yangrui,
>>
>> First, have you indexed your documents with proper nested document
>> structure [
>>  https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-NestedChildDocuments ]?
>> From the piece of data you showed, it seems that you just put it in as
>> is and it all got flattened.
>>
>> Then, you'll probably want to introduce distinguishing
>> "type"/"category"/"path" fields into your data, so it would look like this:
>>
>> {
>>   type:top
>>   id:
>>   {
>>     type:car_color
>>     car:
>>     color:
>>   }
>>   {
>>     type:driver_color
>>     driver:
>>     color:
>>   }
>> }
>>
>>
>> >Wed, 20 Apr 2016 -3:28:33 -0400 from Yangrui Guo < guoyangrui@gmail.com >:
>> >
>> >hello
>> >
>> >I have a nested document type in my index. Here's the structure of my
>> >document:
>> >
>> >{
>> >id:
>> >{
>> >    car:
>> >    color:
>> >}
>> >{
>> >    driver:
>> >    color:
>> >}
>> >}
>> >
>> >However, when I use the query q={!parent
>> >which="content_type:parent"}+(black AND driver)&fq={!parent
>> >which="content_type:parent"}+(white AND mercedes), the result also
>> >contained white driver with black mercedes. I know I can put fields before
>> >terms but it is not always easy to do this. Users might just enter one
>> >string. How can I modify my query to require that the terms between two
>> >parentheses must appear in the same child document, or boost those that
>> >meet the criteria? Thanks
>>
>>