Posted to dev@lucene.apache.org by "Bogdan Marinescu (JIRA)" <ji...@apache.org> on 2014/11/06 08:53:33 UTC

[jira] [Comment Edited] (SOLR-6700) ChildDocTransformer doesn't return correct children after updating and optimising Solr index

    [ https://issues.apache.org/jira/browse/SOLR-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199964#comment-14199964 ] 

Bogdan Marinescu edited comment on SOLR-6700 at 11/6/14 7:52 AM:
-----------------------------------------------------------------

That's exactly what I've been doing. 
{code:title=update_parent.xml|borderStyle=solid}
<add>
  <doc>
    <field name="id">1</field>
    <field name="pName" update="set">INIT</field>
    <field name="entityType">1</field>
    <doc>
      <field name="id">11</field>
      <field name="cAlbum">Test Album 1</field>
      <field name="cSong">Test Song 1</field>
      <field name="entityType">2</field>
    </doc>
  </doc>
</add>
{code}

This yields the same result. If I instead send the same document without the update="set" attribute, it replaces the parent and everything works. Best of all, no optimisation is required, probably because the index is not fragmented.
{code}
<add>
  <doc>
    <field name="id">1</field>
    <field name="pName">INIT</field>
    <field name="entityType">1</field>
    <doc>
      <field name="id">11</field>
      <field name="cAlbum">Test Album 1</field>
      <field name="cSong">Test Song 1</field>
      <field name="entityType">2</field>
    </doc>
  </doc>
</add>
{code}
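
As a rough sketch of the full-replace approach described above, the message can be generated from the parent's fields and its complete list of children, so that the whole block is always re-sent together. The helper name build_block_replace is hypothetical, and the snippet only builds the XML; it does not talk to Solr.

```python
import xml.etree.ElementTree as ET


def build_block_replace(parent_fields, children):
    """Build an <add> message that re-sends a parent with all of its children.

    Hypothetical helper for illustration only; not part of Solr or any client.
    """
    add = ET.Element("add")
    parent = ET.SubElement(add, "doc")
    for name, value in parent_fields.items():
        # Plain fields only: no update="set" attribute is emitted,
        # so the message is a full replacement rather than an atomic update.
        ET.SubElement(parent, "field", name=name).text = str(value)
    for child_fields in children:
        # Each child is nested inside the parent <doc>, as in the example above.
        child = ET.SubElement(parent, "doc")
        for name, value in child_fields.items():
            ET.SubElement(child, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="unicode")


xml_message = build_block_replace(
    {"id": "1", "pName": "INIT", "entityType": 1},
    [{"id": "11", "cAlbum": "Test Album 1", "cSong": "Test Song 1", "entityType": 2}],
)
print(xml_message)
```

The drawback noted in this comment still applies: the caller has to know every child of the parent in order to rebuild the block.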

The question remains, though: why doesn't it work with the update="set" flag? The problem with replacing the document is that, instead of setting only the fields to be updated, you have to submit the whole document and all of its children.
As I've said before, the data gets "scrambled" only after the *optimise* is performed.
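
One way to detect the "scrambled" state, sketched here under the assumption that every child's _root_ value should equal the id of the parent it is returned under (true for the single-level nesting in this issue), is to scan the "docs" list of the JSON response. The function name find_misplaced_children is hypothetical, and the sample data is a trimmed copy of the post-optimise response from the issue description.

```python
def find_misplaced_children(docs):
    """Return (parent_id, child_id, child_root) for each child whose _root_
    does not match the id of the parent it was returned under.

    Hypothetical helper; assumes one level of nesting, as in this issue.
    """
    misplaced = []
    for parent in docs:
        for child in parent.get("_childDocuments_", []):
            if child.get("_root_") != parent.get("id"):
                misplaced.append((parent.get("id"), child.get("id"), child.get("_root_")))
    return misplaced


# Trimmed copy of the response returned after the optimise step.
docs_after_optimise = [
    {"id": "2", "_root_": "2", "_childDocuments_": [
        {"id": "11", "_root_": "1"},
        {"id": "22", "_root_": "2"},
    ]},
    {"id": "1", "_root_": "1"},
]

print(find_misplaced_children(docs_after_optimise))  # [('2', '11', '1')]
```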

Am I to understand that this bug won't be fixed any time soon?


> ChildDocTransformer doesn't return correct children after updating and optimising Solr index
> --------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6700
>                 URL: https://issues.apache.org/jira/browse/SOLR-6700
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Bogdan Marinescu
>            Priority: Blocker
>             Fix For: 4.10.3, 5.0
>
>
> I have an index with nested documents. 
> {code:title=schema.xml snippet|borderStyle=solid}
>  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
> <field name="entityType" type="int" indexed="true" stored="true" required="true"/>
> <field name="pName" type="string" indexed="true" stored="true"/>
> <field name="cAlbum" type="string" indexed="true" stored="true"/>
> <field name="cSong" type="string" indexed="true" stored="true"/>
> <field name="_root_" type="string" indexed="true" stored="true"/>
> <field name="_version_" type="long" indexed="true" stored="true"/>
> {code}
> Afterwards I add the following documents:
> {code}
> <add>
>   <doc>
>     <field name="id">1</field>
>     <field name="pName">Test Artist 1</field>
>     <field name="entityType">1</field>
>     <doc>
>         <field name="id">11</field>
>         <field name="cAlbum">Test Album 1</field>
>         <field name="cSong">Test Song 1</field>
>         <field name="entityType">2</field>
>     </doc>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="pName">Test Artist 2</field>
>     <field name="entityType">1</field>
>     <doc>
>         <field name="id">22</field>
>         <field name="cAlbum">Test Album 2</field>
>         <field name="cSong">Test Song 2</field>
>         <field name="entityType">2</field>
>     </doc>
>   </doc>
> </add>
> {code}
> After performing the following query 
> {quote}
> http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3DentityType%3A1%7D&fl=*%2Cscore%2C%5Bchild+parentFilter%3DentityType%3A1%5D&wt=json&indent=true
> {quote}
> I get the correct result (each child matches its parent; compare the _root_ fields):
> {code:title=add docs|borderStyle=solid}
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":1,
>     "params":{
>       "fl":"*,score,[child parentFilter=entityType:1]",
>       "indent":"true",
>       "q":"{!parent which=entityType:1}",
>       "wt":"json"}},
>   "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"1",
>         "pName":"Test Artist 1",
>         "entityType":1,
>         "_version_":1483832661048819712,
>         "_root_":"1",
>         "score":1.0,
>         "_childDocuments_":[
>         {
>           "id":"11",
>           "cAlbum":"Test Album 1",
>           "cSong":"Test Song 1",
>           "entityType":2,
>           "_root_":"1"}]},
>       {
>         "id":"2",
>         "pName":"Test Artist 2",
>         "entityType":1,
>         "_version_":1483832661050916864,
>         "_root_":"2",
>         "score":1.0,
>         "_childDocuments_":[
>         {
>           "id":"22",
>           "cAlbum":"Test Album 2",
>           "cSong":"Test Song 2",
>           "entityType":2,
>           "_root_":"2"}]}]
>   }}
> {code}
> Afterwards I try to update one document:
> {code:title=update doc|borderStyle=solid}
> <add>
> <doc>
> <field name="id">1</field>
> <field name="pName" update="set">INIT</field>
> </doc>
> </add>
> {code}
> After re-running the previous query, I get the right result (the same as before, but with the pName field updated).
> The problem appears only after performing an *optimize*.
> Now the same query yields the following result:
> {code}
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":1,
>     "params":{
>       "fl":"*,score,[child parentFilter=entityType:1]",
>       "indent":"true",
>       "q":"{!parent which=entityType:1}",
>       "wt":"json"}},
>   "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"2",
>         "pName":"Test Artist 2",
>         "entityType":1,
>         "_version_":1483832661050916864,
>         "_root_":"2",
>         "score":1.0,
>         "_childDocuments_":[
>         {
>           "id":"11",
>           "cAlbum":"Test Album 1",
>           "cSong":"Test Song 1",
>           "entityType":2,
>           "_root_":"1"},
>         {
>           "id":"22",
>           "cAlbum":"Test Album 2",
>           "cSong":"Test Song 2",
>           "entityType":2,
>           "_root_":"2"}]},
>       {
>         "id":"1",
>         "pName":"INIT",
>         "entityType":1,
>         "_root_":"1",
>         "_version_":1483832916867809280,
>         "score":1.0}]
>   }}
> {code}
> As can be seen, the document with id:2 now contains the child with id:11, which belongs to the document with id:1, while the document with id:1 has no children at all.
> I haven't found any references on the web about this except http://blog.griddynamics.com/2013/09/solr-block-join-support.html
> Similar issue: SOLR-6096
> Is this problem known? Is there a workaround for this? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
