Posted to solr-user@lucene.apache.org by Jeffery Yuan <yu...@gmail.com> on 2017/05/14 06:16:37 UTC

SolrJ - How to add a blocked document without child documents

Nested documents are quite useful for modeling hierarchical data.

Sometimes we only have a parent document that doesn't have child documents
yet. We want to add it first and then update it later by re-adding the whole
block: the parent document and all of its child documents.

But we found that the index then contains two parent documents with the
same id: one without child documents, and one that contains child
documents.

http://localhost:8983/solr/thecollection_shard1_replica2/select?q=id:*&fl=*,[docid]&distrib=false
<result name="response" numFound="3" start="0">
  <doc>
    <str name="docType">parent</str>
    <str name="id">9816c0f3-f3ae-4a7c-a5fe-89a2c481467a</str>
    <int name="[docid]">0</int>
  </doc>
  <doc>
    <str name="docType">child</str>
    <str name="id">e27d2709-2dc0-439d-b017-4d95212bf05f</str>
    <arr name="_root_">
      <str>9816c0f3-f3ae-4a7c-a5fe-89a2c481467a</str>
    </arr>
    <int name="[docid]">1</int>
  </doc>
  <doc>
    <str name="docType">parent</str>
    <str name="id">9816c0f3-f3ae-4a7c-a5fe-89a2c481467a</str>
    <arr name="_root_">
      <str>9816c0f3-f3ae-4a7c-a5fe-89a2c481467a</str>
    </arr>
    <int name="[docid]">2</int>
  </doc>
</result>

How can I avoid the duplicate parent documents?
How can I add a blocked document without child documents?

- I can work around this by deleting the old document before adding the new
ones, but performance would suffer.
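
For reference, the delete-first workaround can be sketched as two update
messages. This is only a sketch under assumptions: it builds Solr XML update
messages as plain strings with the JDK and sends nothing, and the class name,
method names and sample ids are made up for illustration. In the response
above the stand-alone parent has no _root_ field, so the delete query matches
on id as well as on _root_.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds Solr XML update messages as strings; nothing is sent.
final class DeleteThenAddSketch {

    // Escape characters that are special in XML text content.
    static String xmlEscape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap a value in double quotes for use as a phrase in a Solr query.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String field(String name, String value) {
        return "<field name=\"" + name + "\">" + xmlEscape(value) + "</field>";
    }

    // Matches the stand-alone parent (by id) and any existing block (by _root_).
    static String deleteMessage(String parentId) {
        String query = "id:" + quote(parentId) + " OR _root_:" + quote(parentId);
        return "<delete><query>" + xmlEscape(query) + "</query></delete>";
    }

    // Parent with its children nested as inner <doc> elements.
    static String addBlockMessage(String parentId, List<String> childIds) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        sb.append(field("id", parentId)).append(field("docType", "parent"));
        for (String childId : childIds) {
            sb.append("<doc>")
              .append(field("id", childId))
              .append(field("docType", "child"))
              .append("</doc>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical ids, for illustration only.
        System.out.println(deleteMessage("parent-1"));
        System.out.println(addBlockMessage("parent-1", Arrays.asList("child-1", "child-2")));
    }
}
```

The delete has to be sent before the add on every update, which is the extra
cost mentioned above.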

Thanks a lot for your help and response.




--
View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-How-to-add-a-blocked-document-without-child-documents-tp4335006.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrJ - How to add a blocked document without child documents

Posted by Jeffery Yuan <yu...@gmail.com>.
Mikhail Khludnev provided a workaround in
https://issues.apache.org/jira/browse/SOLR-6096:
"So far the workaround is to nest empty child w/o fields or with id only
field."
-- It works.
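
A rough sketch of what that looks like as an update message, in case it helps
someone else. The assumptions: the message is built as a plain string with the
JDK and nothing is sent, the class and method names are made up, and the
placeholder child id is hypothetical (any id that does not clash with a real
document should do). With SolrJ the same shape would presumably come from
calling SolrInputDocument.addChildDocument with a document that only has the id
field.

```java
// Sketch only: builds a Solr XML update message as a string; nothing is sent.
final class PlaceholderChildSketch {

    // Escape characters that are special in XML text content.
    static String xmlEscape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String field(String name, String value) {
        return "<field name=\"" + name + "\">" + xmlEscape(value) + "</field>";
    }

    // Parent document with one nested child that carries only an id, so the
    // parent is indexed as a block from the first add.
    static String parentWithPlaceholderChild(String parentId, String placeholderChildId) {
        return "<add><doc>"
                + field("id", parentId)
                + field("docType", "parent")
                + "<doc>" + field("id", placeholderChildId) + "</doc>"
                + "</doc></add>";
    }

    public static void main(String[] args) {
        // Hypothetical ids, for illustration only.
        System.out.println(parentWithPlaceholderChild("parent-1", "parent-1-placeholder"));
    }
}
```

When the real children arrive, the whole block is re-added under the same
parent id, and the placeholder child is not included again.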




Re: SolrJ - How to add a blocked document without child documents

Posted by Jeffery Yuan <yu...@gmail.com>.
Yes, the id is the unique key.

I think this may be because the first one (a parent doc, Parent1, without
any children) is not a block (I don't know the right term), so when we
later add the same parent (Parent2) with some children, the first one is
somehow left alone.

- If we update the parent document again with some new child documents, it
will update Parent2 correctly, but still keep Parent1.

This issue is discussed in some JIRA issues, such as
https://issues.apache.org/jira/browse/SOLR-6096.




Re: SolrJ - How to add a blocked document without child documents

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Is the id your unique key in the collection? If the id is the unique key,
the existing document should be overwritten automatically when you add the
same parent document, with child documents, under the same id.

Regards,
Edwin


On 16 May 2017 at 08:25, Jeffery Yuan <yu...@gmail.com> wrote:

> Hi, Damien Kamerman
>
>   Thanks for your reply. The problem occurs when we add a parent document
> that doesn't contain child info yet.
>   Later we add the same parent document with child documents.
>
>   But this causes 2 parent documents with the same id in the Solr index.
>
>   I work around this issue by always deleting first, but I am wondering
> whether there is a better approach.

Re: SolrJ - How to add a blocked document without child documents

Posted by Jeffery Yuan <yu...@gmail.com>.
Hi, Damien Kamerman

  Thanks for your reply. The problem occurs when we add a parent document
that doesn't contain child info yet.
  Later we add the same parent document with child documents.

  But this causes 2 parent documents with the same id in the Solr index.

  I work around this issue by always deleting first, but I am wondering
whether there is a better approach.

Thanks
Jeffery Yuan





Re: SolrJ - How to add a blocked document without child documents

Posted by Damien Kamerman <da...@gmail.com>.
Does this fl help?

fl=*,[child childFilter="docType:child" parentFilter=docType:parent]
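
The fl value contains spaces, quotes and brackets, so it needs URL-encoding
when it is put into a select URL. A small sketch with the JDK only; the core
URL is the one from the original message, while the q value docType:parent,
the class name and the method names are assumptions for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: builds the select URL as a string; no request is made.
final class ChildTransformerUrlSketch {

    // The fl value suggested above, unencoded.
    static final String FL =
            "*,[child childFilter=\"docType:child\" parentFilter=docType:parent]";

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // coreUrl has no trailing slash; q and fl are passed unencoded.
    static String selectUrl(String coreUrl, String q, String fl) {
        return coreUrl + "/select?q=" + enc(q) + "&fl=" + enc(fl);
    }

    public static void main(String[] args) {
        // q=docType:parent is an assumed query that returns only parents.
        System.out.println(selectUrl(
                "http://localhost:8983/solr/thecollection_shard1_replica2",
                "docType:parent",
                FL));
    }
}
```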

On 14 May 2017 at 16:16, Jeffery Yuan <yu...@gmail.com> wrote:

> How can I avoid the duplicate parent documents?
> How could I add a blocked document without child documents?