Posted to users@solr.apache.org by Taisuke Miyazaki <mi...@lifull.com> on 2021/10/11 09:23:59 UTC

Are updates slow when using nested documents?

Hello there,

I have started using nested documents in Solr.

There are about 6 million parent documents and about 15 million child
documents.
In other words, each parent document has an average of 2-3 child documents.
(In reality, there is a significant bias.)

Without child documents, the index size is 22.88 GB; with child
documents, it is 24.55 GB.
So the child documents account for about 1.67 GB.

I am running on SolrCloud and the number of shards is 1.


Search queries are working well for now, but document updates are being
delayed.

Throughput for updates: 400,000/hour
Parallelism (Lambda concurrency): 20
Number of updated documents per update request: 40
In case of error, retry up to 3 times.
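
For reference, here is a rough back-of-the-envelope calculation of what
these settings imply. It is only a sketch, and it assumes that
"400,000/hour" counts documents rather than requests and that all 20
workers are kept busy. The variable names are mine and purely illustrative.

```python
# Figures taken from the settings above.
docs_per_hour = 400_000   # assumption: this counts documents, not requests
docs_per_request = 40
concurrency = 20          # Lambda concurrency
timeout_s = 15            # Solr timeout in seconds

docs_per_second = docs_per_hour / 3600
requests_per_second = docs_per_second / docs_per_request

# With `concurrency` requests in flight at once, this is the average time
# each request may take while still sustaining the target throughput.
latency_budget_s = concurrency / requests_per_second

print(f"{docs_per_second:.1f} docs/s")
print(f"{requests_per_second:.2f} requests/s")
print(f"{latency_budget_s:.1f} s average latency budget per request")
print(f"budget is below the timeout: {latency_budget_s < timeout_s}")
```

Under those assumptions, the target rate only holds if a request finishes
in about 7 seconds on average, which is roughly half of the 15-second
timeout.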

With the above settings, updates completed within the Solr timeout of 15
seconds without any problems, but once I started updating child documents
as well, timeouts began to occur frequently.

What could be the cause?
There was no noticeable difference in the Solr metrics, except that CPU
usage, which had been about 10%, increased by 2-3%.

Am I updating too many documents at once? Or is the update
throughput too high? Are there not enough shards for the number of
documents?

Thanks,
Taisuke