Posted to solr-user@lucene.apache.org by "raj.yadav" <ra...@cse.ism.ac.in> on 2020/09/29 10:59:27 UTC

How to Resolve : "The request took too long to iterate over doc values"?

In our index, we have a few fields defined with the `ExternalFileField` field
type. We decided to use docValues for such fields. Here are the field type
definitions:

OLD => (ExternalFileField)
<fieldType name="ext_file_field" indexed="false" stored="true" keyField="id"
defVal="0.0" class="solr.ExternalFileField"/>

NEW => (docValues)
<fieldType name="doc_value_field"  class="solr.FloatPointField"
indexed="false" stored="false"  docValues="true"
useDocValuesAsStored="false"/>
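
Fields of this type are declared in the schema along these lines (a minimal
sketch; the my_doc_value_field* names are illustrative):

```
<field name="my_doc_value_field1" type="doc_value_field"/>
<field name="my_doc_value_field2" type="doc_value_field"/>
```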

After this modification, we started getting the following timeout warning
messages:

```
The request took too long to iterate over doc values. Timeout: timeoutAt:
1626463774823735 (System.nanoTime(): 1626463774836490),
DocValues=org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$8@4efddff
```

Our system configuration:
Each Solr instance: 8 vCPUs, 64 GiB memory
Java heap: 30GB
Collection: 4 shards (each shard has approximately 12 million docs and an
index size of 12 GB), and each Solr instance hosts one replica of a shard.

GC_TUNE="-XX:NewRatio=3 \
-XX:SurvivorRatio=4 \
-XX:PermSize=64m \
-XX:MaxPermSize=64m \
-XX:TargetSurvivorRatio=80 \
-XX:MaxTenuringThreshold=9 \
-XX:+UseConcMarkSweepGC \
-XX:+UseParNewGC \
-XX:+CMSClassUnloadingEnabled \
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
-XX:+CMSScavengeBeforeRemark \
-XX:PretenureSizeThreshold=64m \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSInitiatingOccupancyFraction=50 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled"
 
1. What does this warning message mean?
2. How can we resolve it?




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: How to Resolve : "The request took too long to iterate over doc values"?

Posted by "raj.yadav" <ra...@cse.ism.ac.in>.
I went through the other queries for which we are getting the `The request
took too long to iterate over doc values` warning. As pointed out by Erick, I
have cross-checked all the fields used in the queries, and there is no field
that we are searching against which has indexed=false and docValues=true.

A few observations I would like to share here:

- We are performing a load test on our system, and the above timeout warning
occurs only for those queries that fetch a large number of documents.

- I stopped all load on the system and fired the same queries (for which we
were getting the timeout warning). Here is the Solr response:

Solr response:
response: {
  numFound: 6082251,
  start: 0,
  maxScore: 4709.594,
  docs: [ ]
}

The response was quite weird (the header says `6082251` docs were found, but
the `docs` array is empty), and there was no timeout warning in the logs.
Then I increased `timeAllowed` to 5000ms (our default is 1000ms). This time
the `docs` array was not empty, and in fact the numFound count increased.
This clearly indicates that the query could not complete within 1000ms (the
default timeAllowed).
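
For reference, the queries were re-fired along these lines (a sketch; host
and collection names are placeholders):

```
# with our default budget of 1000ms - numFound > 0 but docs came back empty
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&timeAllowed=1000'

# with a raised budget - docs populated and numFound increased
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&timeAllowed=5000'
```

(When `timeAllowed` is exceeded, Solr should also mark the response with
`partialResults=true` in the responseHeader, which would fit the partial
responses we saw.)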

I have the following questions:
1. Are docValues as efficient as ExternalFileField for function queries?
2. Why did I get the warning message when the system was under load, but not
when there was no load?

When we were performing the load test (at the same scale) with the
ExternalFileField type, we were not getting any warning messages in our logs.



raj.yadav wrote
> Hey Erick,
> 
> In cases for which we are getting this warning, I'm not able to extract
> the `exact solr query`. Instead the logger is logging the `parsedquery`
> for such cases. Here is one example:
> 
> ```
> 2020-09-29 13:09:41.279 WARN  (qtp926837661-82461) [c:mycollection
> s:shard1_0 r:core_node5 x:mycollection_shard1_0_replica_n3]
> o.a.s.s.SolrIndexSearcher Query: [+FunctionScoreQuery(+*:*, scored by
> boost(product(if(max(const(0),sub(float(my_doc_value_field1),const(500))),const(0.01),if(max(const(0),sub(float(my_doc_value_field2),const(290))),const(0.2),const(1))),sqrt(product(sum(const(1),float(my_doc_value_field3),float(my_doc_value_field4)),sqrt(sum(const(1),float(my_doc_value_field5))))))))
> #BitSetDocTopFilter]; The request took too long to iterate over doc
> values. Timeout: timeoutAt: 1635297585120522 (System.nanoTime():
> 1635297690311384),
> DocValues=org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$8@7df12bf1
> ```
> 
> As per my understanding, the query in the above case is `q=*:*`. Then
> there is a boost function that uses a function query on
> my_doc_value_field* (field type doc_value_field, i.e. having
> indexed=false and docValues=true) to reorder the matched docs. If
> docValues work efficiently for _function queries_, why are these
> warnings coming?
> 
> 
> Also, we do use frange queries on doc_value_field fields (having
> indexed=false and docValues=true), for example:
> {!frange l=1.0}my_doc_value_field1
> 
> 
> Erick Erickson wrote
>> Let’s see the query. My bet is that you are _searching_ against the field
>> and have indexed=false.
>> 
>> Searching against a docValues=true indexed=false field results in the
>> equivalent of a “table scan” in the RDBMS world. You may use
>> the docValues efficiently for _function queries_ to mimic some
>> search behavior.
>> 
>> Best,
>> Erick
> 
> 
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: How to Resolve : "The request took too long to iterate over doc values"?

Posted by "raj.yadav" <ra...@cse.ism.ac.in>.
raj.yadav wrote
> In cases for which we are getting this warning, I'm not able to extract
> the `exact solr query`. Instead the logger is logging the `parsedquery`
> for such cases. Here is one example:
> 
> ```
> 2020-09-29 13:09:41.279 WARN  (qtp926837661-82461) [c:mycollection
> s:shard1_0 r:core_node5 x:mycollection_shard1_0_replica_n3]
> o.a.s.s.SolrIndexSearcher Query: [+FunctionScoreQuery(+*:*, scored by
> boost(product(if(max(const(0),sub(float(my_doc_value_field1),const(500))),const(0.01),if(max(const(0),sub(float(my_doc_value_field2),const(290))),const(0.2),const(1))),sqrt(product(sum(const(1),float(my_doc_value_field3),float(my_doc_value_field4)),sqrt(sum(const(1),float(my_doc_value_field5))))))))
> #BitSetDocTopFilter]; The request took too long to iterate over doc
> values. Timeout: timeoutAt: 1635297585120522 (System.nanoTime():
> 1635297690311384),
> DocValues=org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$8@7df12bf1
> ```


Hi Community members,

In my previous mail, I mentioned that Solr is not logging the actual
`solr_query` and is instead only logging the parsedquery. Actually, Solr
does log the solr_query just after logging the above warning message.

Coming back to the query for which we are getting the above warning: the
query retrieves all docs (i.e. q=*:*) and orders them using a multiplicative
boost function (i.e. a boost function query). So this clearly rules out the
possibility mentioned by Erick (i.e. that the query might be searching
against a field which has indexed=false and docValues=true).

Is this expected when using docValues in a function query? It is only
happening when the query retrieves a large number of documents (in the
millions). Has anyone else faced this issue before? We are experiencing this
issue even when there is no load on the system.

Regards,
Raj






--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: How to Resolve : "The request took too long to iterate over doc values"?

Posted by "raj.yadav" <ra...@cse.ism.ac.in>.
Hey Erick,

In cases for which we are getting this warning, I'm not able to extract the
`exact solr query`. Instead the logger is logging the `parsedquery` for such
cases. Here is one example:

```
2020-09-29 13:09:41.279 WARN  (qtp926837661-82461) [c:mycollection
s:shard1_0 r:core_node5 x:mycollection_shard1_0_replica_n3]
o.a.s.s.SolrIndexSearcher Query: [+FunctionScoreQuery(+*:*, scored by
boost(product(if(max(const(0),sub(float(my_doc_value_field1),const(50))),const(0.01),if(max(const(0),sub(float(my_doc_value_field2),const(29))),const(0.2),const(1))),sqrt(product(sum(const(1),float(my_doc_value_field3),float(my_doc_value_field4)),sqrt(sum(const(1),float(my_doc_value_field5))))))))
#BitSetDocTopFilter]; The request took too long to iterate over doc values.
Timeout: timeoutAt: 1635297585120522 (System.nanoTime(): 1635297690311384),
DocValues=org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$8@7df12bf1
```

As per my understanding, the query in the above case is `q=*:*`. Then there
is a boost function that uses a function query on my_doc_value_field* (field
type doc_value_field, i.e. having indexed=false and docValues=true) to
reorder the matched docs. If docValues work efficiently for _function
queries_, why are these warnings coming?
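
Working backwards from the parsed query, the original request was presumably
along these lines (a reconstruction sketch assuming the edismax `boost`
parameter; whitespace added for readability):

```
q=*:*
&defType=edismax
&boost=product(
    if(max(0,sub(my_doc_value_field1,50)),0.01,
       if(max(0,sub(my_doc_value_field2,29)),0.2,1)),
    sqrt(product(sum(1,my_doc_value_field3,my_doc_value_field4),
         sqrt(sum(1,my_doc_value_field5)))))
```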


Also, we do use frange queries on doc_value_field fields (having
indexed=false and docValues=true), for example:

{!frange l=1.0}my_doc_value_field1
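
One variant we have not yet tried (a sketch, based on frange's documented
post-filter support when cache=false and cost >= 100) would be to run it as a
post-filter, so the docValues are only iterated for documents that already
matched the other clauses:

```
fq={!frange l=1.0 cache=false cost=200}my_doc_value_field1
```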


Erick Erickson wrote
> Let’s see the query. My bet is that you are _searching_ against the field
> and have indexed=false.
> 
> Searching against a docValues=true indexed=false field results in the
> equivalent of a “table scan” in the RDBMS world. You may use
> the docValues efficiently for _function queries_ to mimic some
> search behavior.
> 
> Best,
> Erick





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: How to Resolve : "The request took too long to iterate over doc values"?

Posted by Erick Erickson <er...@gmail.com>.
Let’s see the query. My bet is that you are _searching_ against the field and have indexed=false.

Searching against a docValues=true indexed=false field results in the
equivalent of a “table scan” in the RDBMS world. You may use
the docValues efficiently for _function queries_ to mimic some
search behavior.
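
To make the distinction concrete, a quick sketch (illustrative field name):

```
# Searching against a docValues-only field: a full scan over doc values
q=my_doc_value_field1:[1.0 TO *]

# Function-query access to the same doc values
fq={!frange l=1.0}my_doc_value_field1
```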

Best,
Erick

> On Sep 29, 2020, at 6:59 AM, raj.yadav <ra...@cse.ism.ac.in> wrote:
> 
> In our index, we have a few fields defined with the `ExternalFileField`
> field type. We decided to use docValues for such fields. Here are the
> field type definitions:
> 
> OLD => (ExternalFileField)
> <fieldType name="ext_file_field" indexed="false" stored="true" keyField="id"
> defVal="0.0" class="solr.ExternalFileField"/>
> 
> NEW => (docValues)
> <fieldType name="doc_value_field"  class="solr.FloatPointField"
> indexed="false" stored="false"  docValues="true"
> useDocValuesAsStored="false"/>
> 
> After this modification, we started getting the following timeout warning
> messages:
> 
> ```
> The request took too long to iterate over doc values. Timeout: timeoutAt:
> 1626463774823735 (System.nanoTime(): 1626463774836490),
> DocValues=org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$8@4efddff
> ```
> 
> Our system configuration:
> Each Solr instance: 8 vCPUs, 64 GiB memory
> Java heap: 30GB
> Collection: 4 shards (each shard has approximately 12 million docs and an
> index size of 12 GB), and each Solr instance hosts one replica of a shard.
> 
> GC_TUNE="-XX:NewRatio=3 \
> -XX:SurvivorRatio=4 \
> -XX:PermSize=64m \
> -XX:MaxPermSize=64m \
> -XX:TargetSurvivorRatio=80 \
> -XX:MaxTenuringThreshold=9 \
> -XX:+UseConcMarkSweepGC \
> -XX:+UseParNewGC \
> -XX:+CMSClassUnloadingEnabled \
> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
> -XX:+CMSScavengeBeforeRemark \
> -XX:PretenureSizeThreshold=64m \
> -XX:+UseCMSInitiatingOccupancyOnly \
> -XX:CMSInitiatingOccupancyFraction=50 \
> -XX:CMSMaxAbortablePrecleanTime=6000 \
> -XX:+CMSParallelRemarkEnabled \
> -XX:+ParallelRefProcEnabled"
> 
> 1. What does this warning message mean?
> 2. How can we resolve it?
> 
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html