You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@solr.apache.org by Susmit <sh...@gmail.com> on 2021/03/06 15:35:06 UTC

Re: Idle timeout expired and Early Client Disconnect errors

i have used a workaround to increase the default (hard coded) timeout of 2 min in solrclientcache. 
i can run 9+ hour long streaming queries with no issues.

Sent from my iPhone

> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
> 
> I divided the query to 1000 pieces and removed the parallel stream clause, it seems to be working without timeout so far, if it does I just can divide it to even smaller pieces I guess.
> 
> I tried to send all 1000 pieces in a “list” expression to be executed linearly, it didn’t work but I was just curious if it could handle such a large query 😃
> 
> Now I’m just generating expression strings from java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections timely, hitting the OS per-host connection limit. I asked a separate question about this. I was following the example on lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
> 
> I just modified my code to use regular REST calls using okhttp3, it’s a shame that I couldn’t use SolrJ since it truly streams every result 1 by 1 continuously. REST just returns a single large response at the very end of the stream.
> 
> Thanks again for your help.
> 
> Sent from Mail for Windows 10
> 
> From: Joel Bernstein
> Sent: 02 March 2021 00:19
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> 
> Also the parallel function builds hash partitioning filters that could lead
> to timeouts if they take too long to build. Try the query without the
> parallel function if you're still getting timeouts when making the query
> smaller.
> 
> 
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> 
>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com> wrote:
>> 
>> The settings in your version are 30 seconds and 15 seconds for socket and
>> connection timeouts.
>> 
>> Typically timeouts occur because one or more shards in the query are idle
>> beyond the timeout threshold. This happens because lot's of data is being
>> read from other shards.
>> 
>> Breaking the query into small parts would be a good strategy.
>> 
>> 
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uy...@vivaldi.net.invalid>
>> wrote:
>> 
>>> Hello Mr. Bernstein,
>>> 
>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>>> timeouts and they are bound to happen in such a large stream. Should I just
>>> reduce the output of my search expressions?
>>> 
>>> Maybe I can split my search results into ~100 parts and run the same
>>> query 100 times in series. Each part would emit ~3M documents so they
>>> should finish before timeout?
>>> 
>>> Is this a reasonable solution?
>>> 
>>> Btw how long is the default hard-coded timeout value? Because yesterday I
>>> ran another query which took more than 1 hour without any timeouts and
>>> finished successfully.
>>> 
>>> Sent from Mail for Windows 10
>>> 
>>> From: Joel Bernstein
>>> Sent: 01 March 2021 23:03
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>> 
>>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>> 
>>> https://issues.apache.org/jira/browse/SOLR-14672
>>> 
>>> This unfortunately missed the 8.8 release and will be 8.9.
>>> 
>>> 
>>> 
>>> This i
>>> 
>>> 
>>> 
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>> 
>>> 
>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com> wrote:
>>> 
>>>> What version are you using?
>>>> 
>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
>>> used
>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
>>> back
>>>> ported to Solr 8.x.
>>>> 
>>>> The errors are actually normal operational occurrences when doing joins
>>> so
>>>> should be suppressed in the logs and were before the specific release.
>>>> 
>>>> It might make sense to do a release that specifically suppresses these
>>>> errors without backporting the full Solr 9.0 changes which impact the
>>>> memory footprint of export.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>> 
>>>> 
>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
>>>> 
>>>> wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> I’m running a large streaming expression and feeding the result to
>>> update
>>>>> expression.
>>>>> 
>>>>> update(targetCollection, ...long running stream here...,
>>>>> 
>>>>> I tried sending the exact same query multiple times, it sometimes works
>>>>> and indexes some results, then gives exception, other times fails with
>>> an
>>>>> exception after 2 minutes.
>>>>> 
>>>>> Response is like:
>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>>>>> stream expression
>>>>> 
>>>>> Server log (short):
>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> 
>>>>> I tried to increase the jetty idle timeout value on the node which
>>> hosts
>>>>> my target collection to something like an hour. It didn’t affect.
>>>>> 
>>>>> 
>>>>> Server logs (long)
>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>>>>>                                20000/120000 ms
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>> solr-01    |    at java.base/java.io
>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>> solr-01    |    at
>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>> --
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .FillInterest.fillable(FillInterest.java:103)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>>>>> timeout expired: 120000/120000 ms
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>> solr-01    |    at
>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>> solr-01    |    ... 1 more
>>>>> 
>>>>> 
>>>>> My expression, in case it helps. To summarize, it finds the document
>>> ids
>>>>> which exists on sourceCollection but not on target collection (DNM).
>>> Joins
>>>>> on itself to duplicate some fields (I couldn’t find another way to
>>>>> duplicate the value of field into 2 fields). Then sends the result to
>>>>> update. Source collection has about 300M documents, 24GB heap, 2
>>> shards, 2
>>>>> replicas of each shard.
>>>>> 
>>>>> update(
>>>>>    DNM,
>>>>>    batchSize=1000,
>>>>>    parallel(
>>>>>        WorkerCollection,
>>>>>        leftOuterJoin(
>>>>>            fetch(
>>>>>                sourceCollection,
>>>>>                complement(
>>>>>                    search(
>>>>>                        sourceCollection,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        fq="...some filters...",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    search(
>>>>>                        DNM,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    on="id_str"
>>>>>                ),
>>>>>                fl="...my many fields...",
>>>>>                on="id_str",
>>>>>                batchSize="1000"
>>>>>            ),
>>>>>            select(
>>>>>                fetch(
>>>>>                    sourceCollection,
>>>>>                    complement(
>>>>>                        search(
>>>>>                            sourceCollection,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            fq="...some other filters...",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        search(
>>>>>                            DNM,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        on="id_str"
>>>>>                    ),
>>>>>                    fl="...some other fields...",
>>>>>                    on="id_str",
>>>>>                    batchSize="1000"
>>>>>                ),
>>>>>                id_str, ..some other fields as...
>>>>>            ),
>>>>>            on="id_str"
>>>>>        ),
>>>>>        workers="4", sort="id_str asc"
>>>>>    )
>>>>> )
>>>>> 
>>>>> Sent from Mail for Windows 10
>>>>> 
>>>>> 
>>> 
>>> 
> 

RE: Idle timeout expired and Early Client Disconnect errors

Posted by ufuk yılmaz <uy...@vivaldi.net.INVALID>.
If I understand correctly, this ticket is about registering a new, custom expression. SolrClientCache and CloudSolrStream are more like backbone classes working behind every streaming expression. Is it really possible to modify them this way?

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: 08 March 2021 22:02
To: users@solr.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

This ticket shows how it is done in the solrconfig.xml:

https://issues.apache.org/jira/browse/SOLR-9103



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 8, 2021 at 9:18 AM ufuk yılmaz <uy...@vivaldi.net.invalid>
wrote:

> How do you “register” something like a CloudSolrStream btw? Using Blob
> Store API?
>
> Sent from Mail for Windows 10
>
> From: Susmit
> Sent: 06 March 2021 23:03
> To: users@solr.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>
> better to use solr 8.9 and configure http timeouts from solr.in.sh
> workaround is bigger - need to extend cloudsolrstream , register it and
> install custom solrclientcache with overridden setcontext method
>
> Sent from my iPhone
>
> > On Mar 6, 2021, at 9:25 AM, ufuk yılmaz <uy...@vivaldi.net.invalid>
> wrote:
> >
> > How? O_O
> >
> > Sent from Mail for Windows 10
> >
> > From: Susmit
> > Sent: 06 March 2021 18:35
> > To: solr-user@lucene.apache.org
> > Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >
> > i have used a workaround to increase the default (hard coded) timeout of
> 2 min in solrclientcache.
> > i can run 9+ hour long streaming queries with no issues.
> >
> > Sent from my iPhone
> >
> >> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid>
> wrote:
> >>
> >> I divided the query to 1000 pieces and removed the parallel stream
> clause, it seems to be working without timeout so far, if it does I just
> can divide it to even smaller pieces I guess.
> >>
> >> I tried to send all 1000 pieces in a “list” expression to be executed
> linearly, it didn’t work but I was just curious if it could handle such a
> large query 😃
> >>
> >> Now I’m just generating expression strings from java code and sending
> them one by one. I tried to use SolrJ for this, but encountered a weird
> problem where even the simplest expression (echo) stops working after a few
> iterations in a loop. I’m guessing the underlying HttpClient is not closing
> connections timely, hitting the OS per-host connection limit. I asked a
> separate question about this. I was following the example on lucidworks:
> https://lucidworks.com/post/streaming-expressions-in-solrj/
> >>
> >> I just modified my code to use regular REST calls using okhttp3, it’s a
> shame that I couldn’t use SolrJ since it truly streams every result 1 by 1
> continuously. REST just returns a single large response at the very end of
> the stream.
> >>
> >> Thanks again for your help.
> >>
> >> Sent from Mail for Windows 10
> >>
> >> From: Joel Bernstein
> >> Sent: 02 March 2021 00:19
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >>
> >> Also the parallel function builds hash partitioning filters that could
> lead
> >> to timeouts if they take too long to build. Try the query without the
> >> parallel function if you're still getting timeouts when making the query
> >> smaller.
> >>
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
> >>>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com>
> wrote:
> >>>
> >>> The settings in your version are 30 seconds and 15 seconds for socket
> and
> >>> connection timeouts.
> >>>
> >>> Typically timeouts occur because one or more shards in the query are
> idle
> >>> beyond the timeout threshold. This happens because lot's of data is
> being
> >>> read from other shards.
> >>>
> >>> Breaking the query into small parts would be a good strategy.
> >>>
> >>>
> >>>
> >>>
> >>> Joel Bernstein
> >>> http://joelsolr.blogspot.com/
> >>>
> >>>
> >>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
> >
> >>> wrote:
> >>>
> >>>> Hello Mr. Bernstein,
> >>>>
> >>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
> >>>> timeouts and they are bound to happen in such a large stream. Should
> I just
> >>>> reduce the output of my search expressions?
> >>>>
> >>>> Maybe I can split my search results into ~100 parts and run the same
> >>>> query 100 times in series. Each part would emit ~3M documents so they
> >>>> should finish before timeout?
> >>>>
> >>>> Is this a reasonable solution?
> >>>>
> >>>> Btw how long is the default hard-coded timeout value? Because
> yesterday I
> >>>> ran another query which took more than 1 hour without any timeouts and
> >>>> finished successfully.
> >>>>
> >>>> Sent from Mail for Windows 10
> >>>>
> >>>> From: Joel Bernstein
> >>>> Sent: 01 March 2021 23:03
> >>>> To: solr-user@lucene.apache.org
> >>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >>>>
> >>>> Oh wait, I misread your email. The idle timeout issue is configurable
> in:
> >>>>
> >>>> https://issues.apache.org/jira/browse/SOLR-14672
> >>>>
> >>>> This unfortunately missed the 8.8 release and will be 8.9.
> >>>>
> >>>>
> >>>>
> >>>> This i
> >>>>
> >>>>
> >>>>
> >>>> Joel Bernstein
> >>>> http://joelsolr.blogspot.com/
> >>>>
> >>>>
> >>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com>
> wrote:
> >>>>
> >>>>> What version are you using?
> >>>>>
> >>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
> >>>> used
> >>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
> >>>> back
> >>>>> ported to Solr 8.x.
> >>>>>
> >>>>> The errors are actually normal operational occurrences when doing
> joins
> >>>> so
> >>>>> should be suppressed in the logs and were before the specific
> release.
> >>>>>
> >>>>> It might make sense to do a release that specifically suppresses
> these
> >>>>> errors without backporting the full Solr 9.0 changes which impact the
> >>>>> memory footprint of export.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Joel Bernstein
> >>>>> http://joelsolr.blogspot.com/
> >>>>>
> >>>>>
> >>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz
> <uyilmaz@vivaldi.net.invalid
> >>>>>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello all,
> >>>>>>
> >>>>>> I’m running a large streaming expression and feeding the result to
> >>>> update
> >>>>>> expression.
> >>>>>>
> >>>>>> update(targetCollection, ...long running stream here...,
> >>>>>>
> >>>>>> I tried sending the exact same query multiple times, it sometimes
> works
> >>>>>> and indexes some results, then gives exception, other times fails
> with
> >>>> an
> >>>>>> exception after 2 minutes.
> >>>>>>
> >>>>>> Response is like:
> >>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
> >>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my
> long
> >>>>>> stream expression
> >>>>>>
> >>>>>> Server log (short):
> >>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
> >>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
> >>>> 120000/120000
> >>>>>> ms
> >>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
> >>>> 120000/120000
> >>>>>> ms
> >>>>>>
> >>>>>> I tried to increase the jetty idle timeout value on the node which
> >>>> hosts
> >>>>>> my target collection to something like an hour. It didn’t affect.
> >>>>>>
> >>>>>>
> >>>>>> Server logs (long)
> >>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
> >>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall
> null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
> >>>>>>                               20000/120000 ms
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
> >>>>>> solr-01    |    at
> >>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
> >>>>>> solr-01    |    at
> >>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
> >>>>>> solr-01    |    at
> >>>>>>
> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> >>>>>> solr-01    |    at java.base/java.io
> >>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
> >>>>>> --
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .FillInterest.fillable(FillInterest.java:103)
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
> >>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
> >>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
> >>>>>> timeout expired: 120000/120000 ms
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> >>>>>> solr-01    |    at
> >>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> >>>>>> solr-01    |    ... 1 more
> >>>>>>
> >>>>>>
> >>>>>> My expression, in case it helps. To summarize, it finds the document
> >>>> ids
> >>>>>> which exists on sourceCollection but not on target collection (DNM).
> >>>> Joins
> >>>>>> on itself to duplicate some fields (I couldn’t find another way to
> >>>>>> duplicate the value of field into 2 fields). Then sends the result
> to
> >>>>>> update. Source collection has about 300M documents, 24GB heap, 2
> >>>> shards, 2
> >>>>>> replicas of each shard.
> >>>>>>
> >>>>>> update(
> >>>>>>   DNM,
> >>>>>>   batchSize=1000,
> >>>>>>   parallel(
> >>>>>>       WorkerCollection,
> >>>>>>       leftOuterJoin(
> >>>>>>           fetch(
> >>>>>>               sourceCollection,
> >>>>>>               complement(
> >>>>>>                   search(
> >>>>>>                       sourceCollection,
> >>>>>>                       q="*:*",
> >>>>>>                       qt="/export",
> >>>>>>                       fq="...some filters...",
> >>>>>>                       sort="id_str asc",
> >>>>>>                       fl="id_str",
> >>>>>>                       partitionKeys="id_str"
> >>>>>>                   ),
> >>>>>>                   search(
> >>>>>>                       DNM,
> >>>>>>                       q="*:*",
> >>>>>>                       qt="/export",
> >>>>>>                       sort="id_str asc",
> >>>>>>                       fl="id_str",
> >>>>>>                       partitionKeys="id_str"
> >>>>>>                   ),
> >>>>>>                   on="id_str"
> >>>>>>               ),
> >>>>>>               fl="...my many fields...",
> >>>>>>               on="id_str",
> >>>>>>               batchSize="1000"
> >>>>>>           ),
> >>>>>>           select(
> >>>>>>               fetch(
> >>>>>>                   sourceCollection,
> >>>>>>                   complement(
> >>>>>>                       search(
> >>>>>>                           sourceCollection,
> >>>>>>                           q="*:*",
> >>>>>>                           qt="/export",
> >>>>>>                           fq="...some other filters...",
> >>>>>>                           sort="id_str asc",
> >>>>>>                           fl="id_str",
> >>>>>>                           partitionKeys="id_str"
> >>>>>>                       ),
> >>>>>>                       search(
> >>>>>>                           DNM,
> >>>>>>                           q="*:*",
> >>>>>>                           qt="/export",
> >>>>>>                           sort="id_str asc",
> >>>>>>                           fl="id_str",
> >>>>>>                           partitionKeys="id_str"
> >>>>>>                       ),
> >>>>>>                       on="id_str"
> >>>>>>                   ),
> >>>>>>                   fl="...some other fields...",
> >>>>>>                   on="id_str",
> >>>>>>                   batchSize="1000"
> >>>>>>               ),
> >>>>>>               id_str, ..some other fields as...
> >>>>>>           ),
> >>>>>>           on="id_str"
> >>>>>>       ),
> >>>>>>       workers="4", sort="id_str asc"
> >>>>>>   )
> >>>>>> )
> >>>>>>
> >>>>>> Sent from Mail for Windows 10
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >
>
>


Re: Idle timeout expired and Early Client Disconnect errors

Posted by Joel Bernstein <jo...@gmail.com>.
This ticket shows how it is done in the solrconfig.xml:

https://issues.apache.org/jira/browse/SOLR-9103



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 8, 2021 at 9:18 AM ufuk yılmaz <uy...@vivaldi.net.invalid>
wrote:

> How do you “register” something like a CloudSolrStream btw? Using Blob
> Store API?
>
> Sent from Mail for Windows 10
>
> From: Susmit
> Sent: 06 March 2021 23:03
> To: users@solr.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>
> better to use solr 8.9 and configure http timeouts from solr.in.sh
> workaround is bigger - need to extend cloudsolrstream , register it and
> install custom solrclientcache with overridden setcontext method
>
> Sent from my iPhone
>
> > On Mar 6, 2021, at 9:25 AM, ufuk yılmaz <uy...@vivaldi.net.invalid>
> wrote:
> >
> > How? O_O
> >
> > Sent from Mail for Windows 10
> >
> > From: Susmit
> > Sent: 06 March 2021 18:35
> > To: solr-user@lucene.apache.org
> > Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >
> > i have used a workaround to increase the default (hard coded) timeout of
> 2 min in solrclientcache.
> > i can run 9+ hour long streaming queries with no issues.
> >
> > Sent from my iPhone
> >
> >> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid>
> wrote:
> >>
> >> I divided the query to 1000 pieces and removed the parallel stream
> clause, it seems to be working without timeout so far, if it does I just
> can divide it to even smaller pieces I guess.
> >>
> >> I tried to send all 1000 pieces in a “list” expression to be executed
> linearly, it didn’t work but I was just curious if it could handle such a
> large query 😃
> >>
> >> Now I’m just generating expression strings from java code and sending
> them one by one. I tried to use SolrJ for this, but encountered a weird
> problem where even the simplest expression (echo) stops working after a few
> iterations in a loop. I’m guessing the underlying HttpClient is not closing
> connections timely, hitting the OS per-host connection limit. I asked a
> separate question about this. I was following the example on lucidworks:
> https://lucidworks.com/post/streaming-expressions-in-solrj/
> >>
> >> I just modified my code to use regular REST calls using okhttp3, it’s a
> shame that I couldn’t use SolrJ since it truly streams every result 1 by 1
> continuously. REST just returns a single large response at the very end of
> the stream.
> >>
> >> Thanks again for your help.
> >>
> >> Sent from Mail for Windows 10
> >>
> >> From: Joel Bernstein
> >> Sent: 02 March 2021 00:19
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >>
> >> Also the parallel function builds hash partitioning filters that could
> lead
> >> to timeouts if they take too long to build. Try the query without the
> >> parallel function if you're still getting timeouts when making the query
> >> smaller.
> >>
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
> >>>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com>
> wrote:
> >>>
> >>> The settings in your version are 30 seconds and 15 seconds for socket
> and
> >>> connection timeouts.
> >>>
> >>> Typically timeouts occur because one or more shards in the query are
> idle
> >>> beyond the timeout threshold. This happens because lot's of data is
> being
> >>> read from other shards.
> >>>
> >>> Breaking the query into small parts would be a good strategy.
> >>>
> >>>
> >>>
> >>>
> >>> Joel Bernstein
> >>> http://joelsolr.blogspot.com/
> >>>
> >>>
> >>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
> >
> >>> wrote:
> >>>
> >>>> Hello Mr. Bernstein,
> >>>>
> >>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
> >>>> timeouts and they are bound to happen in such a large stream. Should
> I just
> >>>> reduce the output of my search expressions?
> >>>>
> >>>> Maybe I can split my search results into ~100 parts and run the same
> >>>> query 100 times in series. Each part would emit ~3M documents so they
> >>>> should finish before timeout?
> >>>>
> >>>> Is this a reasonable solution?
> >>>>
> >>>> Btw how long is the default hard-coded timeout value? Because
> yesterday I
> >>>> ran another query which took more than 1 hour without any timeouts and
> >>>> finished successfully.
> >>>>
> >>>> Sent from Mail for Windows 10
> >>>>
> >>>> From: Joel Bernstein
> >>>> Sent: 01 March 2021 23:03
> >>>> To: solr-user@lucene.apache.org
> >>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> >>>>
> >>>> Oh wait, I misread your email. The idle timeout issue is configurable
> in:
> >>>>
> >>>> https://issues.apache.org/jira/browse/SOLR-14672
> >>>>
> >>>> This unfortunately missed the 8.8 release and will be 8.9.
> >>>>
> >>>>
> >>>>
> >>>> This i
> >>>>
> >>>>
> >>>>
> >>>> Joel Bernstein
> >>>> http://joelsolr.blogspot.com/
> >>>>
> >>>>
> >>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com>
> wrote:
> >>>>
> >>>>> What version are you using?
> >>>>>
> >>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
> >>>> used
> >>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
> >>>> back
> >>>>> ported to Solr 8.x.
> >>>>>
> >>>>> The errors are actually normal operational occurrences when doing
> joins
> >>>> so
> >>>>> should be suppressed in the logs and were before the specific
> release.
> >>>>>
> >>>>> It might make sense to do a release that specifically suppresses
> these
> >>>>> errors without backporting the full Solr 9.0 changes which impact the
> >>>>> memory footprint of export.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Joel Bernstein
> >>>>> http://joelsolr.blogspot.com/
> >>>>>
> >>>>>
> >>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz
> <uyilmaz@vivaldi.net.invalid
> >>>>>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello all,
> >>>>>>
> >>>>>> I’m running a large streaming expression and feeding the result to
> >>>> update
> >>>>>> expression.
> >>>>>>
> >>>>>> update(targetCollection, ...long running stream here...,
> >>>>>>
> >>>>>> I tried sending the exact same query multiple times, it sometimes
> works
> >>>>>> and indexes some results, then gives exception, other times fails
> with
> >>>> an
> >>>>>> exception after 2 minutes.
> >>>>>>
> >>>>>> Response is like:
> >>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
> >>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my
> long
> >>>>>> stream expression
> >>>>>>
> >>>>>> Server log (short):
> >>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
> >>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
> >>>> 120000/120000
> >>>>>> ms
> >>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
> >>>> 120000/120000
> >>>>>> ms
> >>>>>>
> >>>>>> I tried to increase the jetty idle timeout value on the node which
> >>>> hosts
> >>>>>> my target collection to something like an hour. It didn’t affect.
> >>>>>>
> >>>>>>
> >>>>>> Server logs (long)
> >>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
> >>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall
> null:java.io.IOException:
> >>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
> >>>>>>                               20000/120000 ms
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
> >>>>>> solr-01    |    at
> >>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
> >>>>>> solr-01    |    at
> >>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
> >>>>>> solr-01    |    at
> >>>>>>
> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
> >>>>>> solr-01    |    at
> >>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> >>>>>> solr-01    |    at java.base/java.io
> >>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
> >>>>>> solr-01    |    at
> >>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
> >>>>>> --
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .FillInterest.fillable(FillInterest.java:103)
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
> >>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
> >>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
> >>>>>> timeout expired: 120000/120000 ms
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
> >>>>>> solr-01    |    at org.eclipse.jetty.io
> >>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> >>>>>> solr-01    |    at
> >>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> >>>>>> solr-01    |    at
> >>>>>>
> >>>>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> >>>>>> solr-01    |    ... 1 more
> >>>>>>
> >>>>>>
> >>>>>> My expression, in case it helps. To summarize, it finds the document
> >>>> ids
> >>>>>> which exists on sourceCollection but not on target collection (DNM).
> >>>> Joins
> >>>>>> on itself to duplicate some fields (I couldn’t find another way to
> >>>>>> duplicate the value of field into 2 fields). Then sends the result
> to
> >>>>>> update. Source collection has about 300M documents, 24GB heap, 2
> >>>> shards, 2
> >>>>>> replicas of each shard.
> >>>>>>
> >>>>>> update(
> >>>>>>   DNM,
> >>>>>>   batchSize=1000,
> >>>>>>   parallel(
> >>>>>>       WorkerCollection,
> >>>>>>       leftOuterJoin(
> >>>>>>           fetch(
> >>>>>>               sourceCollection,
> >>>>>>               complement(
> >>>>>>                   search(
> >>>>>>                       sourceCollection,
> >>>>>>                       q="*:*",
> >>>>>>                       qt="/export",
> >>>>>>                       fq="...some filters...",
> >>>>>>                       sort="id_str asc",
> >>>>>>                       fl="id_str",
> >>>>>>                       partitionKeys="id_str"
> >>>>>>                   ),
> >>>>>>                   search(
> >>>>>>                       DNM,
> >>>>>>                       q="*:*",
> >>>>>>                       qt="/export",
> >>>>>>                       sort="id_str asc",
> >>>>>>                       fl="id_str",
> >>>>>>                       partitionKeys="id_str"
> >>>>>>                   ),
> >>>>>>                   on="id_str"
> >>>>>>               ),
> >>>>>>               fl="...my many fields...",
> >>>>>>               on="id_str",
> >>>>>>               batchSize="1000"
> >>>>>>           ),
> >>>>>>           select(
> >>>>>>               fetch(
> >>>>>>                   sourceCollection,
> >>>>>>                   complement(
> >>>>>>                       search(
> >>>>>>                           sourceCollection,
> >>>>>>                           q="*:*",
> >>>>>>                           qt="/export",
> >>>>>>                           fq="...some other filters...",
> >>>>>>                           sort="id_str asc",
> >>>>>>                           fl="id_str",
> >>>>>>                           partitionKeys="id_str"
> >>>>>>                       ),
> >>>>>>                       search(
> >>>>>>                           DNM,
> >>>>>>                           q="*:*",
> >>>>>>                           qt="/export",
> >>>>>>                           sort="id_str asc",
> >>>>>>                           fl="id_str",
> >>>>>>                           partitionKeys="id_str"
> >>>>>>                       ),
> >>>>>>                       on="id_str"
> >>>>>>                   ),
> >>>>>>                   fl="...some other fields...",
> >>>>>>                   on="id_str",
> >>>>>>                   batchSize="1000"
> >>>>>>               ),
> >>>>>>               id_str, ..some other fields as...
> >>>>>>           ),
> >>>>>>           on="id_str"
> >>>>>>       ),
> >>>>>>       workers="4", sort="id_str asc"
> >>>>>>   )
> >>>>>> )
> >>>>>>
> >>>>>> Sent from Mail for Windows 10
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >
>
>

RE: Idle timeout expired and Early Client Disconnect errors

Posted by ufuk yılmaz <uy...@vivaldi.net.INVALID>.
How do you “register” something like a CloudSolrStream btw? Using Blob Store API?

Sent from Mail for Windows 10

From: Susmit
Sent: 06 March 2021 23:03
To: users@solr.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

better to use solr 8.9 and configure http timeouts from solr.in.sh
workaround is bigger - need to extend cloudsolrstream , register it and install custom solrclientcache with overridden setcontext method 

Sent from my iPhone

> On Mar 6, 2021, at 9:25 AM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
> 
> How? O_O
> 
> Sent from Mail for Windows 10
> 
> From: Susmit
> Sent: 06 March 2021 18:35
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> 
> i have used a workaround to increase the default (hard coded) timeout of 2 min in solrclientcache. 
> i can run 9+ hour long streaming queries with no issues.
> 
> Sent from my iPhone
> 
>> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
>> 
>> I divided the query to 1000 pieces and removed the parallel stream clause, it seems to be working without timeout so far, if it does I just can divide it to even smaller pieces I guess.
>> 
>> I tried to send all 1000 pieces in a “list” expression to be executed linearly, it didn’t work but I was just curious if it could handle such a large query 😃
>> 
>> Now I’m just generating expression strings from java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections timely, hitting the OS per-host connection limit. I asked a separate question about this. I was following the example on lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
>> 
>> I just modified my code to use regular REST calls using okhttp3, it’s a shame that I couldn’t use SolrJ since it truly streams every result 1 by 1 continuously. REST just returns a single large response at the very end of the stream.
>> 
>> Thanks again for your help.
>> 
>> Sent from Mail for Windows 10
>> 
>> From: Joel Bernstein
>> Sent: 02 March 2021 00:19
>> To: solr-user@lucene.apache.org
>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>> 
>> Also the parallel function builds hash partitioning filters that could lead
>> to timeouts if they take too long to build. Try the query without the
>> parallel function if you're still getting timeouts when making the query
>> smaller.
>> 
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>>>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com> wrote:
>>> 
>>> The settings in your version are 30 seconds and 15 seconds for socket and
>>> connection timeouts.
>>> 
>>> Typically timeouts occur because one or more shards in the query are idle
>>> beyond the timeout threshold. This happens because lot's of data is being
>>> read from other shards.
>>> 
>>> Breaking the query into small parts would be a good strategy.
>>> 
>>> 
>>> 
>>> 
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>> 
>>> 
>>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uy...@vivaldi.net.invalid>
>>> wrote:
>>> 
>>>> Hello Mr. Bernstein,
>>>> 
>>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>>>> timeouts and they are bound to happen in such a large stream. Should I just
>>>> reduce the output of my search expressions?
>>>> 
>>>> Maybe I can split my search results into ~100 parts and run the same
>>>> query 100 times in series. Each part would emit ~3M documents so they
>>>> should finish before timeout?
>>>> 
>>>> Is this a reasonable solution?
>>>> 
>>>> Btw how long is the default hard-coded timeout value? Because yesterday I
>>>> ran another query which took more than 1 hour without any timeouts and
>>>> finished successfully.
>>>> 
>>>> Sent from Mail for Windows 10
>>>> 
>>>> From: Joel Bernstein
>>>> Sent: 01 March 2021 23:03
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>>> 
>>>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>>> 
>>>> https://issues.apache.org/jira/browse/SOLR-14672
>>>> 
>>>> This unfortunately missed the 8.8 release and will be 8.9.
>>>> 
>>>> 
>>>> 
>>>> This i
>>>> 
>>>> 
>>>> 
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>> 
>>>> 
>>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com> wrote:
>>>> 
>>>>> What version are you using?
>>>>> 
>>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
>>>> used
>>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
>>>> back
>>>>> ported to Solr 8.x.
>>>>> 
>>>>> The errors are actually normal operational occurrences when doing joins
>>>> so
>>>>> should be suppressed in the logs and were before the specific release.
>>>>> 
>>>>> It might make sense to do a release that specifically suppresses these
>>>>> errors without backporting the full Solr 9.0 changes which impact the
>>>>> memory footprint of export.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>> 
>>>>> 
>>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
>>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Hello all,
>>>>>> 
>>>>>> I’m running a large streaming expression and feeding the result to
>>>> update
>>>>>> expression.
>>>>>> 
>>>>>> update(targetCollection, ...long running stream here...,
>>>>>> 
>>>>>> I tried sending the exact same query multiple times, it sometimes works
>>>>>> and indexes some results, then gives exception, other times fails with
>>>> an
>>>>>> exception after 2 minutes.
>>>>>> 
>>>>>> Response is like:
>>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>>>>>> stream expression
>>>>>> 
>>>>>> Server log (short):
>>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>>> 120000/120000
>>>>>> ms
>>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>>> 120000/120000
>>>>>> ms
>>>>>> 
>>>>>> I tried to increase the jetty idle timeout value on the node which
>>>> hosts
>>>>>> my target collection to something like an hour. It didn’t affect.
>>>>>> 
>>>>>> 
>>>>>> Server logs (long)
>>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>>>>>>                               20000/120000 ms
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>>> solr-01    |    at
>>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>>> solr-01    |    at
>>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>>> solr-01    |    at java.base/java.io
>>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>>> --
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .FillInterest.fillable(FillInterest.java:103)
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>>>>>> timeout expired: 120000/120000 ms
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>>> solr-01    |    at
>>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>>> solr-01    |    ... 1 more
>>>>>> 
>>>>>> 
>>>>>> My expression, in case it helps. To summarize, it finds the document
>>>> ids
>>>>>> which exists on sourceCollection but not on target collection (DNM).
>>>> Joins
>>>>>> on itself to duplicate some fields (I couldn’t find another way to
>>>>>> duplicate the value of field into 2 fields). Then sends the result to
>>>>>> update. Source collection has about 300M documents, 24GB heap, 2
>>>> shards, 2
>>>>>> replicas of each shard.
>>>>>> 
>>>>>> update(
>>>>>>   DNM,
>>>>>>   batchSize=1000,
>>>>>>   parallel(
>>>>>>       WorkerCollection,
>>>>>>       leftOuterJoin(
>>>>>>           fetch(
>>>>>>               sourceCollection,
>>>>>>               complement(
>>>>>>                   search(
>>>>>>                       sourceCollection,
>>>>>>                       q="*:*",
>>>>>>                       qt="/export",
>>>>>>                       fq="...some filters...",
>>>>>>                       sort="id_str asc",
>>>>>>                       fl="id_str",
>>>>>>                       partitionKeys="id_str"
>>>>>>                   ),
>>>>>>                   search(
>>>>>>                       DNM,
>>>>>>                       q="*:*",
>>>>>>                       qt="/export",
>>>>>>                       sort="id_str asc",
>>>>>>                       fl="id_str",
>>>>>>                       partitionKeys="id_str"
>>>>>>                   ),
>>>>>>                   on="id_str"
>>>>>>               ),
>>>>>>               fl="...my many fields...",
>>>>>>               on="id_str",
>>>>>>               batchSize="1000"
>>>>>>           ),
>>>>>>           select(
>>>>>>               fetch(
>>>>>>                   sourceCollection,
>>>>>>                   complement(
>>>>>>                       search(
>>>>>>                           sourceCollection,
>>>>>>                           q="*:*",
>>>>>>                           qt="/export",
>>>>>>                           fq="...some other filters...",
>>>>>>                           sort="id_str asc",
>>>>>>                           fl="id_str",
>>>>>>                           partitionKeys="id_str"
>>>>>>                       ),
>>>>>>                       search(
>>>>>>                           DNM,
>>>>>>                           q="*:*",
>>>>>>                           qt="/export",
>>>>>>                           sort="id_str asc",
>>>>>>                           fl="id_str",
>>>>>>                           partitionKeys="id_str"
>>>>>>                       ),
>>>>>>                       on="id_str"
>>>>>>                   ),
>>>>>>                   fl="...some other fields...",
>>>>>>                   on="id_str",
>>>>>>                   batchSize="1000"
>>>>>>               ),
>>>>>>               id_str, ..some other fields as...
>>>>>>           ),
>>>>>>           on="id_str"
>>>>>>       ),
>>>>>>       workers="4", sort="id_str asc"
>>>>>>   )
>>>>>> )
>>>>>> 
>>>>>> Sent from Mail for Windows 10
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
> 


Re: Idle timeout expired and Early Client Disconnect errors

Posted by Susmit <sh...@gmail.com>.
better to use solr 8.9 and configure http timeouts from solr.in.sh
workaround is bigger - need to extend cloudsolrstream , register it and install custom solrclientcache with overridden setcontext method 

Sent from my iPhone

> On Mar 6, 2021, at 9:25 AM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
> 
> How? O_O
> 
> Sent from Mail for Windows 10
> 
> From: Susmit
> Sent: 06 March 2021 18:35
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> 
> i have used a workaround to increase the default (hard coded) timeout of 2 min in solrclientcache. 
> i can run 9+ hour long streaming queries with no issues.
> 
> Sent from my iPhone
> 
>> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
>> 
>> I divided the query to 1000 pieces and removed the parallel stream clause, it seems to be working without timeout so far, if it does I just can divide it to even smaller pieces I guess.
>> 
>> I tried to send all 1000 pieces in a “list” expression to be executed linearly, it didn’t work but I was just curious if it could handle such a large query 😃
>> 
>> Now I’m just generating expression strings from java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections timely, hitting the OS per-host connection limit. I asked a separate question about this. I was following the example on lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
>> 
>> I just modified my code to use regular REST calls using okhttp3, it’s a shame that I couldn’t use SolrJ since it truly streams every result 1 by 1 continuously. REST just returns a single large response at the very end of the stream.
>> 
>> Thanks again for your help.
>> 
>> Sent from Mail for Windows 10
>> 
>> From: Joel Bernstein
>> Sent: 02 March 2021 00:19
>> To: solr-user@lucene.apache.org
>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>> 
>> Also the parallel function builds hash partitioning filters that could lead
>> to timeouts if they take too long to build. Try the query without the
>> parallel function if you're still getting timeouts when making the query
>> smaller.
>> 
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>>>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com> wrote:
>>> 
>>> The settings in your version are 30 seconds and 15 seconds for socket and
>>> connection timeouts.
>>> 
>>> Typically timeouts occur because one or more shards in the query are idle
>>> beyond the timeout threshold. This happens because lot's of data is being
>>> read from other shards.
>>> 
>>> Breaking the query into small parts would be a good strategy.
>>> 
>>> 
>>> 
>>> 
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>> 
>>> 
>>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uy...@vivaldi.net.invalid>
>>> wrote:
>>> 
>>>> Hello Mr. Bernstein,
>>>> 
>>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>>>> timeouts and they are bound to happen in such a large stream. Should I just
>>>> reduce the output of my search expressions?
>>>> 
>>>> Maybe I can split my search results into ~100 parts and run the same
>>>> query 100 times in series. Each part would emit ~3M documents so they
>>>> should finish before timeout?
>>>> 
>>>> Is this a reasonable solution?
>>>> 
>>>> Btw how long is the default hard-coded timeout value? Because yesterday I
>>>> ran another query which took more than 1 hour without any timeouts and
>>>> finished successfully.
>>>> 
>>>> Sent from Mail for Windows 10
>>>> 
>>>> From: Joel Bernstein
>>>> Sent: 01 March 2021 23:03
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>>> 
>>>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>>> 
>>>> https://issues.apache.org/jira/browse/SOLR-14672
>>>> 
>>>> This unfortunately missed the 8.8 release and will be 8.9.
>>>> 
>>>> 
>>>> 
>>>> This i
>>>> 
>>>> 
>>>> 
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>> 
>>>> 
>>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com> wrote:
>>>> 
>>>>> What version are you using?
>>>>> 
>>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
>>>> used
>>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
>>>> back
>>>>> ported to Solr 8.x.
>>>>> 
>>>>> The errors are actually normal operational occurrences when doing joins
>>>> so
>>>>> should be suppressed in the logs and were before the specific release.
>>>>> 
>>>>> It might make sense to do a release that specifically suppresses these
>>>>> errors without backporting the full Solr 9.0 changes which impact the
>>>>> memory footprint of export.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>> 
>>>>> 
>>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
>>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Hello all,
>>>>>> 
>>>>>> I’m running a large streaming expression and feeding the result to
>>>> update
>>>>>> expression.
>>>>>> 
>>>>>> update(targetCollection, ...long running stream here...,
>>>>>> 
>>>>>> I tried sending the exact same query multiple times, it sometimes works
>>>>>> and indexes some results, then gives exception, other times fails with
>>>> an
>>>>>> exception after 2 minutes.
>>>>>> 
>>>>>> Response is like:
>>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>>>>>> stream expression
>>>>>> 
>>>>>> Server log (short):
>>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>>> 120000/120000
>>>>>> ms
>>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>>> 120000/120000
>>>>>> ms
>>>>>> 
>>>>>> I tried to increase the jetty idle timeout value on the node which
>>>> hosts
>>>>>> my target collection to something like an hour. It didn’t affect.
>>>>>> 
>>>>>> 
>>>>>> Server logs (long)
>>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>>>>>>                               20000/120000 ms
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>>> solr-01    |    at
>>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>>> solr-01    |    at
>>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>>> solr-01    |    at
>>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>>> solr-01    |    at java.base/java.io
>>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>>> solr-01    |    at
>>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>>> --
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .FillInterest.fillable(FillInterest.java:103)
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>>> solr-01    |    at
>>>>>> 
>>>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>>>>>> timeout expired: 120000/120000 ms
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>>> solr-01    |    at
>>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>>> solr-01    |    at
>>>>>> 
>>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>>> solr-01    |    ... 1 more
>>>>>> 
>>>>>> 
>>>>>> My expression, in case it helps. To summarize, it finds the document
>>>> ids
>>>>>> which exists on sourceCollection but not on target collection (DNM).
>>>> Joins
>>>>>> on itself to duplicate some fields (I couldn’t find another way to
>>>>>> duplicate the value of field into 2 fields). Then sends the result to
>>>>>> update. Source collection has about 300M documents, 24GB heap, 2
>>>> shards, 2
>>>>>> replicas of each shard.
>>>>>> 
>>>>>> update(
>>>>>>   DNM,
>>>>>>   batchSize=1000,
>>>>>>   parallel(
>>>>>>       WorkerCollection,
>>>>>>       leftOuterJoin(
>>>>>>           fetch(
>>>>>>               sourceCollection,
>>>>>>               complement(
>>>>>>                   search(
>>>>>>                       sourceCollection,
>>>>>>                       q="*:*",
>>>>>>                       qt="/export",
>>>>>>                       fq="...some filters...",
>>>>>>                       sort="id_str asc",
>>>>>>                       fl="id_str",
>>>>>>                       partitionKeys="id_str"
>>>>>>                   ),
>>>>>>                   search(
>>>>>>                       DNM,
>>>>>>                       q="*:*",
>>>>>>                       qt="/export",
>>>>>>                       sort="id_str asc",
>>>>>>                       fl="id_str",
>>>>>>                       partitionKeys="id_str"
>>>>>>                   ),
>>>>>>                   on="id_str"
>>>>>>               ),
>>>>>>               fl="...my many fields...",
>>>>>>               on="id_str",
>>>>>>               batchSize="1000"
>>>>>>           ),
>>>>>>           select(
>>>>>>               fetch(
>>>>>>                   sourceCollection,
>>>>>>                   complement(
>>>>>>                       search(
>>>>>>                           sourceCollection,
>>>>>>                           q="*:*",
>>>>>>                           qt="/export",
>>>>>>                           fq="...some other filters...",
>>>>>>                           sort="id_str asc",
>>>>>>                           fl="id_str",
>>>>>>                           partitionKeys="id_str"
>>>>>>                       ),
>>>>>>                       search(
>>>>>>                           DNM,
>>>>>>                           q="*:*",
>>>>>>                           qt="/export",
>>>>>>                           sort="id_str asc",
>>>>>>                           fl="id_str",
>>>>>>                           partitionKeys="id_str"
>>>>>>                       ),
>>>>>>                       on="id_str"
>>>>>>                   ),
>>>>>>                   fl="...some other fields...",
>>>>>>                   on="id_str",
>>>>>>                   batchSize="1000"
>>>>>>               ),
>>>>>>               id_str, ..some other fields as...
>>>>>>           ),
>>>>>>           on="id_str"
>>>>>>       ),
>>>>>>       workers="4", sort="id_str asc"
>>>>>>   )
>>>>>> )
>>>>>> 
>>>>>> Sent from Mail for Windows 10
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
> 

RE: Idle timeout expired and Early Client Disconnect errors

Posted by ufuk yılmaz <uy...@vivaldi.net.INVALID>.
Thanks!!  I’ll work on it.

Right now I’m playing around modifying bytecode of SolrClientCache directly. Don’t know if it will work though.

Sent from Mail for Windows 10

From: ufuk yılmaz
Sent: 06 March 2021 20:25
To: users@solr.apache.org
Subject: RE: Idle timeout expired and Early Client Disconnect errors

How? O_O

Sent from Mail for Windows 10

From: Susmit
Sent: 06 March 2021 18:35
To: solr-user@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

i have used a workaround to increase the default (hard coded) timeout of 2 min in solrclientcache. 
i can run 9+ hour long streaming queries with no issues.

Sent from my iPhone

> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
> 
> I divided the query to 1000 pieces and removed the parallel stream clause, it seems to be working without timeout so far, if it does I just can divide it to even smaller pieces I guess.
> 
> I tried to send all 1000 pieces in a “list” expression to be executed linearly, it didn’t work but I was just curious if it could handle such a large query 😃
> 
> Now I’m just generating expression strings from java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections timely, hitting the OS per-host connection limit. I asked a separate question about this. I was following the example on lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
> 
> I just modified my code to use regular REST calls using okhttp3, it’s a shame that I couldn’t use SolrJ since it truly streams every result 1 by 1 continuously. REST just returns a single large response at the very end of the stream.
> 
> Thanks again for your help.
> 
> Sent from Mail for Windows 10
> 
> From: Joel Bernstein
> Sent: 02 March 2021 00:19
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> 
> Also the parallel function builds hash partitioning filters that could lead
> to timeouts if they take too long to build. Try the query without the
> parallel function if you're still getting timeouts when making the query
> smaller.
> 
> 
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> 
>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com> wrote:
>> 
>> The settings in your version are 30 seconds and 15 seconds for socket and
>> connection timeouts.
>> 
>> Typically timeouts occur because one or more shards in the query are idle
>> beyond the timeout threshold. This happens because lot's of data is being
>> read from other shards.
>> 
>> Breaking the query into small parts would be a good strategy.
>> 
>> 
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uy...@vivaldi.net.invalid>
>> wrote:
>> 
>>> Hello Mr. Bernstein,
>>> 
>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>>> timeouts and they are bound to happen in such a large stream. Should I just
>>> reduce the output of my search expressions?
>>> 
>>> Maybe I can split my search results into ~100 parts and run the same
>>> query 100 times in series. Each part would emit ~3M documents so they
>>> should finish before timeout?
>>> 
>>> Is this a reasonable solution?
>>> 
>>> Btw how long is the default hard-coded timeout value? Because yesterday I
>>> ran another query which took more than 1 hour without any timeouts and
>>> finished successfully.
>>> 
>>> Sent from Mail for Windows 10
>>> 
>>> From: Joel Bernstein
>>> Sent: 01 March 2021 23:03
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>> 
>>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>> 
>>> https://issues.apache.org/jira/browse/SOLR-14672
>>> 
>>> This unfortunately missed the 8.8 release and will be 8.9.
>>> 
>>> 
>>> 
>>> This i
>>> 
>>> 
>>> 
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>> 
>>> 
>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com> wrote:
>>> 
>>>> What version are you using?
>>>> 
>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
>>> used
>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
>>> back
>>>> ported to Solr 8.x.
>>>> 
>>>> The errors are actually normal operational occurrences when doing joins
>>> so
>>>> should be suppressed in the logs and were before the specific release.
>>>> 
>>>> It might make sense to do a release that specifically suppresses these
>>>> errors without backporting the full Solr 9.0 changes which impact the
>>>> memory footprint of export.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>> 
>>>> 
>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
>>>> 
>>>> wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> I’m running a large streaming expression and feeding the result to
>>> update
>>>>> expression.
>>>>> 
>>>>> update(targetCollection, ...long running stream here...,
>>>>> 
>>>>> I tried sending the exact same query multiple times, it sometimes works
>>>>> and indexes some results, then gives exception, other times fails with
>>> an
>>>>> exception after 2 minutes.
>>>>> 
>>>>> Response is like:
>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>>>>> stream expression
>>>>> 
>>>>> Server log (short):
>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> 
>>>>> I tried to increase the jetty idle timeout value on the node which
>>> hosts
>>>>> my target collection to something like an hour. It didn’t affect.
>>>>> 
>>>>> 
>>>>> Server logs (long)
>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>>>>>                                20000/120000 ms
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>> solr-01    |    at java.base/java.io
>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>> solr-01    |    at
>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>> --
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .FillInterest.fillable(FillInterest.java:103)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>>>>> timeout expired: 120000/120000 ms
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>> solr-01    |    at
>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>> solr-01    |    ... 1 more
>>>>> 
>>>>> 
>>>>> My expression, in case it helps. To summarize, it finds the document
>>> ids
>>>>> which exists on sourceCollection but not on target collection (DNM).
>>> Joins
>>>>> on itself to duplicate some fields (I couldn’t find another way to
>>>>> duplicate the value of field into 2 fields). Then sends the result to
>>>>> update. Source collection has about 300M documents, 24GB heap, 2
>>> shards, 2
>>>>> replicas of each shard.
>>>>> 
>>>>> update(
>>>>>    DNM,
>>>>>    batchSize=1000,
>>>>>    parallel(
>>>>>        WorkerCollection,
>>>>>        leftOuterJoin(
>>>>>            fetch(
>>>>>                sourceCollection,
>>>>>                complement(
>>>>>                    search(
>>>>>                        sourceCollection,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        fq="...some filters...",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    search(
>>>>>                        DNM,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    on="id_str"
>>>>>                ),
>>>>>                fl="...my many fields...",
>>>>>                on="id_str",
>>>>>                batchSize="1000"
>>>>>            ),
>>>>>            select(
>>>>>                fetch(
>>>>>                    sourceCollection,
>>>>>                    complement(
>>>>>                        search(
>>>>>                            sourceCollection,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            fq="...some other filters...",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        search(
>>>>>                            DNM,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        on="id_str"
>>>>>                    ),
>>>>>                    fl="...some other fields...",
>>>>>                    on="id_str",
>>>>>                    batchSize="1000"
>>>>>                ),
>>>>>                id_str, ..some other fields as...
>>>>>            ),
>>>>>            on="id_str"
>>>>>        ),
>>>>>        workers="4", sort="id_str asc"
>>>>>    )
>>>>> )
>>>>> 
>>>>> Sent from Mail for Windows 10
>>>>> 
>>>>> 
>>> 
>>> 
> 



RE: Idle timeout expired and Early Client Disconnect errors

Posted by ufuk yılmaz <uy...@vivaldi.net.INVALID>.
How? O_O

Sent from Mail for Windows 10

From: Susmit
Sent: 06 March 2021 18:35
To: solr-user@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

i have used a workaround to increase the default (hard coded) timeout of 2 min in solrclientcache. 
i can run 9+ hour long streaming queries with no issues.

Sent from my iPhone

> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uy...@vivaldi.net.invalid> wrote:
> 
> I divided the query to 1000 pieces and removed the parallel stream clause, it seems to be working without timeout so far, if it does I just can divide it to even smaller pieces I guess.
> 
> I tried to send all 1000 pieces in a “list” expression to be executed linearly, it didn’t work but I was just curious if it could handle such a large query 😃
> 
> Now I’m just generating expression strings from java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections timely, hitting the OS per-host connection limit. I asked a separate question about this. I was following the example on lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
> 
> I just modified my code to use regular REST calls using okhttp3, it’s a shame that I couldn’t use SolrJ since it truly streams every result 1 by 1 continuously. REST just returns a single large response at the very end of the stream.
> 
> Thanks again for your help.
> 
> Sent from Mail for Windows 10
> 
> From: Joel Bernstein
> Sent: 02 March 2021 00:19
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
> 
> Also the parallel function builds hash partitioning filters that could lead
> to timeouts if they take too long to build. Try the query without the
> parallel function if you're still getting timeouts when making the query
> smaller.
> 
> 
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> 
>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <jo...@gmail.com> wrote:
>> 
>> The settings in your version are 30 seconds and 15 seconds for socket and
>> connection timeouts.
>> 
>> Typically timeouts occur because one or more shards in the query are idle
>> beyond the timeout threshold. This happens because lot's of data is being
>> read from other shards.
>> 
>> Breaking the query into small parts would be a good strategy.
>> 
>> 
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uy...@vivaldi.net.invalid>
>> wrote:
>> 
>>> Hello Mr. Bernstein,
>>> 
>>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>>> timeouts and they are bound to happen in such a large stream. Should I just
>>> reduce the output of my search expressions?
>>> 
>>> Maybe I can split my search results into ~100 parts and run the same
>>> query 100 times in series. Each part would emit ~3M documents so they
>>> should finish before timeout?
>>> 
>>> Is this a reasonable solution?
>>> 
>>> Btw how long is the default hard-coded timeout value? Because yesterday I
>>> ran another query which took more than 1 hour without any timeouts and
>>> finished successfully.
>>> 
>>> Sent from Mail for Windows 10
>>> 
>>> From: Joel Bernstein
>>> Sent: 01 March 2021 23:03
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>> 
>>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>> 
>>> https://issues.apache.org/jira/browse/SOLR-14672
>>> 
>>> This unfortunately missed the 8.8 release and will be 8.9.
>>> 
>>> 
>>> 
>>> This i
>>> 
>>> 
>>> 
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>> 
>>> 
>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <jo...@gmail.com> wrote:
>>> 
>>>> What version are you using?
>>>> 
>>>> Solr 8.7 has changes that caused these errors to hit the logs. These
>>> used
>>>> to be suppressed. This has been fixed in Solr 9.0 but it has not been
>>> back
>>>> ported to Solr 8.x.
>>>> 
>>>> The errors are actually normal operational occurrences when doing joins
>>> so
>>>> should be suppressed in the logs and were before the specific release.
>>>> 
>>>> It might make sense to do a release that specifically suppresses these
>>>> errors without backporting the full Solr 9.0 changes which impact the
>>>> memory footprint of export.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>> 
>>>> 
>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid
>>>> 
>>>> wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> I’m running a large streaming expression and feeding the result to
>>> update
>>>>> expression.
>>>>> 
>>>>> update(targetCollection, ...long running stream here...,
>>>>> 
>>>>> I tried sending the exact same query multiple times, it sometimes works
>>>>> and indexes some results, then gives exception, other times fails with
>>> an
>>>>> exception after 2 minutes.
>>>>> 
>>>>> Response is like:
>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>>>>> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>>>>> stream expression
>>>>> 
>>>>> Server log (short):
>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120000/120000
>>>>> ms
>>>>> 
>>>>> I tried to increase the jetty idle timeout value on the node which
>>> hosts
>>>>> my target collection to something like an hour. It didn’t affect.
>>>>> 
>>>>> 
>>>>> Server logs (long)
>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>>>>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>>>>>                                20000/120000 ms
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>> solr-01    |    at
>>>>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>> solr-01    |    at
>>>>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>> solr-01    |    at java.base/java.io
>>>>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>> solr-01    |    at
>>>>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>> solr-01    |    at
>>>>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>> solr-01    |    at
>>>>> 
>>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>> --
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .FillInterest.fillable(FillInterest.java:103)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>> solr-01    |    at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>>>>> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>>>>> timeout expired: 120000/120000 ms
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>> solr-01    |    at org.eclipse.jetty.io
>>>>> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>> solr-01    |    at
>>>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>> solr-01    |    at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>> solr-01    |    ... 1 more
>>>>> 
>>>>> 
>>>>> My expression, in case it helps. To summarize, it finds the document
>>> ids
>>>>> which exists on sourceCollection but not on target collection (DNM).
>>> Joins
>>>>> on itself to duplicate some fields (I couldn’t find another way to
>>>>> duplicate the value of field into 2 fields). Then sends the result to
>>>>> update. Source collection has about 300M documents, 24GB heap, 2
>>> shards, 2
>>>>> replicas of each shard.
>>>>> 
>>>>> update(
>>>>>    DNM,
>>>>>    batchSize=1000,
>>>>>    parallel(
>>>>>        WorkerCollection,
>>>>>        leftOuterJoin(
>>>>>            fetch(
>>>>>                sourceCollection,
>>>>>                complement(
>>>>>                    search(
>>>>>                        sourceCollection,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        fq="...some filters...",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    search(
>>>>>                        DNM,
>>>>>                        q="*:*",
>>>>>                        qt="/export",
>>>>>                        sort="id_str asc",
>>>>>                        fl="id_str",
>>>>>                        partitionKeys="id_str"
>>>>>                    ),
>>>>>                    on="id_str"
>>>>>                ),
>>>>>                fl="...my many fields...",
>>>>>                on="id_str",
>>>>>                batchSize="1000"
>>>>>            ),
>>>>>            select(
>>>>>                fetch(
>>>>>                    sourceCollection,
>>>>>                    complement(
>>>>>                        search(
>>>>>                            sourceCollection,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            fq="...some other filters...",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        search(
>>>>>                            DNM,
>>>>>                            q="*:*",
>>>>>                            qt="/export",
>>>>>                            sort="id_str asc",
>>>>>                            fl="id_str",
>>>>>                            partitionKeys="id_str"
>>>>>                        ),
>>>>>                        on="id_str"
>>>>>                    ),
>>>>>                    fl="...some other fields...",
>>>>>                    on="id_str",
>>>>>                    batchSize="1000"
>>>>>                ),
>>>>>                id_str, ..some other fields as...
>>>>>            ),
>>>>>            on="id_str"
>>>>>        ),
>>>>>        workers="4", sort="id_str asc"
>>>>>    )
>>>>> )
>>>>> 
>>>>> Sent from Mail for Windows 10
>>>>> 
>>>>> 
>>> 
>>> 
>