Posted to solr-user@lucene.apache.org by Yogendra Kumar Soni <yo...@dolcera.com> on 2018/06/04 06:36:45 UTC
Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
I have 130 million documents, and each document has a unique document id. I
want to build a suggester on the document id. The suggest dictionary build
fails at 130 million documents; while testing, it was successful with 50
million documents.
8 nodes with a 50 GB heap each and 600 GB of RAM in total.
Heap usage is around 10 GB - 12 GB per node.
The build takes around 50-60 min and then fails, and different shards fail on
different tries.
Solr version 7.3.0
OS: linux
runtime: Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 1.8.0_172 25.172-b11
suggester configuration in solrconfig.xml
<searchComponent name="suggestdn" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestdn</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">docid</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="exactMatchFirst">true</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
<requestHandler name="/suggestdn" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggestpn</str>
    <bool name="distrib">true</bool>
    <str name="shards.qt">/suggestdn</str>
  </lst>
  <arr name="components">
    <str>suggestdn</str>
  </arr>
</requestHandler>
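For reference, a one-shot build request against a single replica might look like
the following sketch; the host, port, and collection name are hypothetical
placeholders, and the suggest.dictionary value must match the suggester's
"name" attribute in the searchComponent:

```shell
# Hypothetical endpoint -- substitute your own host, port, and collection.
SOLR="http://localhost:8983/solr/mycollection"

# Build the suggester on one core only (distrib=false) to isolate a
# per-shard build from inter-node request timeouts.
BUILD_URL="${SOLR}/suggestdn?suggest=true&suggest.dictionary=suggestdn&suggest.build=true&distrib=false"
echo "$BUILD_URL"
# curl "$BUILD_URL"    # uncomment to actually issue the request
```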
I am getting the following stack trace:
HttpSolrCall
null:org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: Timeout occured
while waiting response from server at:
http://localhost:11180/solr/shard5_replica_n8
null:org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: Timeout occured
while waiting response from server at:
http://localhost:11180/solr/shard5_replica_n8
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:530)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.client.solrj.SolrServerException: Timeout
occured while waiting response from server at:
http://10.1.1.189:14080/solr/alexandria-standard_shard5_replica_n8
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:542)
... 12 more
*Thanks and Regards,*
*Yogendra Kumar Soni*
Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
Posted by Alessandro Benedetti <a....@sease.io>.
In addition to what Erick and Walter correctly mentioned :
"heap usage varies from 5 gb to 12 gb . Initially it was 5 gb then increased
to 12 gb gradually and decreasing to 5 gb again. (may be because of garbage
collection)
10-12 GB maximum heap uses, allocated is 50 GB. "
Did I read that right?
Is the 50 GB allocated to the physical/virtual machine where Solr is running,
or to the Solr JVM?
If the former, that is fine; the latter is considered bad practice unless you
really need all that heap for your Solr process (which is extremely
unlikely).
You need to leave memory for OS memory mapping (which is heavily used by
Solr).
With such a big heap, your GC may indeed end up in long pauses.
It is recommended to allocate as little as possible to the Solr process
(according to your requirements).
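As a sketch of that advice, the heap for the Solr JVM is set in solr.in.sh; the
16 GB figure below is purely illustrative and should be tuned from observed
usage:

```shell
# solr.in.sh -- keep the JVM heap well below physical RAM so the OS
# page cache can memory-map the index files.
SOLR_HEAP="16g"                      # illustrative size; tune from observed usage
# Equivalent explicit form:
# SOLR_JAVA_MEM="-Xms16g -Xmx16g"
echo "$SOLR_HEAP"
```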
Regards
-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
Posted by Walter Underwood <wu...@wunderwood.org>.
Yes, why are you doing this? A suggester is designed to have a smaller set of terms than the entire index.
I would never expect a 130 million term suggester to work. I’m astonished that it works with 50 million terms.
We typically have about 50 thousand terms in a suggester.
Also, you haven’t said which kind of suggester you have configured. Some of them are in memory.
wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 4, 2018, at 9:09 AM, Erick Erickson <er...@gmail.com> wrote:
>
> bq. I have 130 million documents and each document has unique document id. I
> want to build suggester on document id.
>
> Why do it this way? I'm supposing you want to have someone start
> typing in the doc ID
> then do autocomplete on it. For such a simple operation, it would be
> far easier and
> pretty certainly fast enough to just use the Terms component and specify
> terms.prefix. See:
> https://lucene.apache.org/solr/guide/6_6/the-terms-component.html
>
> This would not require any build step, would be as up-to-date as your
> last commit,
> would not consume the additional resources a suggester would work if
> you shard.....
>
> Best,
> Erick
>
> On Mon, Jun 4, 2018 at 4:23 AM, Yogendra Kumar Soni
> <yo...@dolcera.com> wrote:
>> I sent log of node to which i sent the request. need to check other nodes
>> log
>>>> In SolrCloud an investigation does not isolate to a single Solr log : you
>>>> see a timeout, i would recommend to check both the nodes involved.
>>
>>
>> monitored from admin UI, could not find any clue at the time of failure.
>>
>>>> Are you monitoring the memory utilisation for your Solr nodes ?
>>
>>
>>>> When you say : " heap usage is around 10 GB - 12 GB per node.", do you
>> refer
>> to the effective usage by the Solr JVM or the allocated heap ?
>>
>>
>> heap usage varies from 5 gb to 12 gb . Initially it was 5 gb then increased
>> to 12 gb gradually and decreasing to 5 gb again. (may be because of garbage
>> collection)
>> 10-12 GB maximum heap uses, allocated is 50 GB.
>>
>>>> Are Garbage Collection cycles behaving correctly ?
>>>> When a timeout occurs, something bad happened in the communication between
>>>> the Solr nodes.
>>
>> Need to analyze GC pause. Any suggestion how i can monitor resource usage
>> and GC pause effectively.
>>>> It could be network, but in your case it may be some Stop World situation
>>>> caused by GC.
>>
>> On Mon, Jun 4, 2018 at 3:27 PM, Alessandro Benedetti <a....@sease.io>
>> wrote:
>>
>>> Hi Yogendra,
>>> you mentioned you are using SolrCloud.
>>> In SolrCloud an investigation does not isolate to a single Solr log : you
>>> see a timeout, i would recommend to check both the nodes involved.
>>>
>>> When you say : " heap usage is around 10 GB - 12 GB per node.", do you
>>> refer
>>> to the effective usage by the Solr JVM or the allocated heap ?
>>> Are you monitoring the memory utilisation for your Solr nodes ?
>>> Are Garbage Collection cycles behaving correctly ?
>>> When a timeout occurs, something bad happened in the communication between
>>> the Solr nodes.
>>> It could be network, but in your case it may be some Stop World situation
>>> caused by GC.
>>>
>>>
>>>
>>>
>>> -----
>>> ---------------
>>> Alessandro Benedetti
>>> Search Consultant, R&D Software Engineer, Director
>>> Sease Ltd. - www.sease.io
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>>
>>
>>
>>
>> --
>> *Thanks and Regards,*
>> *Yogendra Kumar Soni*
Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
Posted by Erick Erickson <er...@gmail.com>.
bq. I have 130 million documents and each document has unique document id. I
want to build suggester on document id.
Why do it this way? I'm supposing you want to have someone start typing in
the doc ID and then do autocomplete on it. For such a simple operation, it
would be far easier, and pretty certainly fast enough, to just use the Terms
component and specify terms.prefix. See:
https://lucene.apache.org/solr/guide/6_6/the-terms-component.html
This would not require any build step, would be as up-to-date as your last
commit, and would not consume the additional resources a suggester would if
you shard.
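For instance, a Terms-component prefix lookup might be sketched like this; the
endpoint and the prefix value are hypothetical:

```shell
# Hypothetical endpoint -- substitute your own host, port, and collection.
SOLR="http://localhost:8983/solr/mycollection"

# Return up to 10 indexed docid terms starting with the typed prefix.
TERMS_URL="${SOLR}/terms?terms.fl=docid&terms.prefix=US2018&terms.limit=10"
echo "$TERMS_URL"
# curl "$TERMS_URL"    # uncomment to actually issue the request
```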
Best,
Erick
On Mon, Jun 4, 2018 at 4:23 AM, Yogendra Kumar Soni
<yo...@dolcera.com> wrote:
> I sent log of node to which i sent the request. need to check other nodes
> log
>>>In SolrCloud an investigation does not isolate to a single Solr log : you
>>>see a timeout, i would recommend to check both the nodes involved.
>
>
> monitored from admin UI, could not find any clue at the time of failure.
>
>>>Are you monitoring the memory utilisation for your Solr nodes ?
>
>
>>>When you say : " heap usage is around 10 GB - 12 GB per node.", do you
> refer
> to the effective usage by the Solr JVM or the allocated heap ?
>
>
> heap usage varies from 5 gb to 12 gb . Initially it was 5 gb then increased
> to 12 gb gradually and decreasing to 5 gb again. (may be because of garbage
> collection)
> 10-12 GB maximum heap uses, allocated is 50 GB.
>
>>>Are Garbage Collection cycles behaving correctly ?
>>>When a timeout occurs, something bad happened in the communication between
>>>the Solr nodes.
>
> Need to analyze GC pause. Any suggestion how i can monitor resource usage
> and GC pause effectively.
>>>It could be network, but in your case it may be some Stop World situation
>>>caused by GC.
>
> On Mon, Jun 4, 2018 at 3:27 PM, Alessandro Benedetti <a....@sease.io>
> wrote:
>
>> Hi Yogendra,
>> you mentioned you are using SolrCloud.
>> In SolrCloud an investigation does not isolate to a single Solr log : you
>> see a timeout, i would recommend to check both the nodes involved.
>>
>> When you say : " heap usage is around 10 GB - 12 GB per node.", do you
>> refer
>> to the effective usage by the Solr JVM or the allocated heap ?
>> Are you monitoring the memory utilisation for your Solr nodes ?
>> Are Garbage Collection cycles behaving correctly ?
>> When a timeout occurs, something bad happened in the communication between
>> the Solr nodes.
>> It could be network, but in your case it may be some Stop World situation
>> caused by GC.
>>
>>
>>
>>
>> -----
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>
>
>
> --
> *Thanks and Regards,*
> *Yogendra Kumar Soni*
Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
Posted by Yogendra Kumar Soni <yo...@dolcera.com>.
I sent the log of the node to which I sent the request; I need to check the
other nodes' logs.
>>In SolrCloud an investigation does not isolate to a single Solr log : you
>>see a timeout, i would recommend to check both the nodes involved.
I monitored from the admin UI and could not find any clue at the time of failure.
>>Are you monitoring the memory utilisation for your Solr nodes ?
>>When you say : " heap usage is around 10 GB - 12 GB per node.", do you
refer
to the effective usage by the Solr JVM or the allocated heap ?
Heap usage varies from 5 GB to 12 GB. Initially it was 5 GB, then it
increased gradually to 12 GB and decreased to 5 GB again (maybe because of
garbage collection).
Maximum heap usage is 10-12 GB; allocated is 50 GB.
>>Are Garbage Collection cycles behaving correctly ?
>>When a timeout occurs, something bad happened in the communication between
>>the Solr nodes.
I need to analyze GC pauses. Any suggestions on how I can monitor resource
usage and GC pauses effectively?
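One common approach, sketched here for the Java 8 JVM noted above, is to turn
on GC logging in solr.in.sh and inspect pause times in the resulting log; the
log path below is illustrative:

```shell
# solr.in.sh -- Java 8 style GC logging; records every pause, including
# stop-the-world phases, with timestamps.
GC_LOG_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
 -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime \
 -Xloggc:/var/solr/logs/solr_gc.log"
echo "$GC_LOG_OPTS"
```

The resulting log can then be loaded into a pause-analysis tool such as
GCViewer to chart pause durations over the course of the suggester build.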
>>It could be network, but in your case it may be some Stop World situation
>>caused by GC.
On Mon, Jun 4, 2018 at 3:27 PM, Alessandro Benedetti <a....@sease.io>
wrote:
> Hi Yogendra,
> you mentioned you are using SolrCloud.
> In SolrCloud an investigation does not isolate to a single Solr log : you
> see a timeout, i would recommend to check both the nodes involved.
>
> When you say : " heap usage is around 10 GB - 12 GB per node.", do you
> refer
> to the effective usage by the Solr JVM or the allocated heap ?
> Are you monitoring the memory utilisation for your Solr nodes ?
> Are Garbage Collection cycles behaving correctly ?
> When a timeout occurs, something bad happened in the communication between
> the Solr nodes.
> It could be network, but in your case it may be some Stop World situation
> caused by GC.
>
>
>
>
> -----
> ---------------
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
--
*Thanks and Regards,*
*Yogendra Kumar Soni*
Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows
Posted by Alessandro Benedetti <a....@sease.io>.
Hi Yogendra,
you mentioned you are using SolrCloud.
In SolrCloud an investigation is not isolated to a single Solr log: you see a
timeout, so I would recommend checking both of the nodes involved.
When you say "heap usage is around 10 GB - 12 GB per node", do you refer to
the effective usage by the Solr JVM or to the allocated heap?
Are you monitoring the memory utilisation of your Solr nodes?
Are garbage collection cycles behaving correctly?
When a timeout occurs, something bad happened in the communication between
the Solr nodes.
It could be the network, but in your case it may be some stop-the-world
situation caused by GC.
-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io