Posted to solr-user@lucene.apache.org by Joe Obernberger <jo...@gmail.com> on 2017/07/17 12:36:34 UTC

Solr 6.6.0 - Indexing errors

We've been indexing data on a 45 node cluster with 100 shards and 3 
replicas, but our indexing processes have been stopping due to errors.  
On the server side the error is "Error logging add". Stack trace:

2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS 
s:shard58 r:core_node290 x:UNCLASS_shard58_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard58_replica1] 
webapp=/solr path=/update 
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[COLLECT20003218348784 
(1573172872544780288), COLLECT20003218351447 (1573172872620277760), 
COLLECT20003218353085 (1573172872625520640), COLLECT20003218357937 
(1573172872627617792), COLLECT20003218361860 (1573172872629714944), 
COLLECT20003218362535 (1573172872631812096)]} 0 171
2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS 
s:shard13 r:core_node81 x:UNCLASS_shard13_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard13_replica1] 
webapp=/solr path=/update 
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[COLLECT20003218344436 
(1573172872538488832), COLLECT20003218347497 (1573172872620277760), 
COLLECT20003218351645 (1573172872625520640), COLLECT20003218356965 
(1573172872629714944), COLLECT20003218357775 (1573172872632860672), 
COLLECT20003218358017 (1573172872646492160), COLLECT20003218358152 
(1573172872650686464), COLLECT20003218359395 (1573172872651735040), 
COLLECT20003218362571 (1573172872652783616)]} 0 274
2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS 
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard43_replica1] 
webapp=/solr path=/update 
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 
0 0
2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS 
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error 
logging add
         at 
org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
         at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
         at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
         at 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
         at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
         at 
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
         at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
         at 
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
         at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
         at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
         at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
         at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
         at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
         at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
         at 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
         at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
         at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
         at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
         at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
         at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
         at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
         at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
         at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
         at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
         at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
         at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
         at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
         at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
         at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
         at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
         at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at org.eclipse.jetty.server.Server.handle(Server.java:534)
         at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
         at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
         at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
         at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
         at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
         at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
         at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 
could only be replicated to 0 nodes instead of minReplication (=1).  
There are 40 datanode(s) running and no node(s) are excluded in this 
operation.
         at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
         at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
         at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:422)
         at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS 
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error 
logging add
         at 
org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
         at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
         at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
         at 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
         at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
         at 
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
         at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
         at 
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
         at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
         at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
         at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
         at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
         at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
         at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
         at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
         at 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
         at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
         at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
         at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
         at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
         at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
         at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
         at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
         at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
         at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
         at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
         at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
         at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
         at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
         at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
         at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
         at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at org.eclipse.jetty.server.Server.handle(Server.java:534)
         at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
         at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
         at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
         at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
         at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
         at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
         at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
         at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 
could only be replicated to 0 nodes instead of minReplication (=1).  
There are 40 datanode(s) running and no node(s) are excluded in this 
operation.
         at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
         at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
         at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:422)
         at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
         at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

2017-07-17 12:29:24.187 INFO 
(zkCallback-5-thread-144-processing-n:juliet:9100_solr) [   ] 
o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent 
state:SyncConnected type:NodeDataChanged 
path:/collections/UNCLASS/state.json] for collection [UNCLASS] has 
occurred - updating... (live nodes size: [45])

On the client side, the error looks like:
2017-07-16 19:03:16,118 WARN 
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Indexing error: 
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error 
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception 
writing document id COLLECT10086453202 to the index; possible analysis 
error. for collection: UNCLASS
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error 
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception 
writing document id COLLECT10086453202 to the index; possible analysis 
error.
         at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
         at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
         at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
         at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
         at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
         at 
com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
         at 
com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
         at 
com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
         at 
com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
         at 
com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
         at 
org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
         at 
org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
         at 
org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
         at 
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
         at 
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
         at 
org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
         at 
org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
         at 
org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
         at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
Exception writing document id COLLECT10086453202 to the index; possible 
analysis error.
         at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
         at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
         at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
         at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
         at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
         at 
org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
         ... 3 more
2017-07-16 19:03:16,134 ERROR 
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Error indexing: 
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error 
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception 
writing document id COLLECT10086453202 to the index; possible analysis 
error. for collection: UNCLASS.
2017-07-16 19:03:16,135 ERROR 
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Exception during 
indexing: 
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error 
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception 
writing document id COLLECT10086453202 to the index; possible analysis 
error.

I can fire them back up, but they only run for a short time before 
getting more indexing errors.  Several of the nodes show as down in the 
cloud view.  Any help would be appreciated!  Thank you!


-Joe
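
The server-side entries above carry the collection, shard, replica and core
in a bracketed context (for example [c:UNCLASS s:shard43 r:core_node108
x:UNCLASS_shard43_replica1]), so ERROR entries can be tallied per shard when
going through a Solr log. The following is a minimal Java sketch of that
idea. It assumes each log entry starts on a single unwrapped line in the
layout shown in this post; the class and method names are hypothetical and
not part of Solr.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts ERROR entries per shard in Solr log lines. */
public class ShardErrorTally {
    // Date, time, level, thread in parentheses, then "[c:... s:... r:... x:...]".
    private static final Pattern ENTRY = Pattern.compile(
        "^\\S+ \\S+ (\\w+)\\s+\\(.*?\\) \\[c:(\\S+) s:(\\S+) r:(\\S+) x:(\\S+)\\]");

    /** Returns shard name -> number of ERROR entries, sorted by shard name. */
    public static Map<String, Integer> errorsByShard(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            // Lines without the bracketed context (stack frames, zkCallback
            // entries with an empty "[   ]") simply do not match and are skipped.
            if (m.find() && m.group(1).equals("ERROR")) {
                counts.merge(m.group(3), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two entries taken from this post, joined back onto single lines.
        List<String> sample = Arrays.asList(
            "2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS s:shard58 "
                + "r:core_node290 x:UNCLASS_shard58_replica1] o.a.s.u.p.LogUpdateProcessorFactory",
            "2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43 "
                + "r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase "
                + "org.apache.solr.common.SolrException: Error logging add");
        System.out.println(errorsByShard(sample));
    }
}
```

In practice the list of lines would come from reading a Solr log file; the
"Caused by:" lines that follow a matching ERROR entry are where the
underlying cause (here the HDFS RemoteException) shows up.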


Re: Solr 6.6.0 - Indexing errors

Posted by Joe Obernberger <jo...@gmail.com>.
Erick - thank you.  I meant to disable field guessing as our indexer 
does this internally.  Thanks for seeing that!  Yes, we've seen things 
come in like IDs that are 12345 (int), but then the next ID is 12AF456 
(string).

There is also a version mismatch between the Hadoop version in our 
Cloudera 5.10.2 install and the version shipped with Solr 6.6.0; 
correcting that.
Thanks again!

-Joe
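
The failure mode discussed in this thread (an ID of 12345 followed by
12AF456, or "1" followed by "1.0") can be illustrated with a toy model in
which the first value seen for a field fixes its type. This is only a
sketch under that assumption: it is not Solr's implementation, and the
class, method and type names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of field guessing: the first value seen locks the field type. */
public class FieldGuessSketch {
    // Field name -> type locked in by the first document that used the field.
    private final Map<String, String> schema = new HashMap<>();

    /** Guesses a type from one raw value, trying the narrowest type first. */
    public static String guessType(String value) {
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException notLong) {
            // Not an integer; try a floating-point parse next.
        }
        try {
            Double.parseDouble(value);
            return "double";
        } catch (NumberFormatException notDouble) {
            return "string";
        }
    }

    /** True if a value of valueType can be stored in a field of fieldType. */
    public static boolean fits(String fieldType, String valueType) {
        switch (fieldType) {
            case "string": return true;                        // anything fits a string
            case "double": return !valueType.equals("string"); // longs widen to double
            default:       return valueType.equals("long");    // long accepts only long
        }
    }

    /** Registers the type on first sight; afterwards checks the value against it. */
    public boolean accepts(String field, String value) {
        String guessed = guessType(value);
        String locked = schema.putIfAbsent(field, guessed);
        return locked == null || fits(locked, guessed);
    }

    public static void main(String[] args) {
        FieldGuessSketch sketch = new FieldGuessSketch();
        System.out.println(sketch.accepts("id", "12345"));   // first value locks "id" to long
        System.out.println(sketch.accepts("id", "12AF456")); // later value no longer fits
    }
}
```

With an explicit schema that declares the ID field as a string up front,
both values would be accepted, which is the motivation for turning field
guessing off when the indexer already knows its field types.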


On 7/17/2017 11:53 AM, Erick Erickson wrote:
> Joe:
>
> I agree that 46 million docs later you'd expect things to have settled
> out. However, I do note that you have
> "add-unknown-fields-to-the-schema" in your error stack which means
> you're using "field guessing", sometimes called data_driven. I would
> recommend you do _not_ use this for production: while it does the
> best job it can, it has to make assumptions about what the data looks
> like based on the first document it sees, and those assumptions may
> later be violated.
> Getting "possible analysis error" is one of the messages that happens
> when this occurs.
>
> The simple example is that if the first time data_driven sees "1"
> it'll guess integer. If sometime later there's a doc with "1.0" it'll
> generate a parse error.
>
> I totally agree that 46 million docs later you'd expect all of this
> kind of thing to have flushed out, but the "possible analysis error"
> seems to be pointing that direction. If this is, indeed, the problem
> you'll see better evidence on the Solr instance that's actually having
> the problem. Unfortunately you'll just have to look at one Solr log from
> each shard to see whether this is an issue.
>
> Best,
> Erick
>
> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
> <jo...@gmail.com> wrote:
>> So far we've indexed about 46 million documents, but over the weekend, these
>> errors started coming up.  I would expect that if there was a basic issue,
>> it would have started right away?  We ran a test cluster with just a few
>> shards/replicas prior and didn't see any issues using the same indexing
>> code, but we're running a lot more indexers simultaneously with the larger
>> cluster; perhaps we're just overloading HDFS?  The same nodes that run Solr
>> also run HDFS datanodes, but they are pretty beefy machines; we're not
>> swapping.
>>
>> As Shawn pointed out, I will be checking the HDFS version (we're using
>> Cloudera CDH 5.10.2), and the HDFS logs.
>>
>> -Joe
>>
>>
>>
>> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>> There is also an analysis error.  I would suggest testing the indexer on
>>> a single-shard setup first, then with a replica (1 shard and 1 replica),
>>> and then with 2 shards and 2 replicas.  This would confirm whether there
>>> is a basic issue with the indexing / cluster setup.
>>>
>>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>>> joseph.obernberger@gmail.com> wrote:
>>>
>>>> Some more info:
>>>>
>>>> When I stop all the indexers, in about 5-10 minutes the cluster goes all
>>>> green.  When I start just one indexer, several nodes immediately go down
>>>> with the 'Error logging add' message.
>>>>
>>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>>> indexing.  Is this correct for SolrCloud?
>>>>
>>>> Thank you!
>>>>
>>>> -Joe
>>>>
>>>>
>>>>
>>>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>>>
>>>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>>>> replicas, but our indexing processes have been stopping due to errors.
>>>>> On
>>>>> the server side the error is "Error logging add". Stack trace:
>>>>>
>>>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
>>>>> s:shard58
>>>>> r:core_node290 x:UNCLASS_shard58_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
>>>>> 171
>>>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
>>>>> s:shard13
>>>>> r:core_node81 x:UNCLASS_shard13_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
>>>>> 274
>>>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>           at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>>> java:418)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>           at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>>> nLoader.java:98)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:306)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:271)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>>> dec.java:173)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>>> s(JavabinLoader.java:108)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>>> der.java:55)
>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>>> questHandler.java:97)
>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>           at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>>> uestHandlerBase.java:173)
>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>>> java:723)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>>> 529)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:361)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:305)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>>> r(ServletHandler.java:1691)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>>> dler.java:582)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:143)
>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>>> ndler.java:548)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>> SessionHandler.java:226)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>> ContextHandler.java:1180)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>>> ler.java:512)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>> SessionHandler.java:185)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>> ContextHandler.java:1112)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:141)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>>> HandlerCollection.java:119)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>>> iteHandler.java:335)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>> java:320)
>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>>> ction.java:251)
>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>> succeeded(AbstractConnection.java:273)
>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>> java:95)
>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>>> elEndPoint.java:93)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>>> ThreadPool.java:671)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>>> hreadPool.java:589)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>> There
>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>> operation.
>>>>>           at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>>> Block(NameNodeRpcServer.java:683)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>> tProtocol.java:214)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>> TranslatorPB.java:495)
>>>>>           at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>> enodeProtocolProtos.java)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>>           at java.security.AccessController.doPrivileged(Native Method)
>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>>> upInformation.java:1920)
>>>>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>>
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>> ProtobufRpcEngine.java:229)
>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>>> thodAccessorImpl.java:43)
>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>>> od(RetryInvocationHandler.java:191)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>>> ryInvocationHandler.java:102)
>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>> DFSOutputStream.java:449)
>>>>>
>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>           at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>>> java:418)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>           at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>>> nLoader.java:98)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:306)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:271)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>>> dec.java:173)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>>> s(JavabinLoader.java:108)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>>> der.java:55)
>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>>> questHandler.java:97)
>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>           at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>>> uestHandlerBase.java:173)
>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>>> java:723)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>>> 529)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:361)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:305)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>>> r(ServletHandler.java:1691)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>>> dler.java:582)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:143)
>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>>> ndler.java:548)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>> SessionHandler.java:226)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>> ContextHandler.java:1180)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>>> ler.java:512)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>> SessionHandler.java:185)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>> ContextHandler.java:1112)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:141)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>>> HandlerCollection.java:119)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>>> iteHandler.java:335)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>> java:320)
>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>>> ction.java:251)
>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>> succeeded(AbstractConnection.java:273)
>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>> java:95)
>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>>> elEndPoint.java:93)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>>> ThreadPool.java:671)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>>> hreadPool.java:589)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>> There
>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>> operation.
>>>>>           at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>>> Block(NameNodeRpcServer.java:683)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>> tProtocol.java:214)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>> TranslatorPB.java:495)
>>>>>           at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>> enodeProtocolProtos.java)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>>           at java.security.AccessController.doPrivileged(Native Method)
>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>>> upInformation.java:1920)
>>>>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>>
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>> ProtobufRpcEngine.java:229)
>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>>> thodAccessorImpl.java:43)
>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>>> od(RetryInvocationHandler.java:191)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>>> ryInvocationHandler.java:102)
>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>> DFSOutputStream.java:449)
>>>>>
>>>>> 2017-07-17 12:29:24.187 INFO
>>>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>>>> state:SyncConnected type:NodeDataChanged
>>>>> path:/collections/UNCLASS/state.json]
>>>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
>>>>> [45])
>>>>>
>>>>> On the client side, the error looks like:
>>>>> 2017-07-16 19:03:16,118 WARN
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Indexing error: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>> for
>>>>> collection: UNCLASS
>>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>>> error.
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>>>>> te(CloudSolrClient.java:819)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>>>>> t(CloudSolrClient.java:1263)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>>>>> oudSolrClient.java:1073)
>>>>>           at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>>>>> .java:160)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 106)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 71)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 85)
>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>>>>> IndexDocument.java:959)
>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>>>> IndexDocument.java:236)
>>>>>           at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>>>>> olrIndexerProcessor.java:63)
>>>>>           at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>>>> Processor.java:140)
>>>>>           at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>>>>> process(IntelEntQueueProc.java:208)
>>>>>           at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>>>>> egateSyncProcessor.java:63)
>>>>>           at org.apache.camel.management.InstrumentationProcessor.process
>>>>> (InstrumentationProcessor.java:77)
>>>>>           at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>>>>> deliveryErrorHandler.java:460)
>>>>>           at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>>> melInternalProcessor.java:190)
>>>>>           at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>>> melInternalProcessor.java:190)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>>>>> (SedaConsumer.java:298)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>>>>> mer.java:207)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>>>>> r.java:154)
>>>>>           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>>>> Executor.java:1142)
>>>>>           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>>>> lExecutor.java:617)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by:
>>>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>>>> Exception writing document id COLLECT10086453202 to the index; possible
>>>>> analysis error.
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>>>>> od(HttpSolrClient.java:610)
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>>> pSolrClient.java:279)
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>>> pSolrClient.java:268)
>>>>>           at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>>>>> (LBHttpSolrClient.java:447)
>>>>>           at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>>>>> BHttpSolrClient.java:388)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>>>> ectUpdate$0(CloudSolrClient.java:796)
>>>>>           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>           at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>>>           ... 3 more
>>>>> 2017-07-16 19:03:16,134 ERROR
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Error indexing: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>> for
>>>>> collection: UNCLASS.
>>>>> 2017-07-16 19:03:16,135 ERROR
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Exception during indexing: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>>
>>>>> I can fire them back up, but they only run for a short time before
>>>>> getting more indexing errors.  Several of the nodes show as down in the
>>>>> cloud view.  Any help would be appreciated!  Thank you!
>>>>>
>>>>>
>>>>> -Joe
>>>>>
>>>>>


Re: Solr 6.6.0 - Indexing errors

Posted by Joe Obernberger <jo...@gmail.com>.
Thank you Shawn.  We will be adjusting solr.solr.home to point somewhere
else so that our puppet module will work.  We actually didn't lose any
data since the indexes are in HDFS.  Our configuration for our largest
collection is 100 shards with 3 replicas each on top of HDFS with 3x
replication.  Perhaps overkill.  It's just the core properties files
that we lost.  I ended up writing a program that uses the
CloudSolrClient to get all the info from zookeeper and then rebuild the
core properties files.  Looks like it is working.  For example, for a
collection called COL1 with config called COL1:

         // mainServer is the CloudSolrClient; setContents(File, String) is a
         // helper (not shown here) that writes the string to the file.
         File output;
         Iterator<Slice> iSlice = mainServer.getZkStateReader().getClusterState()
                 .getCollection("COL1").getActiveSlices().iterator();
         while (iSlice.hasNext()) {
             Slice s = iSlice.next();
             Iterator<Replica> replicaIt = s.getReplicas().iterator();
             while (replicaIt.hasNext()) {
                 Replica r = replicaIt.next();
                 System.out.println("Name: "+r.getCoreName());
                 System.out.println("CoreNodeName: "+r.getName());
                 System.out.println("Node name: "+r.getNodeName());
                 System.out.println("Shard: "+s.getName());

                 // One directory per node and core, holding that core's
                 // core.properties file.
                 output = new File(r.getNodeName()+"/"+r.getCoreName());
                 output.mkdirs();
                 output = new File(r.getNodeName()+"/"+r.getCoreName()+"/"+"core.properties");
                 StringBuilder buff = new StringBuilder();
                 buff.append("collection.configName=COL1\n");
                 buff.append("name=").append(r.getCoreName());
                 buff.append("\nshard=").append(s.getName());
                 buff.append("\ncollection=COL1");
                 buff.append("\ncoreNodeName=").append(r.getName());
                 try {
                     setContents(output, buff.toString());
                 } catch (IOException ex) {
                     System.out.println("Error writing: "+ex);
                 }
             }
         }


Then I copied the files to the 45 servers and restarted solr 6.6.0 on 
each.  It came back up OK, and it has been indexing all night long.
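
For anyone rebuilding these files by hand, here is a minimal,
self-contained sketch of just the file contents, using only the JDK.
The class and method names (CorePropertiesSketch, render) are made up
for illustration and are not part of Solr or SolrJ; the five keys are
the ones written by the program above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class CorePropertiesSketch {

    /** Renders the five core.properties entries, one key=value pair per line. */
    static String render(String configName, String collection, String shard,
                         String coreName, String coreNodeName) {
        // LinkedHashMap keeps the keys in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("collection.configName", configName);
        props.put("name", coreName);
        props.put("shard", shard);
        props.put("collection", collection);
        props.put("coreNodeName", coreNodeName);

        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

Calling render("COL1", "COL1", "shard1", "COL1_shard1_replica1",
"core_node1") (example values, not taken from our cluster) gives the
text to write into <node>/<core>/core.properties.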

-Joe

On 7/17/2017 3:15 PM, Erick Erickson wrote


On 7/18/2017 12:31 PM, Shawn Heisey wrote:
> On 7/17/2017 11:39 AM, Joe Obernberger wrote:
>> We use puppet to deploy the solr instance to all the nodes.  I 
>> changed what was deployed to use the CDH jars, but our puppet module 
>> deletes the old directory and replaces it.  So, all the core 
>> configuration files under server/solr/ were removed. Zookeeper still 
>> has the configuration, but the nodes won't come up.
>>
>> Is there a way around this?  Re-creating these files manually isn't 
>> realistic; do I need to re-index?
>
> Put the solr home elsewhere so it's not under the program directory 
> and doesn't get deleted when you re-deploy Solr.  When starting Solr 
> manually with bin/solr, this is done with the -s option.
>
> If you install Solr as a service, which works on operating systems 
> with a strong GNU presence (such as Linux), then the solr home will 
> typically not be in the program directory.  The configuration script 
> (default filename is /etc/default/solr.in.sh) should not get deleted 
> if Solr is reinstalled, but I have not confirmed that this is the 
> case.  The service installer script is included in the Solr download.
>
> With SolrCloud, deleting all the core data like that will NOT be 
> automatically fixed by restarting Solr.  SolrCloud will have lost part 
> of its data.  If you have enough replicas left after a loss like that 
> to remain fully operational, then you'll need to use the DELETEREPLICA 
> and ADDREPLICA actions on the Collections API to rebuild the data on 
> that server from the leader of each shard.
>
> If the collection is incomplete after the solr home on a server gets 
> deleted, you'll probably need to completely delete the collection, 
> then recreate it, and reindex.  And you'll need to look into adding 
> servers/replicas so the loss of a single server cannot take you offline.
>
> Thanks,
> Shawn
>
>


Re: Solr 6.6.0 - Indexing errors

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/17/2017 11:39 AM, Joe Obernberger wrote:
> We use puppet to deploy the solr instance to all the nodes.  I changed 
> what was deployed to use the CDH jars, but our puppet module deletes 
> the old directory and replaces it.  So, all the core configuration 
> files under server/solr/ were removed. Zookeeper still has the 
> configuration, but the nodes won't come up.
>
> Is there a way around this?  Re-creating these files manually isn't 
> realistic; do I need to re-index?

Put the solr home elsewhere so it's not under the program directory and 
doesn't get deleted when you re-deploy Solr.  When starting Solr 
manually with bin/solr, this is done with the -s option.

If you install Solr as a service, which works on operating systems with 
a strong GNU presence (such as Linux), then the solr home will typically 
not be in the program directory.  The configuration script (default 
filename is /etc/default/solr.in.sh) should not get deleted if Solr is 
reinstalled, but I have not confirmed that this is the case.  The 
service installer script is included in the Solr download.

With SolrCloud, deleting all the core data like that will NOT be 
automatically fixed by restarting Solr.  SolrCloud will have lost part 
of its data.  If you have enough replicas left after a loss like that to 
remain fully operational, then you'll need to use the DELETEREPLICA and 
ADDREPLICA actions on the Collections API to rebuild the data on that 
server from the leader of each shard.
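
As a rough sketch of those two Collections API calls (an illustration
under stated assumptions, not a tested recovery procedure), the request
URLs can be built with only the JDK.  The class and method names are
hypothetical; the parameters (action, collection, shard, replica, node)
are the Collections API ones, and the host, shard and node values in
the note below are only examples borrowed from the logs in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class ReplicaRebuildUrls {

    // URL-encodes a single query parameter value.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** URL that removes one replica, identified by its core node name. */
    static String deleteReplica(String baseUrl, String collection,
                                String shard, String coreNodeName) {
        return baseUrl + "/admin/collections?action=DELETEREPLICA"
                + "&collection=" + enc(collection)
                + "&shard=" + enc(shard)
                + "&replica=" + enc(coreNodeName);
    }

    /** URL that adds a replica of the given shard on the given node. */
    static String addReplica(String baseUrl, String collection,
                             String shard, String nodeName) {
        return baseUrl + "/admin/collections?action=ADDREPLICA"
                + "&collection=" + enc(collection)
                + "&shard=" + enc(shard)
                + "&node=" + enc(nodeName);
    }
}
```

For example, deleteReplica("http://leda:9100/solr", "UNCLASS",
"shard43", "core_node108") followed by addReplica("http://leda:9100/solr",
"UNCLASS", "shard43", "juliet:9100_solr") would cover one replica; the
same pair has to be repeated for every replica that lived on the
affected server.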

If the collection is incomplete after the solr home on a server gets 
deleted, you'll probably need to completely delete the collection, then 
recreate it, and reindex.  And you'll need to look into adding 
servers/replicas so the loss of a single server cannot take you offline.

Thanks,
Shawn


Re: Solr 6.6.0 - Indexing errors

Posted by Joe Obernberger <jo...@gmail.com>.
We use puppet to deploy the solr instance to all the nodes.  I changed 
what was deployed to use the CDH jars, but our puppet module deletes the 
old directory and replaces it.  So, all the core configuration files 
under server/solr/ were removed. Zookeeper still has the configuration, 
but the nodes won't come up.

Is there a way around this?  Re-creating these files manually isn't 
realistic; do I need to re-index?

-Joe


On 7/17/2017 12:07 PM, Susheel Kumar wrote:
> And there is a document id mentioned above where it failed with the
> analysis error.  You can look at how those documents differ, as Erick
> suggested.
>
> On Mon, Jul 17, 2017 at 11:53 AM, Erick Erickson <er...@gmail.com>
> wrote:
>
>> Joe:
>>
>> I agree that 46 million docs later you'd expect things to have settled
>> out. However, I do note that you have
>> "add-unknown-fields-to-the-schema" in your error stack which means
>> you're using "field guessing", sometimes called data_driven. I would
>> recommend you do _not_ use this for production: while it does the
>> best job it can, it has to make assumptions about what the data looks
>> like based on the first document it sees, which may later be violated.
>> Getting "possible analysis error" is one of the messages that happens
>> when this occurs.
>>
>> The simple example is that if the first time data_driven sees "1"
>> it'll guess integer. If sometime later there's a doc with "1.0" it'll
>> generate a parse error.
>>
>> I totally agree that 46 million docs later you'd expect all of this
>> kind of thing to have flushed out, but the "possible analysis error"
>> seems to be pointing that direction. If this is, indeed, the problem
>> you'll see better evidence on the Solr instance that's actually having
>> the problem. Unfortunately you'll just have to look at one Solr log
>> from each shard to see whether this is an issue.
>>
>> Best,
>> Erick
>>
>> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
>> <jo...@gmail.com> wrote:
>>> So far we've indexed about 46 million documents, but over the weekend,
>>> these errors started coming up.  I would expect that if there was a
>>> basic issue, it would have started right away?  We ran a test cluster
>>> with just a few shards/replicas prior and didn't see any issues using
>>> the same indexing code, but we're running a lot more indexers
>>> simultaneously with the larger cluster; perhaps we're just overloading
>>> HDFS?  The same nodes that run Solr also run HDFS datanodes, but they
>>> are pretty beefy machines; we're not swapping.
>>>
>>> As Shawn pointed out, I will be checking the HDFS version (we're using
>>> Cloudera CDH 5.10.2), and the HDFS logs.
>>>
>>> -Joe
>>>
>>>
>>>
>>> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>>> There is some analysis error also.  I would suggest testing the
>>>> indexer on just a one-shard setup first, then testing with a replica
>>>> (1 shard and 1 replica), and then testing with 2 shards and 2
>>>> replicas.  This would confirm whether there is a basic issue with the
>>>> indexing / cluster setup.
>>>>
>>>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>>>> joseph.obernberger@gmail.com> wrote:
>>>>
>>>>> Some more info:
>>>>>
>>>>> When I stop all the indexers, in about 5-10 minutes the cluster goes
>>>>> all green.  When I start just one indexer, several nodes immediately
>>>>> go down with the 'Error logging add' message.
>>>>>
>>>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>>>> indexing.  Is this correct for SolrCloud?
>>>>>
>>>>> Thank you!
>>>>>
>>>>> -Joe
>>>>>
>>>>>
>>>>>
>>>>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>>>>
>>>>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>>>>> replicas, but our indexing processes have been stopping due to
>>>>>> errors.  On the server side the error is "Error logging add".
>>>>>> Stack trace:
>>>>>>
>>>>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
>>>>>> s:shard58
>>>>>> r:core_node290 x:UNCLASS_shard58_replica1]
>>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>>>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>>>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>>>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>>>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
>>>>>> 171
>>>>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
>>>>>> s:shard13
>>>>>> r:core_node81 x:UNCLASS_shard13_replica1]
>>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>>>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>>>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>>>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>>>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>>>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>>>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
>>>>>> 274
>>>>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
>>>>>> s:shard43
>>>>>> r:core_node108 x:UNCLASS_shard43_replica1]
>>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>>> s:shard43
>>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>>           at org.apache.solr.update.TransactionLog.write(
>> TransactionLog.
>>>>>> java:418)
>>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>>           at org.apache.solr.update.processor.
>> DistributedUpdateProcessor.
>>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>>           at org.apache.solr.update.processor.
>> DistributedUpdateProcessor.
>>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>>           at org.apache.solr.update.processor.
>> LogUpdateProcessorFactory$L
>>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(
>> Javabi
>>>>>> nLoader.java:98)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(
>> JavaBinC
>>>>>> odec.java:306)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(
>> JavaBinCode
>>>>>> c.java:251)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(
>> JavaBinC
>>>>>> odec.java:271)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(
>> JavaBinCode
>>>>>> c.java:251)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(
>> JavaBinCo
>>>>>> dec.java:173)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader.
>> parseAndLoadDoc
>>>>>> s(JavabinLoader.java:108)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(
>> JavabinLoa
>>>>>> der.java:55)
>>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(
>> UpdateRe
>>>>>> questHandler.java:97)
>>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.
>> handleReque
>>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>>           at org.apache.solr.handler.RequestHandlerBase.
>> handleRequest(Req
>>>>>> uestHandlerBase.java:173)
>>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(
>> HttpSolrCall.
>>>>>> java:723)
>>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(
>> HttpSolrCall.java:
>>>>>> 529)
>>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDisp
>>>>>> atchFilter.java:361)
>>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDisp
>>>>>> atchFilter.java:305)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
>> doFilte
>>>>>> r(ServletHandler.java:1691)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(
>> ServletHan
>>>>>> dler.java:582)
>>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> Scoped
>>>>>> Handler.java:143)
>>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(
>> SecurityHa
>>>>>> ndler.java:548)
>>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>>> SessionHandler.java:226)
>>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>>> ContextHandler.java:1180)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(
>> ServletHand
>>>>>> ler.java:512)
>>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>>> SessionHandler.java:185)
>>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>>> ContextHandler.java:1112)
>>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> Scoped
>>>>>> Handler.java:141)
>>>>>>           at org.eclipse.jetty.server.handler.
>> ContextHandlerCollection.ha
>>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.
>> handle(
>>>>>> HandlerCollection.java:119)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> Handl
>>>>>> erWrapper.java:134)
>>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
>> Rewr
>>>>>> iteHandler.java:335)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> Handl
>>>>>> erWrapper.java:134)
>>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>>> java:320)
>>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(
>> HttpConne
>>>>>> ction.java:251)
>>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>>> succeeded(AbstractConnection.java:273)
>>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>>> java:95)
>>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
>> SelectChann
>>>>>> elEndPoint.java:93)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>>           at org.eclipse.jetty.util.thread.
>> QueuedThreadPool.runJob(Queued
>>>>>> ThreadPool.java:671)
>>>>>>           at org.eclipse.jetty.util.thread.
>> QueuedThreadPool$2.run(QueuedT
>>>>>> hreadPool.java:589)
>>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
>> IOException):
>>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
>> 0000000000000006211
>>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>>> There
>>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>>> operation.
>>>>>>           at org.apache.hadoop.hdfs.server.
>> blockmanagement.BlockManager.c
>>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.FSNamesystem.getAddit
>>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.NameNodeRpcServer.add
>>>>>> Block(NameNodeRpcServer.java:683)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.AuthorizationProvider
>>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>>> tProtocol.java:214)
>>>>>>           at org.apache.hadoop.hdfs.protocolPB.
>> ClientNamenodeProtocolServ
>>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>>> TranslatorPB.java:495)
>>>>>>           at org.apache.hadoop.hdfs.protocol.proto.
>> ClientNamenodeProtocol
>>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>>> enodeProtocolProtos.java)
>>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> ProtoBufRpcIn
>>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
>> 2216)
>>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
>> 2212)
>>>>>>           at java.security.AccessController.doPrivileged(Native
>> Method)
>>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGro
>>>>>> upInformation.java:1920)
>>>>>>           at org.apache.hadoop.ipc.Server$
>> Handler.run(Server.java:2210)
>>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>>> ProtobufRpcEngine.java:229)
>>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>>           at org.apache.hadoop.hdfs.protocolPB.
>> ClientNamenodeProtocolTran
>>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
>> Source)
>>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.
>> invoke(DelegatingMe
>>>>>> thodAccessorImpl.java:43)
>>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.
>> invokeMeth
>>>>>> od(RetryInvocationHandler.java:191)
>>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
>> Ret
>>>>>> ryInvocationHandler.java:102)
>>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> locateFo
>>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> nextBloc
>>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>>> DFSOutputStream.java:449)
>>>>>>
>>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>>> s:shard43
>>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>>           at org.apache.solr.update.TransactionLog.write(
>> TransactionLog.
>>>>>> java:418)
>>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>>           at org.apache.solr.update.processor.
>> DistributedUpdateProcessor.
>>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>>           at org.apache.solr.update.processor.
>> DistributedUpdateProcessor.
>>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>>           at org.apache.solr.update.processor.
>> LogUpdateProcessorFactory$L
>>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(
>> Javabi
>>>>>> nLoader.java:98)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(
>> JavaBinC
>>>>>> odec.java:306)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(
>> JavaBinCode
>>>>>> c.java:251)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(
>> JavaBinC
>>>>>> odec.java:271)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(
>> JavaBinCode
>>>>>> c.java:251)
>>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(
>> JavaBinCo
>>>>>> dec.java:173)
>>>>>>           at org.apache.solr.client.solrj.request.
>> JavaBinUpdateRequestCod
>>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader.
>> parseAndLoadDoc
>>>>>> s(JavabinLoader.java:108)
>>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(
>> JavabinLoa
>>>>>> der.java:55)
>>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(
>> UpdateRe
>>>>>> questHandler.java:97)
>>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.
>> handleReque
>>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>>           at org.apache.solr.handler.RequestHandlerBase.
>> handleRequest(Req
>>>>>> uestHandlerBase.java:173)
>>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(
>> HttpSolrCall.
>>>>>> java:723)
>>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(
>> HttpSolrCall.java:
>>>>>> 529)
>>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDisp
>>>>>> atchFilter.java:361)
>>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDisp
>>>>>> atchFilter.java:305)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
>> doFilte
>>>>>> r(ServletHandler.java:1691)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(
>> ServletHan
>>>>>> dler.java:582)
>>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> Scoped
>>>>>> Handler.java:143)
>>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(
>> SecurityHa
>>>>>> ndler.java:548)
>>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>>> SessionHandler.java:226)
>>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>>> ContextHandler.java:1180)
>>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(
>> ServletHand
>>>>>> ler.java:512)
>>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>>> SessionHandler.java:185)
>>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>>> ContextHandler.java:1112)
>>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> Scoped
>>>>>> Handler.java:141)
>>>>>>           at org.eclipse.jetty.server.handler.
>> ContextHandlerCollection.ha
>>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.
>> handle(
>>>>>> HandlerCollection.java:119)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> Handl
>>>>>> erWrapper.java:134)
>>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
>> Rewr
>>>>>> iteHandler.java:335)
>>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> Handl
>>>>>> erWrapper.java:134)
>>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>>> java:320)
>>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(
>> HttpConne
>>>>>> ction.java:251)
>>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>>> succeeded(AbstractConnection.java:273)
>>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>>> java:95)
>>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
>> SelectChann
>>>>>> elEndPoint.java:93)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>>           at org.eclipse.jetty.util.thread.
>> strategy.ExecuteProduceConsume
>>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>>           at org.eclipse.jetty.util.thread.
>> QueuedThreadPool.runJob(Queued
>>>>>> ThreadPool.java:671)
>>>>>>           at org.eclipse.jetty.util.thread.
>> QueuedThreadPool$2.run(QueuedT
>>>>>> hreadPool.java:589)
>>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
>> IOException):
>>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
>> 0000000000000006211
>>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>>> There
>>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>>> operation.
>>>>>>           at org.apache.hadoop.hdfs.server.
>> blockmanagement.BlockManager.c
>>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.FSNamesystem.getAddit
>>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.NameNodeRpcServer.add
>>>>>> Block(NameNodeRpcServer.java:683)
>>>>>>           at org.apache.hadoop.hdfs.server.
>> namenode.AuthorizationProvider
>>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>>> tProtocol.java:214)
>>>>>>           at org.apache.hadoop.hdfs.protocolPB.
>> ClientNamenodeProtocolServ
>>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>>> TranslatorPB.java:495)
>>>>>>           at org.apache.hadoop.hdfs.protocol.proto.
>> ClientNamenodeProtocol
>>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>>> enodeProtocolProtos.java)
>>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> ProtoBufRpcIn
>>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
>> 2216)
>>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
>> 2212)
>>>>>>           at java.security.AccessController.doPrivileged(Native
>> Method)
>>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGro
>>>>>> upInformation.java:1920)
>>>>>>           at org.apache.hadoop.ipc.Server$
>> Handler.run(Server.java:2210)
>>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>>> ProtobufRpcEngine.java:229)
>>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>>           at org.apache.hadoop.hdfs.protocolPB.
>> ClientNamenodeProtocolTran
>>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
>> Source)
>>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.
>> invoke(DelegatingMe
>>>>>> thodAccessorImpl.java:43)
>>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.
>> invokeMeth
>>>>>> od(RetryInvocationHandler.java:191)
>>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
>> Ret
>>>>>> ryInvocationHandler.java:102)
>>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> locateFo
>>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> nextBloc
>>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>>> DFSOutputStream.java:449)
>>>>>>
>>>>>> 2017-07-17 12:29:24.187 INFO
>>>>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>>>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>>>>> state:SyncConnected type:NodeDataChanged
>>>>>> path:/collections/UNCLASS/state.json]
>>>>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
>>>>>> [45])
>>>>>>
>>>>>> On the client side, the error looks like:
>>>>>> 2017-07-16 19:03:16,118 WARN
>>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>>> Indexing error: org.apache.solr.client.solrj.i
>>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>>> for
>>>>>> collection: UNCLASS
>>>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
>> Error
>>>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>> Exception
>>>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>>>> error.
>>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.
>> directUpda
>>>>>> te(CloudSolrClient.java:819)
>>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.
>> sendReques
>>>>>> t(CloudSolrClient.java:1263)
>>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.
>> requestWit
>>>>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>>>>           at org.apache.solr.client.solrj.
>> impl.CloudSolrClient.request(Cl
>>>>>> oudSolrClient.java:1073)
>>>>>>           at org.apache.solr.client.solrj.SolrRequest.process(
>> SolrRequest
>>>>>> .java:160)
>>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
>> java:
>>>>>> 106)
>>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
>> java:
>>>>>> 71)
>>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
>> java:
>>>>>> 85)
>>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.
>> indexSolrDocs(
>>>>>> IndexDocument.java:959)
>>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>>>>> IndexDocument.java:236)
>>>>>>           at com.ngc.bigdata.ie_solrindexer.
>> SolrIndexerProcessor.doWork(S
>>>>>> olrIndexerProcessor.java:63)
>>>>>>           at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>>>>> Processor.java:140)
>>>>>>           at com.ngc.intelenterprise.intelentutil.jms.
>> IntelEntQueueProc.
>>>>>> process(IntelEntQueueProc.java:208)
>>>>>>           at org.apache.camel.processor.DelegateSyncProcessor.process(
>> Del
>>>>>> egateSyncProcessor.java:63)
>>>>>>           at org.apache.camel.management.InstrumentationProcessor.
>> process
>>>>>> (InstrumentationProcessor.java:77)
>>>>>>           at org.apache.camel.processor.RedeliveryErrorHandler.
>> process(Re
>>>>>> deliveryErrorHandler.java:460)
>>>>>>           at org.apache.camel.processor.CamelInternalProcessor.
>> process(Ca
>>>>>> melInternalProcessor.java:190)
>>>>>>           at org.apache.camel.processor.CamelInternalProcessor.
>> process(Ca
>>>>>> melInternalProcessor.java:190)
>>>>>>           at org.apache.camel.component.seda.SedaConsumer.
>> sendToConsumers
>>>>>> (SedaConsumer.java:298)
>>>>>>           at org.apache.camel.component.seda.SedaConsumer.doRun(
>> SedaConsu
>>>>>> mer.java:207)
>>>>>>           at org.apache.camel.component.seda.SedaConsumer.run(
>> SedaConsume
>>>>>> r.java:154)
>>>>>>           at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPool
>>>>>> Executor.java:1142)
>>>>>>           at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoo
>>>>>> lExecutor.java:617)
>>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>>> Caused by:
>>>>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>>>>> Exception writing document id COLLECT10086453202 to the index;
>> possible
>>>>>> analysis error.
>>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.
>> executeMeth
>>>>>> od(HttpSolrClient.java:610)
>>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
>> Htt
>>>>>> pSolrClient.java:279)
>>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
>> Htt
>>>>>> pSolrClient.java:268)
>>>>>>           at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
>> doRequest
>>>>>> (LBHttpSolrClient.java:447)
>>>>>>           at org.apache.solr.client.solrj.
>> impl.LBHttpSolrClient.request(L
>>>>>> BHttpSolrClient.java:388)
>>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$
>> dir
>>>>>> ectUpdate$0(CloudSolrClient.java:796)
>>>>>>           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>           at org.apache.solr.common.util.ExecutorUtil$
>> MDCAwareThreadPoolE
>>>>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>>>>           ... 3 more
>>>>>> 2017-07-16 19:03:16,134 ERROR
>>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>>> Error indexing: org.apache.solr.client.solrj.i
>>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>>> for
>>>>>> collection: UNCLASS.
>>>>>> 2017-07-16 19:03:16,135 ERROR
>>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>>> Exception during indexing: org.apache.solr.client.solrj.i
>>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>>>
>>>>>> I can fire them back up, but they only run for a short time before
>>>>>> getting more indexing errors.  Several of the nodes show as down in
>> the
>>>>>> cloud view.  Any help would be appreciated!  Thank you!
>>>>>>
>>>>>>
>>>>>> -Joe
>>>>>>
>>>>>>


Re: Solr 6.6.0 - Indexing errors

Posted by Susheel Kumar <su...@gmail.com>.
Also, the client-side error above names the document id that failed with
the analysis error.  You can look at how those documents differ, as Erick
suggested.
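
For anyone following along, here is a toy sketch of the "1" then "1.0"
case Erick describes in the quoted message. It is not Solr's code: the
class FieldGuessDemo and its methods are hypothetical, and it only
assumes that the type guessed from the first value is locked in for
later documents.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldGuessDemo {
    // Hypothetical stand-in for a schema: field name -> type guessed from the first value seen.
    private final Map<String, String> schema = new HashMap<>();

    /** Guess a type from a single raw value, trying the narrowest numeric type first. */
    public static String guessType(String raw) {
        try {
            Long.parseLong(raw);
            return "long";
        } catch (NumberFormatException e) {
            // not an integer, fall through to the next guess
        }
        try {
            Double.parseDouble(raw);
            return "double";
        } catch (NumberFormatException e) {
            // not numeric at all
        }
        return "string";
    }

    /** Returns true if the value parses as the field's locked-in type, false otherwise. */
    public boolean add(String field, String raw) {
        // The first value decides the type; later values must conform to it.
        String type = schema.computeIfAbsent(field, f -> guessType(raw));
        try {
            if (type.equals("long")) {
                Long.parseLong(raw);
            } else if (type.equals("double")) {
                Double.parseDouble(raw);
            }
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        FieldGuessDemo demo = new FieldGuessDemo();
        System.out.println(demo.add("count", "1"));   // first value, field guessed as long
        System.out.println(demo.add("count", "1.0")); // rejected: "1.0" is not a long
    }
}
```

Defining the field types explicitly in the schema avoids depending on
whichever document happens to arrive first.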

On Mon, Jul 17, 2017 at 11:53 AM, Erick Erickson <er...@gmail.com>
wrote:

> Joe:
>
> I agree that 46 million docs later you'd expect things to have settled
> out. However, I do note that you have
> "add-unknown-fields-to-the-schema" in your error stack which means
> you're using "field guessing", sometimes called data_driven. I would
> recommend you do _not_ use this for production as, while it does the
> best job it can it has to make assumptions about what the data looks
> like based on the first document it sees which may later be violated.
> Getting "possible analysis error" is one of the messages that happens
> when this occurs.
>
> The simple example is that if the first time data_driven sees "1"
> it'll guess integer. If sometime later there's a doc with "1.0" it'll
> generate a parse error.
>
> I totally agree that 46 million docs later you'd expect all of this
> kind of thing to have flushed out, but the "possible analysis error"
> seems to be pointing that direction. If this is, indeed, the problem
> you'll see better evidence on the Solr instance that's actually having
> the problem. Unfortunately you'll just have to look at one Solr log from
> each shard to see whether this is an issue.
>
> Best,
> Erick
>
> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
> <jo...@gmail.com> wrote:
> > So far we've indexed about 46 million documents, but over the weekend,
> these
> > errors started coming up.  I would expect that if there was a basic
> issue,
> > it would have started right away?  We ran a test cluster with just a few
> > shards/replicas prior and didn't see any issues using the same indexing
> > code, but we're running a lot more indexers simultaneously with the
> larger
> > cluster; perhaps we're just overloading HDFS?  The same nodes that run
> Solr
> > also run HDFS datanodes, but they are pretty beefy machines; we're not
> > swapping.
> >
> > As Shawn pointed out, I will be checking the HDFS version (we're using
> > Cloudera CDH 5.10.2), and the HDFS logs.
> >
> > -Joe
> >
> >
> >
> > On 7/17/2017 10:16 AM, Susheel Kumar wrote:
> >>
> >> There is some analysis error also.  I would suggest to test the indexer
> on
> >> just one shard setup first, then test for a replica (1 shard and 1
> >> replica)
> >> and then test for 2 shards and 2 replica.  This would confirm if there
> is
> >> basic issue with indexing / cluster setup.
> >>
> >> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
> >> joseph.obernberger@gmail.com> wrote:
> >>
> >>> Some more info:
> >>>
> >>> When I stop all the indexers, in about 5-10 minutes the cluster goes
> all
> >>> green.  When I start just one indexer, several nodes immediately go
> down
> >>> with the 'Error logging add' message.
> >>>
> >>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
> >>> indexing.  Is this correct for SolrCloud?
> >>>
> >>> Thank you!
> >>>
> >>> -Joe
> >>>
> >>>
> >>>
> >>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
> >>>
> >>>> We've been indexing data on a 45 node cluster with 100 shards and 3
> >>>> replicas, but our indexing processes have been stopping due to errors.
> >>>> On
> >>>> the server side the error is "Error logging add". Stack trace:
> >>>>
> >>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
> >>>> s:shard58
> >>>> r:core_node290 x:UNCLASS_shard58_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
> >>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
> >>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
> >>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
> >>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
> >>>> 171
> >>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
> >>>> s:shard13
> >>>> r:core_node81 x:UNCLASS_shard13_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
> >>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
> >>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
> >>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
> >>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
> >>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
> >>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
> >>>> 274
> >>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
> >>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
> >>>> org.apache.solr.common.SolrException: Error logging add
> >>>>          at org.apache.solr.update.TransactionLog.write(
> TransactionLog.
> >>>> java:418)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> versionAdd(DistributedUpdateProcessor.java:1113)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> processAdd(DistributedUpdateProcessor.java:748)
> >>>>          at org.apache.solr.update.processor.
> LogUpdateProcessorFactory$L
> >>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(
> Javabi
> >>>> nLoader.java:98)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:306)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:271)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(
> JavaBinCo
> >>>> dec.java:173)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.
> parseAndLoadDoc
> >>>> s(JavabinLoader.java:108)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.load(
> JavabinLoa
> >>>> der.java:55)
> >>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(
> UpdateRe
> >>>> questHandler.java:97)
> >>>>          at org.apache.solr.handler.ContentStreamHandlerBase.
> handleReque
> >>>> stBody(ContentStreamHandlerBase.java:68)
> >>>>          at org.apache.solr.handler.RequestHandlerBase.
> handleRequest(Req
> >>>> uestHandlerBase.java:173)
> >>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.
> >>>> java:723)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:
> >>>> 529)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:361)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:305)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilte
> >>>> r(ServletHandler.java:1691)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHan
> >>>> dler.java:582)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:143)
> >>>>          at org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHa
> >>>> ndler.java:548)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
> >>>> SessionHandler.java:226)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
> >>>> ContextHandler.java:1180)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHand
> >>>> ler.java:512)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
> >>>> SessionHandler.java:185)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
> >>>> ContextHandler.java:1112)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:141)
> >>>>          at org.eclipse.jetty.server.handler.
> ContextHandlerCollection.ha
> >>>> ndle(ContextHandlerCollection.java:213)
> >>>>          at org.eclipse.jetty.server.handler.HandlerCollection.
> handle(
> >>>> HandlerCollection.java:119)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> Rewr
> >>>> iteHandler.java:335)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
> >>>> java:320)
> >>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConne
> >>>> ction.java:251)
> >>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
> >>>> succeeded(AbstractConnection.java:273)
> >>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
> >>>> java:95)
> >>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChann
> >>>> elEndPoint.java:93)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .produceConsume(ExecuteProduceConsume.java:148)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .run(ExecuteProduceConsume.java:136)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool.runJob(Queued
> >>>> ThreadPool.java:671)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool$2.run(QueuedT
> >>>> hreadPool.java:589)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
> IOException):
> >>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
> 0000000000000006211
> >>>> could only be replicated to 0 nodes instead of minReplication (=1).
> >>>> There
> >>>> are 40 datanode(s) running and no node(s) are excluded in this
> >>>> operation.
> >>>>          at org.apache.hadoop.hdfs.server.
> blockmanagement.BlockManager.c
> >>>> hooseTarget4NewBlock(BlockManager.java:1622)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.FSNamesystem.getAddit
> >>>> ionalBlock(FSNamesystem.java:3351)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.NameNodeRpcServer.add
> >>>> Block(NameNodeRpcServer.java:683)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.AuthorizationProvider
> >>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
> >>>> tProtocol.java:214)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolServ
> >>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
> >>>> TranslatorPB.java:495)
> >>>>          at org.apache.hadoop.hdfs.protocol.proto.
> ClientNamenodeProtocol
> >>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
> >>>> enodeProtocolProtos.java)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
> ProtoBufRpcIn
> >>>> voker.call(ProtobufRpcEngine.java:617)
> >>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2216)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2212)
> >>>>          at java.security.AccessController.doPrivileged(Native
> Method)
> >>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
> >>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGro
> >>>> upInformation.java:1920)
> >>>>          at org.apache.hadoop.ipc.Server$
> Handler.run(Server.java:2210)
> >>>>
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
> >>>> ProtobufRpcEngine.java:229)
> >>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolTran
> >>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
> >>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
> Source)
> >>>>          at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>>> thodAccessorImpl.java:43)
> >>>>          at java.lang.reflect.Method.invoke(Method.java:498)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.
> invokeMeth
> >>>> od(RetryInvocationHandler.java:191)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
> Ret
> >>>> ryInvocationHandler.java:102)
> >>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> locateFo
> >>>> llowingBlock(DFSOutputStream.java:1459)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> nextBloc
> >>>> kOutputStream(DFSOutputStream.java:1255)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
> >>>> DFSOutputStream.java:449)
> >>>>
> >>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
> >>>> org.apache.solr.common.SolrException: Error logging add
> >>>>          at org.apache.solr.update.TransactionLog.write(
> TransactionLog.
> >>>> java:418)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> versionAdd(DistributedUpdateProcessor.java:1113)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> processAdd(DistributedUpdateProcessor.java:748)
> >>>>          at org.apache.solr.update.processor.
> LogUpdateProcessorFactory$L
> >>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(
> Javabi
> >>>> nLoader.java:98)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:306)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:271)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(
> JavaBinCo
> >>>> dec.java:173)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.
> parseAndLoadDoc
> >>>> s(JavabinLoader.java:108)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.load(
> JavabinLoa
> >>>> der.java:55)
> >>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(
> UpdateRe
> >>>> questHandler.java:97)
> >>>>          at org.apache.solr.handler.ContentStreamHandlerBase.
> handleReque
> >>>> stBody(ContentStreamHandlerBase.java:68)
> >>>>          at org.apache.solr.handler.RequestHandlerBase.
> handleRequest(Req
> >>>> uestHandlerBase.java:173)
> >>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.
> >>>> java:723)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:
> >>>> 529)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:361)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:305)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilte
> >>>> r(ServletHandler.java:1691)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHan
> >>>> dler.java:582)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:143)
> >>>>          at org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHa
> >>>> ndler.java:548)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
> >>>> SessionHandler.java:226)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
> >>>> ContextHandler.java:1180)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHand
> >>>> ler.java:512)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
> >>>> SessionHandler.java:185)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
> >>>> ContextHandler.java:1112)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:141)
> >>>>          at org.eclipse.jetty.server.handler.
> ContextHandlerCollection.ha
> >>>> ndle(ContextHandlerCollection.java:213)
> >>>>          at org.eclipse.jetty.server.handler.HandlerCollection.
> handle(
> >>>> HandlerCollection.java:119)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> Rewr
> >>>> iteHandler.java:335)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
> >>>> java:320)
> >>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConne
> >>>> ction.java:251)
> >>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
> >>>> succeeded(AbstractConnection.java:273)
> >>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
> >>>> java:95)
> >>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChann
> >>>> elEndPoint.java:93)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .produceConsume(ExecuteProduceConsume.java:148)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .run(ExecuteProduceConsume.java:136)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool.runJob(Queued
> >>>> ThreadPool.java:671)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool$2.run(QueuedT
> >>>> hreadPool.java:589)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
> IOException):
> >>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
> 0000000000000006211
> >>>> could only be replicated to 0 nodes instead of minReplication (=1).
> >>>> There
> >>>> are 40 datanode(s) running and no node(s) are excluded in this
> >>>> operation.
> >>>>          at org.apache.hadoop.hdfs.server.
> blockmanagement.BlockManager.c
> >>>> hooseTarget4NewBlock(BlockManager.java:1622)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.FSNamesystem.getAddit
> >>>> ionalBlock(FSNamesystem.java:3351)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.NameNodeRpcServer.add
> >>>> Block(NameNodeRpcServer.java:683)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.AuthorizationProvider
> >>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
> >>>> tProtocol.java:214)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolServ
> >>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
> >>>> TranslatorPB.java:495)
> >>>>          at org.apache.hadoop.hdfs.protocol.proto.
> ClientNamenodeProtocol
> >>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
> >>>> enodeProtocolProtos.java)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
> ProtoBufRpcIn
> >>>> voker.call(ProtobufRpcEngine.java:617)
> >>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2216)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2212)
> >>>>          at java.security.AccessController.doPrivileged(Native
> Method)
> >>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
> >>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGro
> >>>> upInformation.java:1920)
> >>>>          at org.apache.hadoop.ipc.Server$
> Handler.run(Server.java:2210)
> >>>>
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
> >>>> ProtobufRpcEngine.java:229)
> >>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolTran
> >>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
> >>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
> Source)
> >>>>          at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>>> thodAccessorImpl.java:43)
> >>>>          at java.lang.reflect.Method.invoke(Method.java:498)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.
> invokeMeth
> >>>> od(RetryInvocationHandler.java:191)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
> Ret
> >>>> ryInvocationHandler.java:102)
> >>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> locateFo
> >>>> llowingBlock(DFSOutputStream.java:1459)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> nextBloc
> >>>> kOutputStream(DFSOutputStream.java:1255)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
> >>>> DFSOutputStream.java:449)
> >>>>
> >>>> 2017-07-17 12:29:24.187 INFO
> >>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
> >>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
> >>>> state:SyncConnected type:NodeDataChanged
> >>>> path:/collections/UNCLASS/state.json]
> >>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
> >>>> [45])
> >>>>
> >>>> On the client side, the error looks like:
> >>>> 2017-07-16 19:03:16,118 WARN
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Indexing error: org.apache.solr.client.solrj.i
> >>>> mpl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>> for
> >>>> collection: UNCLASS
> >>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error
> >>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
> Exception
> >>>> writing document id COLLECT10086453202 to the index; possible analysis
> >>>> error.
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> directUpda
> >>>> te(CloudSolrClient.java:819)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> sendReques
> >>>> t(CloudSolrClient.java:1263)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> requestWit
> >>>> hRetryOnStaleState(CloudSolrClient.java:1134)
> >>>>          at org.apache.solr.client.solrj.
> impl.CloudSolrClient.request(Cl
> >>>> oudSolrClient.java:1073)
> >>>>          at org.apache.solr.client.solrj.SolrRequest.process(
> SolrRequest
> >>>> .java:160)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 106)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 71)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 85)
> >>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.
> indexSolrDocs(
> >>>> IndexDocument.java:959)
> >>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
> >>>> IndexDocument.java:236)
> >>>>          at com.ngc.bigdata.ie_solrindexer.
> SolrIndexerProcessor.doWork(S
> >>>> olrIndexerProcessor.java:63)
> >>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
> >>>> Processor.java:140)
> >>>>          at com.ngc.intelenterprise.intelentutil.jms.
> IntelEntQueueProc.
> >>>> process(IntelEntQueueProc.java:208)
> >>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(
> Del
> >>>> egateSyncProcessor.java:63)
> >>>>          at org.apache.camel.management.InstrumentationProcessor.
> process
> >>>> (InstrumentationProcessor.java:77)
> >>>>          at org.apache.camel.processor.RedeliveryErrorHandler.
> process(Re
> >>>> deliveryErrorHandler.java:460)
> >>>>          at org.apache.camel.processor.CamelInternalProcessor.
> process(Ca
> >>>> melInternalProcessor.java:190)
> >>>>          at org.apache.camel.processor.CamelInternalProcessor.
> process(Ca
> >>>> melInternalProcessor.java:190)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.
> sendToConsumers
> >>>> (SedaConsumer.java:298)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(
> SedaConsu
> >>>> mer.java:207)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.run(
> SedaConsume
> >>>> r.java:154)
> >>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPool
> >>>> Executor.java:1142)
> >>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoo
> >>>> lExecutor.java:617)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by:
> >>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
> >>>> Exception writing document id COLLECT10086453202 to the index;
> possible
> >>>> analysis error.
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.
> executeMeth
> >>>> od(HttpSolrClient.java:610)
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> Htt
> >>>> pSolrClient.java:279)
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> Htt
> >>>> pSolrClient.java:268)
> >>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> doRequest
> >>>> (LBHttpSolrClient.java:447)
> >>>>          at org.apache.solr.client.solrj.
> impl.LBHttpSolrClient.request(L
> >>>> BHttpSolrClient.java:388)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$
> dir
> >>>> ectUpdate$0(CloudSolrClient.java:796)
> >>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>          at org.apache.solr.common.util.ExecutorUtil$
> MDCAwareThreadPoolE
> >>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
> >>>>          ... 3 more
> >>>> 2017-07-16 19:03:16,134 ERROR
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Error indexing: org.apache.solr.client.solrj.i
> >>>> mpl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>> for
> >>>> collection: UNCLASS.
> >>>> 2017-07-16 19:03:16,135 ERROR
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Exception during indexing: org.apache.solr.client.solrj.i
> >>>> mpl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>>
> >>>> I can fire them back up, but they only run for a short time before
> >>>> getting more indexing errors.  Several of the nodes show as down in
> the
> >>>> cloud view.  Any help would be appreciated!  Thank you!
> >>>>
> >>>>
> >>>> -Joe
> >>>>
> >>>>
> >>
> >>
> >
>

Re: Solr 6.6.0 - Indexing errors

Posted by Erick Erickson <er...@gmail.com>.
Joe:

I agree that 46 million docs later you'd expect things to have settled
out. However, I do note that you have
"add-unknown-fields-to-the-schema" as the update.chain in your logs,
which means you're using "field guessing", sometimes called
data_driven. I would recommend you do _not_ use this in production:
while it does the best job it can, it has to make assumptions about
what the data looks like based on the first document it sees, and
later documents may violate those assumptions. "Possible analysis
error" is one of the messages you get when this occurs.

The simple example is that if the first time data_driven sees "1"
it'll guess integer. If sometime later there's a doc with "1.0" it'll
generate a parse error.
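
To make the "1" then "1.0" case concrete, here is a minimal sketch in
plain Java. It is only an illustration of the idea, not Solr's actual
guessing code: the class FieldGuessSketch, its methods, and the type
names "integer", "double" and "string" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for field guessing; NOT Solr's implementation.
class FieldGuessSketch {
    // Field name -> type locked in from the first value seen for that field.
    private final Map<String, String> guessedTypes = new HashMap<>();

    // Guess a type from one raw value: integer, then double, else string.
    static String guessType(String raw) {
        try {
            Long.parseLong(raw);
            return "integer";
        } catch (NumberFormatException ignored) {
            // not an integer, try the next candidate
        }
        try {
            Double.parseDouble(raw);
            return "double";
        } catch (NumberFormatException ignored) {
            return "string";
        }
    }

    // Returns null if the value fits the locked-in type, else an error message.
    String add(String field, String raw) {
        String type = guessedTypes.computeIfAbsent(field, f -> guessType(raw));
        try {
            if (type.equals("integer")) {
                Long.parseLong(raw);
            } else if (type.equals("double")) {
                Double.parseDouble(raw);
            }
            return null;
        } catch (NumberFormatException e) {
            return "field '" + field + "' was guessed as " + type
                    + " but received \"" + raw + "\"";
        }
    }
}
```

With this sketch, add("count", "1") locks the field in as integer, and
a later add("count", "1.0") returns an error message instead of null.
Defining the field explicitly up front avoids relying on whichever
document happens to arrive first.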

I totally agree that 46 million docs later you'd expect all of this
kind of thing to have flushed out, but the "possible analysis error"
seems to point in that direction. If this is indeed the problem,
you'll see better evidence in the log of the Solr instance that's
actually having the problem. Unfortunately you'll just have to look at
one Solr log from each shard to see whether this is an issue.
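
If going through the logs by hand gets tedious, something like the
following could tally ERROR lines per shard. It is a rough sketch that
assumes the log line layout shown earlier in this thread (timestamp,
level, thread, then "[c:... s:shardN r:... x:...]"); the class name
ShardErrorCount and its method are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the log layout quoted in this thread.
class ShardErrorCount {
    // Matches "<date> <time> ERROR ... [ ... s:<shard> ... ]" and
    // captures the shard name from the bracketed prefix.
    private static final Pattern ERROR_LINE = Pattern.compile(
            "^\\S+ \\S+ ERROR .*?\\[[^\\]]*\\bs:([^\\s\\]]+)");

    // Counts ERROR-level lines per shard; other levels are ignored.
    static Map<String, Integer> countErrorsByShard(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = ERROR_LINE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the lines of one solr.log (for example via
java.nio.file.Files.readAllLines) gives a shard-to-count map that
shows which shards to look at first. Continuation lines of a stack
trace do not start with a timestamp and level, so they are skipped.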

Best,
Erick

On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
<jo...@gmail.com> wrote:
> So far we've indexed about 46 million documents, but over the weekend, these
> errors started coming up.  I would expect that if there was a basic issue,
> it would have started right away?  We ran a test cluster with just a few
> shards/replicas prior and didn't see any issues using the same indexing
> code, but we're running a lot more indexers simultaneously with the larger
> cluster; perhaps we're just overloading HDFS?  The same nodes that run Solr
> also run HDFS datanodes, but they are pretty beefy machines; we're not
> swapping.
>
> As Shawn pointed out, I will be checking the HDFS version (we're using
> Cloudera CDH 5.10.2), and the HDFS logs.
>
> -Joe
>
>
>
> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>
>> There is some analysis error also.  I would suggest to test the indexer on
>> just one shard setup first, then test for a replica (1 shard and 1
>> replica)
>> and then test for 2 shards and 2 replica.  This would confirm if there is
>> basic issue with indexing / cluster setup.
>>
>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>> joseph.obernberger@gmail.com> wrote:
>>
>>> Some more info:
>>>
>>> When I stop all the indexers, in about 5-10 minutes the cluster goes all
>>> green.  When I start just one indexer, several nodes immediately go down
>>> with the 'Error adding log' message.
>>>
>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>> indexing.  Is this correct for SolrCloud?
>>>
>>> Thank you!
>>>
>>> -Joe
>>>
>>>
>>>
>>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>>
>>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>>> replicas, but our indexing processes have been stopping due to errors.
>>>> On
>>>> the server side the error is "Error logging add". Stack trace:
>>>>
>>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
>>>> s:shard58
>>>> r:core_node290 x:UNCLASS_shard58_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
>>>> 171
>>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
>>>> s:shard13
>>>> r:core_node81 x:UNCLASS_shard13_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
>>>> 274
>>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>>> org.apache.solr.common.SolrException: Error logging add
>>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>> java:418)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>> nLoader.java:98)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:306)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:271)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>> dec.java:173)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>> s(JavabinLoader.java:108)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>> der.java:55)
>>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>> questHandler.java:97)
>>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>> uestHandlerBase.java:173)
>>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>> java:723)
>>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>> 529)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:361)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:305)
>>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>> r(ServletHandler.java:1691)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>> dler.java:582)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:143)
>>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>> ndler.java:548)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>> SessionHandler.java:226)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>> ContextHandler.java:1180)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>> ler.java:512)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>> SessionHandler.java:185)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>> ContextHandler.java:1112)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:141)
>>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>> ndle(ContextHandlerCollection.java:213)
>>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>> HandlerCollection.java:119)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>> iteHandler.java:335)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>> java:320)
>>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>> ction.java:251)
>>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>> succeeded(AbstractConnection.java:273)
>>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>> java:95)
>>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>> elEndPoint.java:93)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .run(ExecuteProduceConsume.java:136)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>> ThreadPool.java:671)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>> hreadPool.java:589)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>> There are 40 datanode(s) running and no node(s) are excluded in this
>>>> operation.
>>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>> ionalBlock(FSNamesystem.java:3351)
>>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>> Block(NameNodeRpcServer.java:683)
>>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>> tProtocol.java:214)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>> TranslatorPB.java:495)
>>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>> enodeProtocolProtos.java)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>> upInformation.java:1920)
>>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>> ProtobufRpcEngine.java:229)
>>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>> thodAccessorImpl.java:43)
>>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>> od(RetryInvocationHandler.java:191)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>> ryInvocationHandler.java:102)
>>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>> DFSOutputStream.java:449)
>>>>
>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>>> org.apache.solr.common.SolrException: Error logging add
>>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>> java:418)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>> nLoader.java:98)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:306)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:271)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>> dec.java:173)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>> s(JavabinLoader.java:108)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>> der.java:55)
>>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>> questHandler.java:97)
>>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>> uestHandlerBase.java:173)
>>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>> java:723)
>>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>> 529)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:361)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:305)
>>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>> r(ServletHandler.java:1691)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>> dler.java:582)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:143)
>>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>> ndler.java:548)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>> SessionHandler.java:226)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>> ContextHandler.java:1180)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>> ler.java:512)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>> SessionHandler.java:185)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>> ContextHandler.java:1112)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:141)
>>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>> ndle(ContextHandlerCollection.java:213)
>>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>> HandlerCollection.java:119)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>> iteHandler.java:335)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>> java:320)
>>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>> ction.java:251)
>>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>> succeeded(AbstractConnection.java:273)
>>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>> java:95)
>>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>> elEndPoint.java:93)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .run(ExecuteProduceConsume.java:136)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>> ThreadPool.java:671)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>> hreadPool.java:589)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>> There are 40 datanode(s) running and no node(s) are excluded in this
>>>> operation.
>>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>> ionalBlock(FSNamesystem.java:3351)
>>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>> Block(NameNodeRpcServer.java:683)
>>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>> tProtocol.java:214)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>> TranslatorPB.java:495)
>>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>> enodeProtocolProtos.java)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>> upInformation.java:1920)
>>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>> ProtobufRpcEngine.java:229)
>>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>> thodAccessorImpl.java:43)
>>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>> od(RetryInvocationHandler.java:191)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>> ryInvocationHandler.java:102)
>>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>> DFSOutputStream.java:449)
>>>>
>>>> 2017-07-17 12:29:24.187 INFO
>>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>>> state:SyncConnected type:NodeDataChanged
>>>> path:/collections/UNCLASS/state.json]
>>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
>>>> [45])
>>>>
>>>> On the client side, the error looks like:
>>>> 2017-07-16 19:03:16,118 WARN [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Indexing error:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error. for collection: UNCLASS
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error.
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>>>> te(CloudSolrClient.java:819)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>>>> t(CloudSolrClient.java:1263)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>>>> oudSolrClient.java:1073)
>>>>          at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>>>> .java:160)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>> 106)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>> 71)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>> 85)
>>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>>>> IndexDocument.java:959)
>>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>>> IndexDocument.java:236)
>>>>          at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>>>> olrIndexerProcessor.java:63)
>>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>>> Processor.java:140)
>>>>          at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>>>> process(IntelEntQueueProc.java:208)
>>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>>>> egateSyncProcessor.java:63)
>>>>          at org.apache.camel.management.InstrumentationProcessor.process
>>>> (InstrumentationProcessor.java:77)
>>>>          at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>>>> deliveryErrorHandler.java:460)
>>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>> melInternalProcessor.java:190)
>>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>> melInternalProcessor.java:190)
>>>>          at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>>>> (SedaConsumer.java:298)
>>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>>>> mer.java:207)
>>>>          at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>>>> r.java:154)
>>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>>> Executor.java:1142)
>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>>> lExecutor.java:617)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by:
>>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>>> Exception writing document id COLLECT10086453202 to the index; possible
>>>> analysis error.
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>>>> od(HttpSolrClient.java:610)
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>> pSolrClient.java:279)
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>> pSolrClient.java:268)
>>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>>>> (LBHttpSolrClient.java:447)
>>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>>>> BHttpSolrClient.java:388)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>>> ectUpdate$0(CloudSolrClient.java:796)
>>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>>          ... 3 more
>>>> 2017-07-16 19:03:16,134 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Error indexing:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error. for collection: UNCLASS.
>>>> 2017-07-16 19:03:16,135 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Exception during indexing:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error.
>>>>
>>>> I can fire them back up, but they only run for a short time before
>>>> getting more indexing errors.  Several of the nodes show as down in the
>>>> cloud view.  Any help would be appreciated!  Thank you!
>>>>
>>>>
>>>> -Joe
>>>>
>>>>
>>
>

Re: Solr 6.6.0 - Indexing errors

Posted by Joe Obernberger <jo...@gmail.com>.
So far we've indexed about 46 million documents, but over the weekend
these errors started coming up.  I would have expected a basic issue to
show up right away.  We previously ran a test cluster with just a few
shards/replicas and didn't see any issues using the same indexing code,
but we're running a lot more indexers simultaneously against the larger
cluster; perhaps we're just overloading HDFS?  The same nodes that run
Solr also run HDFS datanodes, but they are pretty beefy machines and
we're not swapping.
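
If the indexers really are just outrunning HDFS, one client-side
mitigation would be to retry a failed batch with a growing delay instead
of stopping the indexer.  This is only a minimal sketch under that
assumption, not something taken from our indexing code: BackoffRetry and
withBackoff are hypothetical names (not part of SolrJ), and the Callable
stands in for the existing CloudSolrClient.add(List<SolrInputDocument>)
call.  It would not address the underlying HDFS problem.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of SolrJ): retries an action with exponential backoff. */
public class BackoffRetry {

    /**
     * Runs the action, retrying on any Exception up to maxAttempts times in total.
     * The delay starts at initialDelayMs and doubles after each failed attempt.
     */
    public static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry if the thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs); // back off before the next attempt
                    delayMs *= 2;
                }
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for the real add(docs) call: fails twice, then succeeds.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated transient failure");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the indexer, the lambda would wrap the add call for one batch, so a
batch that fails while HDFS is saturated is retried a few times before
the error is reported.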

As Shawn suggested, I will check the HDFS version (we're using
Cloudera CDH 5.10.2) and the HDFS logs.

-Joe


On 7/17/2017 10:16 AM, Susheel Kumar wrote:
> There is also an analysis error.  I would suggest testing the indexer on
> a single-shard setup first, then with a replica (1 shard and 1 replica),
> and then with 2 shards and 2 replicas.  This would confirm whether there
> is a basic issue with the indexing / cluster setup.
>
> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
> joseph.obernberger@gmail.com> wrote:
>
>> Some more info:
>>
>> When I stop all the indexers, the cluster goes all green in about 5-10
>> minutes.  When I start just one indexer, several nodes immediately go
>> down with the 'Error logging add' message.
>>
>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>> indexing.  Is this correct for SolrCloud?
>>
>> Thank you!
>>
>> -Joe
>>
>>
>>
>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>
>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>> replicas, but our indexing processes have been stopping due to errors.  On
>>> the server side the error is "Error logging add". Stack trace:
>>>
>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS s:shard58
>>> r:core_node290 x:UNCLASS_shard58_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0 171
>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS s:shard13
>>> r:core_node81 x:UNCLASS_shard13_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0 274
>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>> org.apache.solr.common.SolrException: Error logging add
>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>> java:418)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> processAdd(DistributedUpdateProcessor.java:748)
>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>> nLoader.java:98)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:306)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:271)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>> dec.java:173)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>> s(JavabinLoader.java:108)
>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>> der.java:55)
>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>> questHandler.java:97)
>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>> stBody(ContentStreamHandlerBase.java:68)
>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>> uestHandlerBase.java:173)
>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>> java:723)
>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>> 529)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:361)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:305)
>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>> r(ServletHandler.java:1691)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>> dler.java:582)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:143)
>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>> ndler.java:548)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>> SessionHandler.java:226)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>> ContextHandler.java:1180)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>> ler.java:512)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>> SessionHandler.java:185)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>> ContextHandler.java:1112)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:141)
>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>> ndle(ContextHandlerCollection.java:213)
>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>> HandlerCollection.java:119)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>> iteHandler.java:335)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>> java:320)
>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>> ction.java:251)
>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>> succeeded(AbstractConnection.java:273)
>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>> java:95)
>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>> elEndPoint.java:93)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .run(ExecuteProduceConsume.java:136)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>> ThreadPool.java:671)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>> hreadPool.java:589)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>> ionalBlock(FSNamesystem.java:3351)
>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>> Block(NameNodeRpcServer.java:683)
>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>> tProtocol.java:214)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>> TranslatorPB.java:495)
>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>> enodeProtocolProtos.java)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>> voker.call(ProtobufRpcEngine.java:617)
>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1920)
>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>> ProtobufRpcEngine.java:229)
>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>> od(RetryInvocationHandler.java:191)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>> ryInvocationHandler.java:102)
>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>> llowingBlock(DFSOutputStream.java:1459)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>> kOutputStream(DFSOutputStream.java:1255)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>> DFSOutputStream.java:449)
>>>
>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>> org.apache.solr.common.SolrException: Error logging add
>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>> java:418)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> processAdd(DistributedUpdateProcessor.java:748)
>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>> nLoader.java:98)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:306)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:271)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>> dec.java:173)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>> s(JavabinLoader.java:108)
>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>> der.java:55)
>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>> questHandler.java:97)
>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>> stBody(ContentStreamHandlerBase.java:68)
>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>> uestHandlerBase.java:173)
>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>> java:723)
>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>> 529)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:361)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:305)
>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>> r(ServletHandler.java:1691)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>> dler.java:582)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:143)
>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>> ndler.java:548)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>> SessionHandler.java:226)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>> ContextHandler.java:1180)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>> ler.java:512)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>> SessionHandler.java:185)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>> ContextHandler.java:1112)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:141)
>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>> ndle(ContextHandlerCollection.java:213)
>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>> HandlerCollection.java:119)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>> iteHandler.java:335)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>> java:320)
>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>> ction.java:251)
>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>> succeeded(AbstractConnection.java:273)
>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>> java:95)
>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>> elEndPoint.java:93)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .run(ExecuteProduceConsume.java:136)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>> ThreadPool.java:671)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>> hreadPool.java:589)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>> ionalBlock(FSNamesystem.java:3351)
>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>> Block(NameNodeRpcServer.java:683)
>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>> tProtocol.java:214)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>> TranslatorPB.java:495)
>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>> enodeProtocolProtos.java)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>> voker.call(ProtobufRpcEngine.java:617)
>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1920)
>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>> ProtobufRpcEngine.java:229)
>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>> od(RetryInvocationHandler.java:191)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>> ryInvocationHandler.java:102)
>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>> llowingBlock(DFSOutputStream.java:1459)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>> kOutputStream(DFSOutputStream.java:1255)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>> DFSOutputStream.java:449)
>>>
>>> 2017-07-17 12:29:24.187 INFO (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>> state:SyncConnected type:NodeDataChanged path:/collections/UNCLASS/state.json]
>>> for collection [UNCLASS] has occurred - updating... (live nodes size: [45])
>>>
>>> On the client side, the error looks like:
>>> 2017-07-16 19:03:16,118 WARN [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Indexing error:
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>> writing document id COLLECT10086453202 to the index; possible analysis
>>> error. for collection: UNCLASS
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>> writing document id COLLECT10086453202 to the index; possible analysis
>>> error.
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>>> te(CloudSolrClient.java:819)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>>> t(CloudSolrClient.java:1263)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>>> oudSolrClient.java:1073)
>>>          at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>>> .java:160)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 106)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 71)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 85)
>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>>> IndexDocument.java:959)
>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>> IndexDocument.java:236)
>>>          at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>>> olrIndexerProcessor.java:63)
>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>> Processor.java:140)
>>>          at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>>> process(IntelEntQueueProc.java:208)
>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>>> egateSyncProcessor.java:63)
>>>          at org.apache.camel.management.InstrumentationProcessor.process
>>> (InstrumentationProcessor.java:77)
>>>          at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>>> deliveryErrorHandler.java:460)
>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>> melInternalProcessor.java:190)
>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>> melInternalProcessor.java:190)
>>>          at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>>> (SedaConsumer.java:298)
>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>>> mer.java:207)
>>>          at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>>> r.java:154)
>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>> Executor.java:1142)
>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>> lExecutor.java:617)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>> Exception writing document id COLLECT10086453202 to the index; possible
>>> analysis error.
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>>> od(HttpSolrClient.java:610)
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>> pSolrClient.java:279)
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>> pSolrClient.java:268)
>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>>> (LBHttpSolrClient.java:447)
>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>>> BHttpSolrClient.java:388)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>> ectUpdate$0(CloudSolrClient.java:796)
>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>          ... 3 more
>>> 2017-07-16 19:03:16,134 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Error indexing:
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>> writing document id COLLECT10086453202 to the index; possible analysis
>>> error. for collection: UNCLASS.
>>> 2017-07-16 19:03:16,135 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Exception during indexing:
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>> writing document id COLLECT10086453202 to the index; possible analysis
>>> error.
>>>
>>> I can fire them back up, but they only run for a short time before
>>> getting more indexing errors.  Several of the nodes show as down in the
>>> cloud view.  Any help would be appreciated!  Thank you!
>>>
>>>
>>> -Joe
>>>
>>>
>


Re: Solr 6.6.0 - Indexing errors

Posted by Susheel Kumar <su...@gmail.com>.
There also appears to be an analysis error.  I would suggest testing the
indexer on a single-shard setup first, then with a replica (1 shard and
1 replica), and then with 2 shards and 2 replicas.  This would confirm
whether there is a basic issue with the indexing or cluster setup.
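
While running those tests, it may help to check whether a failed add carries
the HDFS "could only be replicated to 0 nodes" cause seen in the server trace
quoted below, since that would point at the transaction log write rather than
at field analysis. The following is only a rough sketch: TlogWriteFailure is a
hypothetical helper, not part of Solr or Hadoop, and it assumes it is given the
raw log text (no ">" quote markers) in the same wording as the message in this
thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not a Solr or Hadoop class): pulls the numbers out of
 * the HDFS "could only be replicated to N nodes" message found in a log.
 */
final class TlogWriteFailure {
    // Assumes the wording shown in this thread; other Hadoop versions may differ.
    private static final Pattern HDFS_MESSAGE = Pattern.compile(
        "File (\\S+) could only be replicated to (\\d+) nodes instead of "
        + "minReplication \\(=(\\d+)\\)\\. There are (\\d+) datanode\\(s\\) running");

    final String file;            // HDFS path that could not be written
    final int replicatedTo;       // datanodes that accepted the block
    final int minReplication;     // minimum the NameNode requires
    final int datanodesRunning;   // datanodes the NameNode considers alive

    private TlogWriteFailure(String file, int replicatedTo,
                             int minReplication, int datanodesRunning) {
        this.file = file;
        this.replicatedTo = replicatedTo;
        this.minReplication = minReplication;
        this.datanodesRunning = datanodesRunning;
    }

    /** Returns the parsed failure, or empty if the text has no such message. */
    static Optional<TlogWriteFailure> parse(String logText) {
        // Log lines are often wrapped, so collapse all whitespace runs first.
        String normalized = logText.replaceAll("\\s+", " ");
        Matcher m = HDFS_MESSAGE.matcher(normalized);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new TlogWriteFailure(
            m.group(1),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            Integer.parseInt(m.group(4))));
    }

    /** True when datanodes are reported alive but too few accepted the block. */
    boolean datanodesUpButBlockRejected() {
        return datanodesRunning > 0 && replicatedTo < minReplication;
    }

    /** True when the failing file sits under a tlog directory. */
    boolean isTransactionLog() {
        return file.contains("/tlog/");
    }
}
```

Calling TlogWriteFailure.parse(text) on the "Caused by" section of a server
log entry and checking datanodesUpButBlockRejected() would separate this kind
of failure from a genuine analysis problem in the documents being indexed.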

On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
joseph.obernberger@gmail.com> wrote:

> Some more info:
>
> When I stop all the indexers, in about 5-10 minutes the cluster goes all
> green.  When I start just one indexer, several nodes immediately go down
> with the 'Error logging add' message.
>
> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
> indexing.  Is this correct for SolrCloud?
>
> Thank you!
>
> -Joe
>
>
>
> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>
>> We've been indexing data on a 45 node cluster with 100 shards and 3
>> replicas, but our indexing processes have been stopping due to errors.  On
>> the server side the error is "Error logging add". Stack trace:
>>
>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS s:shard58
>> r:core_node290 x:UNCLASS_shard58_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0 171
>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS s:shard13
>> r:core_node81 x:UNCLASS_shard13_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0 274
>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>> org.apache.solr.common.SolrException: Error logging add
>>         at org.apache.solr.update.TransactionLog.write(TransactionLog.
>> java:418)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> versionAdd(DistributedUpdateProcessor.java:1113)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> processAdd(DistributedUpdateProcessor.java:748)
>>         at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>         at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>> nLoader.java:98)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:306)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:271)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>> dec.java:173)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>         at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>> s(JavabinLoader.java:108)
>>         at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>> der.java:55)
>>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>> questHandler.java:97)
>>         at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>> stBody(ContentStreamHandlerBase.java:68)
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>> uestHandlerBase.java:173)
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>> java:723)
>>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>> 529)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:361)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:305)
>>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>> r(ServletHandler.java:1691)
>>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>> dler.java:582)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:143)
>>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>> ndler.java:548)
>>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>> SessionHandler.java:226)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>> ContextHandler.java:1180)
>>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>> ler.java:512)
>>         at org.eclipse.jetty.server.session.SessionHandler.doScope(
>> SessionHandler.java:185)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>> ContextHandler.java:1112)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:141)
>>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>> ndle(ContextHandlerCollection.java:213)
>>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>> HandlerCollection.java:119)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>> iteHandler.java:335)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>> java:320)
>>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>> ction.java:251)
>>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>> succeeded(AbstractConnection.java:273)
>>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>> java:95)
>>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>> elEndPoint.java:93)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .produceConsume(ExecuteProduceConsume.java:148)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .run(ExecuteProduceConsume.java:136)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>> ThreadPool.java:671)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>> hreadPool.java:589)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>> hooseTarget4NewBlock(BlockManager.java:1622)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>> ionalBlock(FSNamesystem.java:3351)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>> Block(NameNodeRpcServer.java:683)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>> tProtocol.java:214)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>> TranslatorPB.java:495)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>> enodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>> voker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>> upInformation.java:1920)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>> ProtobufRpcEngine.java:229)
>>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:498)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>> od(RetryInvocationHandler.java:191)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>> ryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>> llowingBlock(DFSOutputStream.java:1459)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>> kOutputStream(DFSOutputStream.java:1255)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>> DFSOutputStream.java:449)
>>
>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>> org.apache.solr.common.SolrException: Error logging add
>>         at org.apache.solr.update.TransactionLog.write(TransactionLog.
>> java:418)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> versionAdd(DistributedUpdateProcessor.java:1113)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>>         at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>         at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>>         at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>         at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>>         at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 could only be replicated to 0 nodes instead of minReplication (=1).  There are 40 datanode(s) running and no node(s) are excluded in this operation.
>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:498)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>>
>> 2017-07-17 12:29:24.187 INFO (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>> state:SyncConnected type:NodeDataChanged path:/collections/UNCLASS/state.json]
>> for collection [UNCLASS] has occurred - updating... (live nodes size: [45])
>>
>> On the client side, the error looks like:
>> 2017-07-16 19:03:16,118 WARN [com.ngc.bigdata.ie_solrindexer.IndexDocument] Indexing error: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing document id COLLECT10086453202 to the index; possible analysis error. for collection: UNCLASS
>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing document id COLLECT10086453202 to the index; possible analysis error.
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
>>         at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
>>         at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
>>         at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
>>         at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
>>         at com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
>>         at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
>>         at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
>>         at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
>>         at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
>>         at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>>         at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>>         at org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
>>         at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
>>         at org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing document id COLLECT10086453202 to the index; possible analysis error.
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
>>         at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
>>         at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>>         ... 3 more
>> 2017-07-16 19:03:16,134 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument] Error indexing: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing document id COLLECT10086453202 to the index; possible analysis error. for collection: UNCLASS.
>> 2017-07-16 19:03:16,135 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument] Exception during indexing: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing document id COLLECT10086453202 to the index; possible analysis error.
>>
>> I can fire them back up, but they only run for a short time before
>> getting more indexing errors.  Several of the nodes show as down in the
>> cloud view.  Any help would be appreciated!  Thank you!
>>
>>
>> -Joe
>>
>>
>

Re: Solr 6.6.0 - Indexing errors

Posted by Joe Obernberger <jo...@gmail.com>.
Some more info:

When I stop all the indexers, the whole cluster goes green again within
about 5-10 minutes.  When I then start just one indexer, several nodes
immediately go down with the "Error logging add" message.

I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the 
indexing.  Is this correct for SolrCloud?
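
For reference, here is a minimal sketch of the calling pattern around that
add(...) call: the documents are split into fixed-size batches and a failed
batch is retried with a growing pause.  This is an illustration only, not
our actual indexer code.  BatchSender, BatchIndexer, partition and indexAll
are hypothetical names, BatchSender stands in for the SolrJ call so the
sketch has no Solr dependency, and the batch size, attempt count and backoff
are placeholder parameters, not tuned values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a call such as CloudSolrClient.add(docs). */
interface BatchSender<T> {
    void send(List<T> batch) throws Exception;
}

/** Hypothetical helper: batches documents and retries failed batches. */
final class BatchIndexer {
    private BatchIndexer() {}

    /** Splits docs into consecutive lists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(docs.subList(i, end)));
        }
        return batches;
    }

    /**
     * Sends every batch in order. A failed batch is retried up to
     * maxAttempts times, doubling the pause after each failure. Returns the
     * number of batches sent; rethrows the last failure when a batch runs
     * out of attempts.
     */
    static <T> int indexAll(List<T> docs, int batchSize, int maxAttempts,
                            long initialBackoffMillis, BatchSender<T> sender)
            throws Exception {
        if (maxAttempts <= 0) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        int sent = 0;
        for (List<T> batch : partition(docs, batchSize)) {
            long backoff = initialBackoffMillis;
            for (int attempt = 1; ; attempt++) {
                try {
                    sender.send(batch);
                    sent++;
                    break; // this batch succeeded, move on to the next one
                } catch (Exception e) {
                    if (attempt >= maxAttempts) {
                        throw e; // give up on this batch
                    }
                    Thread.sleep(backoff);
                    backoff *= 2;
                }
            }
        }
        return sent;
    }
}
```

With SolrJ on the classpath, the sender would presumably be something like
batch -> client.add(batch), where client is the CloudSolrClient.  Retrying
only spaces out the load from the indexer; it does not address the HDFS
"could only be replicated to 0 nodes" error shown in the server log.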

Thank you!

-Joe


On 7/17/2017 8:36 AM, Joe Obernberger wrote:
> We've been indexing data on a 45 node cluster with 100 shards and 3 
> replicas, but our indexing processes have been stopping due to 
> errors.  On the server side the error is "Error logging add". Stack 
> trace:
>
> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS 
> s:shard58 r:core_node290 x:UNCLASS_shard58_replica1] 
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard58_replica1] 
> webapp=/solr path=/update 
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[COLLECT20003218348784 
> (1573172872544780288), COLLECT20003218351447 (1573172872620277760), 
> COLLECT20003218353085 (1573172872625520640), COLLECT20003218357937 
> (1573172872627617792), COLLECT20003218361860 (1573172872629714944), 
> COLLECT20003218362535 (1573172872631812096)]} 0 171
> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS 
> s:shard13 r:core_node81 x:UNCLASS_shard13_replica1] 
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard13_replica1] 
> webapp=/solr path=/update 
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[COLLECT20003218344436 
> (1573172872538488832), COLLECT20003218347497 (1573172872620277760), 
> COLLECT20003218351645 (1573172872625520640), COLLECT20003218356965 
> (1573172872629714944), COLLECT20003218357775 (1573172872632860672), 
> COLLECT20003218358017 (1573172872646492160), COLLECT20003218358152 
> (1573172872650686464), COLLECT20003218359395 (1573172872651735040), 
> COLLECT20003218362571 (1573172872652783616)]} 0 274
> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS 
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard43_replica1] 
> webapp=/solr path=/update 
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 
> 0 0
> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS 
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error 
> logging add
>         at 
> org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>         at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
>         at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>         at 
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>         at 
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at 
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>         at 
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>         at 
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>         at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>         at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>         at 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>         at 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at 
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at 
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> File 
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 
> could only be replicated to 0 nodes instead of minReplication (=1).  
> There are 40 datanode(s) running and no node(s) are excluded in this 
> operation.
>         at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>         at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS 
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1] 
> o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error 
> logging add
>         at 
> org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>         at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
>         at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>         at 
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>         at 
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>         at 
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at 
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>         at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>         at 
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>         at 
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>         at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>         at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>         at 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>         at 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at 
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at 
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> File 
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 
> could only be replicated to 0 nodes instead of minReplication (=1).  
> There are 40 datanode(s) running and no node(s) are excluded in this 
> operation.
>         at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>         at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> 2017-07-17 12:29:24.187 INFO 
> (zkCallback-5-thread-144-processing-n:juliet:9100_solr) [   ] 
> o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent 
> state:SyncConnected type:NodeDataChanged 
> path:/collections/UNCLASS/state.json] for collection [UNCLASS] has 
> occurred - updating... (live nodes size: [45])
>
> On the client side, the error looks like:
> 2017-07-16 19:03:16,118 WARN 
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Indexing error: 
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index; 
> possible analysis error. for collection: UNCLASS
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index; 
> possible analysis error.
>         at 
> org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
>         at 
> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
>         at 
> org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
>         at 
> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
>         at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
>         at 
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>         at 
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
>         at 
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
>         at 
> com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
>         at 
> com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
>         at 
> com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
>         at 
> com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
>         at 
> com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
>         at 
> org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
>         at 
> org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
>         at 
> org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
>         at 
> org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>         at 
> org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>         at 
> org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
>         at 
> org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
>         at 
> org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index; 
> possible analysis error.
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
>         at 
> org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
>         at 
> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
>         at 
> org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>         ... 3 more
> 2017-07-16 19:03:16,134 ERROR 
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Error indexing: 
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index; 
> possible analysis error. for collection: UNCLASS.
> 2017-07-16 19:03:16,135 ERROR 
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Exception during 
> indexing: 
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index; 
> possible analysis error.
>
> I can fire them back up, but they only run for a short time before 
> getting more indexing errors.  Several of the nodes show as down in 
> the cloud view.  Any help would be appreciated!  Thank you!
>
>
> -Joe
>


Re: Solr 6.6.0 - Indexing errors

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/17/2017 6:36 AM, Joe Obernberger wrote:
> We've been indexing data on a 45 node cluster with 100 shards and 3
> replicas, but our indexing processes have been stopping due to
> errors.  On the server side the error is "Error logging add". Stack
> trace:
 <snip>
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
> File
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
> could only be replicated to 0 nodes instead of minReplication (=1). 
> There are 40 datanode(s) running and no node(s) are excluded in this
> operation.

The excerpt from your log that I preserved above shows that the root of
the problem is something going wrong with Solr writing to HDFS.  I can
only tell that there was a problem; I do not know what actually went
wrong.
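
As a rough illustration of pulling that HDFS message out of a Solr log,
here is a minimal Python sketch.  The function name
find_replication_failures and the regular expression are assumptions
made up for this example, not anything shipped with Solr or Hadoop; the
pattern is modeled only on the message wording quoted above and may need
adjusting for other Hadoop versions.

```python
import re

# Assumed pattern, modeled on the NameNode message quoted above.
PATTERN = re.compile(
    r"File\s+(?P<path>\S+)\s+could only be replicated to\s+(?P<got>\d+)\s+nodes"
    r"\s+instead of minReplication\s+\(=(?P<min>\d+)\)\.\s+"
    r"There are\s+(?P<running>\d+)\s+datanode\(s\) running"
)


def find_replication_failures(log_text):
    """Return one dict per 'could only be replicated' message in log_text."""
    # Mail clients quote and wrap log lines, so strip leading '>' markers
    # and collapse all whitespace before matching.
    unquoted = re.sub(r"^\s*>\s?", "", log_text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return [
        {
            "path": m["path"],
            "replicated": int(m["got"]),
            "min_replication": int(m["min"]),
            "datanodes_running": int(m["running"]),
        }
        for m in PATTERN.finditer(flat)
    ]
```

Run over a log, this lists each affected file together with the
replica count the NameNode achieved and the number of datanodes it
reported as running.  It only locates the messages; it says nothing
about why the datanodes refused the block.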

I think you'll need to take this information to the hadoop project and
ask them what could cause it and what can be done about it.

Solr includes Hadoop 2.7.2 jars.  This is not the latest version of
Hadoop, so it's possible there might be a known issue with this version
that is fixed in a later version.  There is a task to update Solr's
Hadoop to 3.0 when it gets released:

https://issues.apache.org/jira/browse/SOLR-9515

Thanks,
Shawn