Posted to solr-user@lucene.apache.org by Mauricio Ferreyra <ma...@gmail.com> on 2014/09/01 18:31:59 UTC

SolR replication issue

Hi folks,
I'm using Solr 4.3.1 with a master/slave configuration.

Configuration:

Master:
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>


Slave:
    <lst name="slave">
      <str name="masterUrl">http://10.xx.xx.xx:9081/solr</str>
      <str name="pollInterval">00:00:60</str>
    </lst>

The replication sometimes fails with the exception

    Error closing old IndexWriter. core=collection1:java.lang.IllegalArgumentException: Unknown directory: NRTCachingDirectory...
    ReplicationHandler SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed :

This is happening with any index size.

Any suggestions would be appreciated.

Thanks,

-- 
*Mauri Ferreyra*
Cordoba - Argentina

Re: SolR replication issue

Posted by Mauricio Ferreyra <ma...@gmail.com>.
Here is the entire stack trace:

ERROR SolrIndexWriter
Coud not unlock directory after seemingly failed IndexWriter#close()
org.apache.lucene.store.LockReleaseFailedException: Cannot forcefully unlock a NativeFSLock which is held by another indexer component: /home/miapp/collection1/data/index.20140901140800014/write.lock
    at org.apache.lucene.store.NativeFSLock.release(NativeFSLockFactory.java:295)
    at org.apache.lucene.index.IndexWriter.unlock(IndexWriter.java:4162)
    at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:156)
    at org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:164)
    at org.apache.solr.update.DirectUpdateHandler2.newIndexWriter(DirectUpdateHandler2.java:624)
    at org.apache.solr.handler.SnapPuller.openNewWriterAndSearcher(SnapPuller.java:622)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:446)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:317)
    at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16:39:00
ERROR DefaultSolrCoreState
Error closing old IndexWriter.
core=collection1:java.lang.IllegalArgumentException: Unknown directory:
NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/home/miapp/collection1/data/index.20140901140800014
lockFactory=org.apache.lucene.store.NativeFSLockFactory@77b024a9;
maxCacheMB=48.0 maxMergeSizeMB=4.0)
{NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/home/miapp/collection1/data
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6de47a0f;
maxCacheMB=48.0
maxMergeSizeMB=4.0)=CachedDir<<refCount=0;path=/home/miapp/collection1/data;done=false>>,?
NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/home/miapp/collection1/data/index.20140901140800016
lockFactory=org.apache.lucene.store.NativeFSLockFactory@791e6360;
maxCacheMB=48.0
maxMergeSizeMB=4.0)=CachedDir<<refCount=-3;path=/home/miapp/collection1/data/index.20140901140800016;done=false>>,?
NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/home/miapp/collection1/data/index.20140901163900022
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6a6de862;
maxCacheMB=48.0
maxMergeSizeMB=4.0)=CachedDir<<refCount=1;path=/home/miapp/collection1/data/index.20140901163900022;done=false>>,?
NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/home/miapp/collection1/data/index.20140901140800014
lockFactory=org.apache.lucene.store.NativeFSLockFactory@5863c9e0;
maxCacheMB=48.0
maxMergeSizeMB=4.0)=CachedDir<<refCount=1;path=/home/miapp/collection1/data/index.20140901140800014;done=false>>}


ERROR SolrIndexWriter SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!

On Mon, Sep 1, 2014 at 4:28 PM, Shawn Heisey <so...@elyograg.org> wrote:

> We would need the entire stacktrace from that error message to make any
> real determination, but without that, I offer this:
>
> If it's happening with a tiny index (kilobytes or only a few megabytes),
> then I would suspect a bug in Solr.  4.3.1 is ancient history now --
> Solr's development and release schedule is very aggressive.  There have
> been ten new versions in the 14 months since 4.3.1 was announced, and
> the 4.10 release is imminent.  There have been a number of replication
> bugs fixed in those releases.
>
> If "any index size" means that some of them are in the range of several
> gigabytes, then it may simply be a configuration problem.  The
> commitReserveDuration parameter may need increasing beyond its default
> of ten seconds.  Large updates after optimizing or a major automatic
> merge can take many minutes to transfer, and if that parameter is set
> too low, the old index can disappear before it can finish replicating.
>
> Thanks,
> Shawn
>
>


-- 
*Mauri Ferreyra*
Cordoba - Argentina

Re: SolR replication issue

Posted by Shawn Heisey <so...@elyograg.org>.
On 9/1/2014 10:31 AM, Mauricio Ferreyra wrote:
> I'm using Solr 4.3.1 with a master/slave configuration.
> 
> Configuration:
> 
> Master:
>     <lst name="master">
>       <str name="replicateAfter">commit</str>
>       <str name="replicateAfter">startup</str>
>       <str name="confFiles">schema.xml,stopwords.txt</str>
>     </lst>
>
> Slave:
>     <lst name="slave">
>       <str name="masterUrl">http://10.xx.xx.xx:9081/solr</str>
>       <str name="pollInterval">00:00:60</str>
>     </lst>
> 
> The replication sometimes fails with the exception
> 
>     Error closing old IndexWriter. core=collection1:java.lang.IllegalArgumentException: Unknown directory: NRTCachingDirectory...
>     ReplicationHandler SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed :
> 
> This is happening with any index size.

We would need the entire stacktrace from that error message to make any
real determination, but without that, I offer this:

If it's happening with a tiny index (kilobytes or only a few megabytes),
then I would suspect a bug in Solr.  4.3.1 is ancient history now --
Solr's development and release schedule is very aggressive.  There have
been ten new versions in the 14 months since 4.3.1 was announced, and
the 4.10 release is imminent.  There have been a number of replication
bugs fixed in those releases.

If "any index size" means that some of them are in the range of several
gigabytes, then it may simply be a configuration problem.  The
commitReserveDuration parameter may need increasing beyond its default
of ten seconds.  Large updates after optimizing or a major automatic
merge can take many minutes to transfer, and if that parameter is set
too low, the old index can disappear before it can finish replicating.
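
As a rough sketch of that change, assuming the master's replication
handler is declared in solrconfig.xml in the usual way, the setting would
sit alongside the existing master options. The enclosing requestHandler
element is not shown in the original post and is assumed here, and the
00:05:00 value is only an illustrative guess, not a recommendation; it
should be longer than the slowest transfer you expect.

```xml
<!-- Assumed wrapper: the standard replication handler declaration in solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
    <!-- Illustrative value (hh:mm:ss): reserve each commit point for
         5 minutes instead of the 10-second default -->
    <str name="commitReserveDuration">00:05:00</str>
  </lst>
</requestHandler>
```

The core on the master would need to be reloaded or restarted for the new
value to take effect.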

Thanks,
Shawn