Posted to solr-user@lucene.apache.org by Fuad Efendi <fu...@efendi.ca> on 2009/11/02 20:27:47 UTC

RE: Lucene FieldCache memory requirements

Any thoughts on this? I hope FieldCache doesn't use more than
6 bytes per document-field instance... I am too lazy to dig through the Lucene
source code, so I hope someone can provide an exact answer... Thanks


> Subject: Lucene FieldCache memory requirements
> 
> Hi,
> 
> 
> Can anyone confirm Lucene FieldCache memory requirements? I have 100
> million docs with a non-tokenized field "country" (10 different countries);
> I expect it requires an array of ("int", "long"), size of array 100,000,000,
> without any impact from the "country" field length;
> 
> it requires 600,000,000 bytes: "int" is a pointer to the document (Lucene
> document ID), and "long" is a pointer to the String value...
> 
> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
> non-boolean) field and 100 million docs? I need to calculate the exact
> minimum RAM requirements...
> 
> I believe it shouldn't depend on the cardinality (distribution) of the field...
> 
> Thanks,
> Fuad
> 
> 
> 
> 
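A rough way to sanity-check this, assuming the cache uses a StringIndex-style
layout for a single-valued string field (one int ordinal per document plus one
String per distinct value); the getStrings form holds one object reference per
document instead, so either way the per-document cost is a few bytes and does
not depend on how long the country names are. A back-of-the-envelope sketch
only, not the exact accounting Lucene does:

  // Rough estimator, assuming a StringIndex-style layout: one int ordinal per
  // document plus one String per distinct value. The per-String overhead is an
  // approximation; exact numbers depend on the JVM and Lucene version.
  public class FieldCacheEstimate {

    static long estimateBytes(long numDocs, long distinctValues, int avgValueChars) {
      long ordinals = numDocs * 4L;                      // int[] with one entry per doc
      long perString = 40L + 2L * avgValueChars;         // ~object/array headers + UTF-16 chars
      long values = distinctValues * (8L + perString);   // reference + String object
      return ordinals + values;
    }

    public static void main(String[] args) {
      // 100,000,000 docs, 10 distinct countries of ~10 chars each:
      // the ordinal array (~400 MB) dominates; the 10 strings are negligible.
      System.out.println(estimateBytes(100000000L, 10, 10) + " bytes");
    }
  }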




RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Yes. We have tried Solr 1.4 and so far it has been a great success.

I am still investigating why Solr 1.3 gave us an issue like this.

Currently it seems to me that org.apache.lucene.index.SegmentInfos.FindSegmentsFile.run() is not able to figure out the correct segments file name. (Maybe an index replication issue -- leading to "not fully replicated".. but that is hard to believe, as both master and slave now have 100% identical data!)

Anyway, I will keep trying until I find something useful, and will let you know.


Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Tue, November 10, 2009 5:42:44 PM
> Subject: RE: Segment file not found error - after replicating
>
> Thanks Otis,
>
> I did the du -s for all three index directories as you said, right after
> replicating and when I found the errors.
>
> All three gave me the exact same value. This time I found the error in a
> rather small index too (31Mb).
>
> BTW, if I copy the segments_x file to the name Solr is looking for and
> restart the Solr web-app from the Tomcat manager, the problem goes away.
> But it's just a workaround, never good enough for production deployments.
>
> My next plan is to do a remote debug to see what exactly is happening in the code.
>
> Any other things I should be looking at?
> Any help is really appreciated on this matter.
>
> Thanks
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, 10 November 2009 1:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> Madu,
>
> So are you saying that all slaves have the exact same index, and that index is
> exactly the same as the one on the master, yet only some of those slaves exhibit
> this error, while others do not?  Mind listing index directories of 1) master 2)
> slave without errors, 3) slave with errors and doing:
> du -s /path/to/index/on/master
> du -s /path/to/index/on/slave/without/errors
> du -s /path/to/index/on/slave/with/errors
>
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
> > From: Maduranga Kannangara
> > To: "solr-user@lucene.apache.org"
> > Sent: Mon, November 9, 2009 7:47:04 PM
> > Subject: RE: Segment file not found error - after replicating
> >
> > Thanks Otis!
> >
> > Yes, I checked the index directories and they are 100% the same, both
> > timestamp- and size-wise.
> >
> > Not all the slaves face this issue. I would say roughly 50% have this trouble.
> >
> > The logs do not have any errors either :-(
> >
> > Any other things I should do/look at?
> >
> > Cheers
> > Madu
> >
> >
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> > Sent: Tuesday, 10 November 2009 9:26 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Segment file not found error - after replicating
> >
> > It's hard to troubleshoot blindly like this, but have you tried manually
> > comparing the contents of the index dir on the master and on the slave(s)?
> > If they are out of sync, have you tried forcing replication to see if one
> > of the subsequent replication attempts gets the dirs in sync?
> > Do you have more than 1 slave and do they all start having this problem at the
> > same time?
> > Any errors in the logs for any of the scripts involved in replication in 1.3?
> >
> > Otis
> > --
> > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >
> >
> >
> > ----- Original Message ----
> > > From: Maduranga Kannangara
> > > To: "solr-user@lucene.apache.org"
> > > Sent: Sun, November 8, 2009 10:30:44 PM
> > > Subject: Segment file not found error - after replicating
> > >
> > > Hi guys,
> > >
> > > We use Solr 1.3 for indexing large amounts of data (50G avg) in a Linux
> > > environment, and we use the replication scripts to create replicas that
> > > live on load-balancing slaves.
> > >
> > > The issue we face quite often (only on Linux servers) is that they are not
> > > able to find the segments file (segments_x etc.) after the replication has
> > > completed. As this has become quite common, it is now a serious issue.
> > >
> > > Below is a stack trace, if that helps; any help on this matter is greatly
> > > appreciated.
> > >
> > > --------------------------------
> > >
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created gap: org.apache.solr.highlight.GapFragmenter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created regex: org.apache.solr.highlight.RegexFragmenter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > INFO: created html: org.apache.solr.highlight.HtmlFormatter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > SEVERE: Could not start SOLR. Check solr/home property
> > > java.lang.RuntimeException: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
> > >         at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > >         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
> > >         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > >         at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > >         at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > >         at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > >         at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > >         at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > >         at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > >         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > >         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > >         at java.lang.Thread.run(Thread.java:619)
> > > Caused by: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at java.io.RandomAccessFile.open(Native Method)
> > >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> > >         at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
> > >         at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
> > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > >         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > >         ... 30 more
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
> > > SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
> > >         at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > >         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > >         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
> > >         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > >         at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > >         at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > >         at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > >         at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > >         at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > >         at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > >         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > >         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > >         at java.lang.Thread.run(Thread.java:619)
> > > Caused by: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at java.io.RandomAccessFile.open(Native Method)
> > >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> > >         at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
> > >         at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
> > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > >         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > >         ... 30 more
> > >
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > INFO: SolrDispatchFilter.init() done
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
> > > INFO: SolrServlet.init()
> > >
> > > --------------------------------
> > >
> > > Steps to reproduce the error (however, this did not work on my local box,
> > > and the remote server is too far away to remote-debug!):
> > >
> > > -  Post some new data to the master server (usually about 1Gb worth of text files)
> > > -  Run the replicate script on the slave Solr instance
> > > -  Try to log in to the admin on the slave Solr instance
> > >
> > > And you should see the above stack trace in the Tomcat output.
> > >
> > >
> > > Thanks in advance.
> > > Madu


RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Yes, I too believed so.

The logic in the method mentioned earlier does the "gen number calculation" using the segments files available on disk (genA) and using the segments.gen file content (genB). Whichever is larger becomes the generation number used to look up the segments file.

When that file is not properly replicated (because it is not written to the hard disk, or not rsync'ed) and the generation number in the segments.gen file (genB) is larger than the file-based calculation (genA), we hit the issue described earlier.
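
A minimal sketch of that pick-the-larger-generation logic, just to make the
failure mode concrete (illustrative only -- not the actual
SegmentInfos$FindSegmentsFile code, and these helper names are made up):

  // Illustrative only; not Lucene's real code.
  // genA: largest generation found by listing segments_N files in the index directory.
  static long genFromDirectoryListing(String[] fileNames) {
    long best = -1;
    for (String name : fileNames) {
      if (name.startsWith("segments_")) {
        // the suffix after "segments_" is the generation, encoded in base 36
        best = Math.max(best, Long.parseLong(name.substring("segments_".length()), 36));
      }
    }
    return best;
  }

  // genB: the generation recorded inside segments.gen. The reader then opens
  // segments_<max(genA, genB)>; if segments.gen is stale or ahead of the files
  // actually on disk, that target does not exist and we get exactly the
  // FileNotFoundException shown in the stack trace above.
  static long chooseGeneration(long genA, long genB) {
    return Math.max(genA, genB);
  }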

Cheers
Madu


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Monday, 16 November 2009 2:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

That's odd - that file is normally not used - it's a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - it's basically for NFS.

Maduranga Kannangara wrote:
> Just found out the root cause:
>
> * The segments.gen file does not get replicated to the slave all the time.
>
> For some reason, this small (20-byte) file lives in memory and does not get written to the master's hard disk. Therefore it is obviously not transferred to the slaves.
>
> The solution was to shut down the master web app (it must be a clean shutdown, not a kill of Tomcat) and then do the replication.
>
> Also, if the timestamp/size is not changed (the size won't change anyway!), rsync does not seem to copy this file over either. Forcing the copy in the replication scripts solved the problem.
>
> Thanks Otis and everyone for all your support!
>
> Madu
>
>


--
- Mark

http://www.lucidimagination.com




Re: Segment file not found error - after replicating

Posted by Mark Miller <ma...@gmail.com>.
Maduranga Kannangara wrote:
> The permanent solution we found was:
>
> 1. flush() before closing the segments.gen file write (in Lucene).
>   
Hmm ... but close does flush?


> 2. Remove the slave's segments.gen file before replication.
>
>
> Point 1 elaborated:
>
> Lucene 2.4, org.apache.lucene.index.SegmentInfos.finishCommit(Directory dir) method:
>
> The writing of the segments.gen file was changed to:
>
>   public final void finishCommit(Directory dir) throws IOException {
> .
> .
> .
>
>     try {
>       IndexOutput genOutput = dir.createOutput(IndexFileNames.SEGMENTS_GEN);
>       try {
>         genOutput.writeInt(FORMAT_LOCKLESS);
>         genOutput.writeLong(generation);
>         genOutput.writeLong(generation);
>       } finally {
>         genOutput.flush();   // this is the simple change!!!!!!!!!
>         genOutput.close();
>       }
>     } catch (Throwable t) {
>       // It's OK if we fail to write this file since it's
>       // used only as one of the retry fallbacks.
>     }
>
>   }
>
>
> I believe, if this makes sense, we should add this simple line in Lucene! :-)
>
>
> However, the Java replication in Solr 1.4, being an application-level process, should have already solved this issue in another way as well.
> We have yet to test it.
>
>
> Thanks
> Madu
>
>


-- 
- Mark

http://www.lucidimagination.com




RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
The permanent solution we found was to:

1. Call flush() before closing the segments.gen file write (in Lucene).
2. Remove the slave's segments.gen file before replication (a sketch of this step follows the code below).


Point 1 elaborated:

Lucene 2.4, org.apache.lucene.index.SegmentInfos.finishCommit(Directory dir) method:

The writing of the segments.gen file was changed to:

  public final void finishCommit(Directory dir) throws IOException {
    // ... (earlier part of the method unchanged)

    try {
      IndexOutput genOutput = dir.createOutput(IndexFileNames.SEGMENTS_GEN);
      try {
        genOutput.writeInt(FORMAT_LOCKLESS);
        genOutput.writeLong(generation);
        genOutput.writeLong(generation);
      } finally {
        genOutput.flush();   // this is the simple change!
        genOutput.close();
      }
    } catch (Throwable t) {
      // It's OK if we fail to write this file since it's
      // used only as one of the retry fallbacks.
    }

  }


I believe, if this makes sense, we should add this simple line to Lucene! :-)
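
Point 2 is much simpler. Here is a minimal sketch of that step in Java (illustrative only, not what we actually run; the index path is taken from the stack trace above and is just an example):

  // Sketch only: remove the slave's possibly-stale segments.gen before the
  // snapshot is pulled, so the generation lookup falls back to the
  // directory listing of segments_N files. The path is an example.
  import java.io.File;

  public class RemoveStaleSegmentsGen {
    public static void main(String[] args) {
      File gen = new File("/solrinstances/solrhome01/data/index/segments.gen");
      if (gen.exists() && gen.delete()) {
        System.out.println("Deleted " + gen.getAbsolutePath());
      } else {
        System.out.println("Nothing deleted at " + gen.getAbsolutePath());
      }
    }
  }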


However, the Java-based replication in Solr 1.4, being an application-level process, should have already solved this issue in another way as well.
We are yet to test it.


Thanks
Madu


-----Original Message-----
From: Maduranga Kannangara
Sent: Monday, 16 November 2009 2:39 PM
To: solr-user@lucene.apache.org
Subject: RE: Segment file not found error - after replicating

Yes, I believed so too.

The logic in the method mentioned earlier does the generation-number calculation in two ways: from the segments_N files available in the directory (genA) and from the contents of the segments.gen file (genB). Whichever is larger becomes the generation used to look up the segments file.

When segments.gen is not properly replicated (because it was never written to the hard disk, or was skipped by rsync) and the generation it records (genB) is larger than the file-based calculation (genA), we hit the issue described above.
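
To make that concrete, here is a rough sketch of the lookup (illustrative only, not the actual SegmentInfos$FindSegmentsFile code; the segments.gen layout of an int header followed by the generation written twice, and the base-36 suffix of segments_N, are what I see in the 2.4 source; the index path is just an example):

  // Illustrative sketch of the generation lookup. genA comes from the
  // directory listing, genB from segments.gen; the larger one wins, so a
  // stale segments.gen pointing past the replicated segments_N files makes
  // Lucene look for a file that is not there.
  import java.io.File;
  import java.io.RandomAccessFile;

  public class GenerationSketch {
    public static void main(String[] args) throws Exception {
      File indexDir = new File("/solrinstances/solrhome01/data/index");

      // genA: largest generation parsed from segments_N file names
      long genA = -1;
      String[] names = indexDir.list();
      if (names != null) {
        for (String name : names) {
          if (name.startsWith("segments_")) {
            genA = Math.max(genA,
                Long.parseLong(name.substring("segments_".length()), 36));
          }
        }
      }

      // genB: generation recorded in segments.gen (written twice after a header)
      long genB = -1;
      File genFile = new File(indexDir, "segments.gen");
      if (genFile.exists()) {
        RandomAccessFile in = new RandomAccessFile(genFile, "r");
        try {
          in.readInt();               // format header, ignored here
          long gen0 = in.readLong();
          long gen1 = in.readLong();
          if (gen0 == gen1) {
            genB = gen0;              // only trusted when both copies agree
          }
        } finally {
          in.close();
        }
      }

      long gen = Math.max(genA, genB);
      System.out.println("genA=" + genA + " genB=" + genB
          + " -> will open segments_" + Long.toString(gen, 36));
    }
  }

Run against the slave's index directory right after a pull, this makes a genA/genB mismatch visible immediately.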

Cheers
Madu


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Monday, 16 November 2009 2:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

Thats odd - that file is normally not used - its a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - its basically for NFS.

Maduranga Kannangara wrote:
> Just found out the root cause:
>
> * The segments.gen file does not get replicated to slave all the time.
>
> For some reason, this small (20bytes) file lives in memory and does not get updated to the master's hard disk. Therefore it is not obviously transferred to slaves.
>
> Solution was to shut down the master web app (must be a clean shut down!, not kill of Tomcat). Then do the replication.
>
> Also, if the timestamp/size (size won't change anyway!) is not changed, Rsync does not seem to copy over this file too. So enforcing in the replication scripts solved the problem.
>
> Thanks Otis and everyone for all your support!
>
> Madu
>
>
> -----Original Message-----
> From: Maduranga Kannangara
> Sent: Monday, 16 November 2009 12:37 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Segment file not found error - after replicating
>
> Yes. We have tried Solr 1.4 and so far its been great success.
>
> Still I am investigating why Solr 1.3 gave an issue like before.
>
> Currently seems to me org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to figure out correct segment file name. (May be index replication issue -- leading to "not fully replicated".. but its so hard to believe as both master and slave are having 100% same data now!)
>
> Anyway.. will keep on trying till I find something useful.. and will let you know.
>
>
> Thanks
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Wednesday, 11 November 2009 10:03 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> It sounds like your index is not being fully replicated.  I can't tell why, but I can suggest you try the new Solr 1.4 replication.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>
>> From: Maduranga Kannangara <mk...@infomedia.com.au>
>> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
>> Sent: Tue, November 10, 2009 5:42:44 PM
>> Subject: RE: Segment file not found error - after replicating
>>
>> Thanks Otis,
>>
>> I did the du -s for all three index directories as you said right after
>> replicating and when I find errors.
>>
>> All three gave me the exact same value. This time I found the error in a rather
>> small index too (31Mb).
>>
>> BTW, if I copy the segment_x file to what Solr is looking for, and restart the
>> Solr web-app from Tomcat manager, this resolves. But it's just a work around,
>> never good enough for the production deployments.
>>
>> My next plan is to do a remote debug to see what exactly happening in the code.
>>
>> Any other things I should looking at?
>> Any help is really appreciated on this matter.
>>
>> Thanks
>> Madu
>>
>>
>> -----Original Message-----
>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>> Sent: Tuesday, 10 November 2009 1:14 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Segment file not found error - after replicating
>>
>> Madu,
>>
>> So are you saying that all slaves have the exact same index, and that index is
>> exactly the same as the one on the master, yet only some of those slaves exhibit
>> this error, while others do not?  Mind listing index directories of 1) master 2)
>> slave without errors, 3) slave with errors and doing:
>> du -s /path/to/index/on/master
>> du -s /path/to/index/on/slave/without/errors
>> du -s /path/to/index/on/slave/with/errors
>>
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>>
>> ----- Original Message ----
>>
>>> From: Maduranga Kannangara
>>> To: "solr-user@lucene.apache.org"
>>> Sent: Mon, November 9, 2009 7:47:04 PM
>>> Subject: RE: Segment file not found error - after replicating
>>>
>>> Thanks Otis!
>>>
>>> Yes, I checked the index directories and they are 100% same, both timestamp
>>>
>> and
>>
>>> size wise.
>>>
>>> Not all the slaves face this issue. I would say roughly 50% has this trouble.
>>>
>>> Logs do not have any errors too :-(
>>>
>>> Any other things I should do/look at?
>>>
>>> Cheers
>>> Madu
>>>
>>>
>>> -----Original Message-----
>>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>>> Sent: Tuesday, 10 November 2009 9:26 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Segment file not found error - after replicating
>>>
>>> It's hard to troubleshoot blindly like this, but have you tried manually
>>> comparing the contents of the index dir on the master and on the slave(s)?
>>> If they are out of sync, have you tried forcing of replication to see if one
>>>
>> of
>>
>>> the subsequent replication attempts gets the dirs in sync?
>>> Do you have more than 1 slave and do they all start having this problem at the
>>> same time?
>>> Any errors in the logs for any of the scripts involved in replication in 1.3?
>>>
>>> Otis
>>> --
>>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>>
>>>
>>>
>>> ----- Original Message ----
>>>
>>>> From: Maduranga Kannangara
>>>> To: "solr-user@lucene.apache.org"
>>>> Sent: Sun, November 8, 2009 10:30:44 PM
>>>> Subject: Segment file not found error - after replicating
>>>>
>>>> Hi guys,
>>>>
>>>> We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
>>>> environment and use the replication scripts to make replicas those live in
>>>>
>>> load
>>>
>>>> balancing slaves.
>>>>
>>>> The issue we face quite often (only in Linux servers) is that they tend to
>>>>
>> not
>>
>>>> been able to find the segment file (segment_x etc) after the replicating
>>>> completed. As this has become quite common, we started hitting a serious
>>>>
>>> issue.
>>>
>>>> Below is a stack trace, if that helps and any help on this matter is greatly
>>>> appreciated.
>>>>
>>>> --------------------------------
>>>>
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created gap: org.apache.solr.highlight.GapFragmenter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created regex: org.apache.solr.highlight.RegexFragmenter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>
>> load
>>
>>>> INFO: created html: org.apache.solr.highlight.HtmlFormatter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
>>>> SEVERE: Could not start SOLR. Check solr/home property
>>>> java.lang.RuntimeException: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>>>>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>>>>         at
>>>>
>>>>
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>>
>>>>         at
>>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>>
>>>>         at
>>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>>>>         at
>>>> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>>>>         at
>>>> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>>
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>
>>>>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>
>>>>         at
>>>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>>>>         at
>>>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>>>>         at
>>>>
>>>>
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>
>>>>         at
>>>> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>>>>         at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at java.io.RandomAccessFile.open(Native Method)
>>>>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>>>>         at
>>>>
>>>>
>> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>>
>>>>         at
>>>> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>>>>         at
>>>>
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>>
>>>>         at
>>>>
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>>
>>>>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>>
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>>
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>>
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>>>>         ... 30 more
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
>>>> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>>>>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>>>>         at
>>>>
>>>>
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>>
>>>>         at
>>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>>
>>>>         at
>>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>>>>         at
>>>> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>>>>         at
>>>> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>>
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>
>>>>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>
>>>>         at
>>>>
>>>>
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>
>>>>         at
>>>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>>>>         at
>>>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>>>>         at
>>>>
>>>>
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>
>>>>         at
>>>> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>>>>         at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at java.io.RandomAccessFile.open(Native Method)
>>>>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>>>>         at
>>>>
>>>>
>> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>>
>>>>         at
>>>> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>>>>         at
>>>>
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>>
>>>>         at
>>>>
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>>
>>>>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>>
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>>
>>>>         at
>>>>
>>>>
>> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>>
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>>>>         ... 30 more
>>>>
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
>>>> INFO: SolrDispatchFilter.init() done
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
>>>> INFO: SolrServlet.init()
>>>>
>>>> --------------------------------
>>>>
>>>> Steps to re-produce the error (However, for me did not work in my local box.
>>>> Also remote server is too far away to remote-debug!).
>>>>
>>>> -  Post some new data to the master server (Usually about 1Gb worth text
>>>>
>>> files)
>>>
>>>> -  Run the replicate script in slave Solr instance
>>>> -  Try to login to admin in slave Solr instance
>>>>
>>>> And you should see above stack trace even in the Tomcat output.
>>>>
>>>>
>>>> Thanks in advance.
>>>> Madu
>>>>
>
>


--
- Mark

http://www.lucidimagination.com




Re: Segment file not found error - after replicating

Posted by Mark Miller <ma...@gmail.com>.
That's odd - that file is normally not used - it's a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - it's basically for NFS.

Maduranga Kannangara wrote:
> Just found out the root cause:
>
> * The segments.gen file does not get replicated to slave all the time.
>
> For some reason, this small (20bytes) file lives in memory and does not get updated to the master's hard disk. Therefore it is not obviously transferred to slaves.
>
> Solution was to shut down the master web app (must be a clean shut down!, not kill of Tomcat). Then do the replication.
>
> Also, if the timestamp/size (size won't change anyway!) is not changed, Rsync does not seem to copy over this file too. So enforcing in the replication scripts solved the problem.
>
> Thanks Otis and everyone for all your support!
>
> Madu
>
>
> -----Original Message-----
> From: Maduranga Kannangara
> Sent: Monday, 16 November 2009 12:37 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Segment file not found error - after replicating
>
> Yes. We have tried Solr 1.4 and so far its been great success.
>
> Still I am investigating why Solr 1.3 gave an issue like before.
>
> Currently seems to me org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to figure out correct segment file name. (May be index replication issue -- leading to "not fully replicated".. but its so hard to believe as both master and slave are having 100% same data now!)
>
> Anyway.. will keep on trying till I find something useful.. and will let you know.
>
>
> Thanks
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Wednesday, 11 November 2009 10:03 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> It sounds like your index is not being fully replicated.  I can't tell why, but I can suggest you try the new Solr 1.4 replication.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>   
>> From: Maduranga Kannangara <mk...@infomedia.com.au>
>> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
>> Sent: Tue, November 10, 2009 5:42:44 PM
>> Subject: RE: Segment file not found error - after replicating
>>
>> Thanks Otis,
>>
>> I did the du -s for all three index directories as you said right after
>> replicating and when I find errors.
>>
>> All three gave me the exact same value. This time I found the error in a rather
>> small index too (31Mb).
>>
>> BTW, if I copy the segment_x file to what Solr is looking for, and restart the
>> Solr web-app from Tomcat manager, this resolves. But it's just a work around,
>> never good enough for the production deployments.
>>
>> My next plan is to do a remote debug to see what exactly happening in the code.
>>
>> Any other things I should looking at?
>> Any help is really appreciated on this matter.
>>
>> Thanks
>> Madu
>>
>>
>> -----Original Message-----
>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>> Sent: Tuesday, 10 November 2009 1:14 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Segment file not found error - after replicating
>>
>> Madu,
>>
>> So are you saying that all slaves have the exact same index, and that index is
>> exactly the same as the one on the master, yet only some of those slaves exhibit
>> this error, while others do not?  Mind listing index directories of 1) master 2)
>> slave without errors, 3) slave with errors and doing:
>> du -s /path/to/index/on/master
>> du -s /path/to/index/on/slave/without/errors
>> du -s /path/to/index/on/slave/with/errors
>>
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>>
>> ----- Original Message ----
>>     
>>> From: Maduranga Kannangara
>>> To: "solr-user@lucene.apache.org"
>>> Sent: Mon, November 9, 2009 7:47:04 PM
>>> Subject: RE: Segment file not found error - after replicating
>>>
>>> Thanks Otis!
>>>
>>> Yes, I checked the index directories and they are 100% same, both timestamp
>>>       
>> and
>>     
>>> size wise.
>>>
>>> Not all the slaves face this issue. I would say roughly 50% has this trouble.
>>>
>>> Logs do not have any errors too :-(
>>>
>>> Any other things I should do/look at?
>>>
>>> Cheers
>>> Madu
>>>
>>>
>>> -----Original Message-----
>>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>>> Sent: Tuesday, 10 November 2009 9:26 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Segment file not found error - after replicating
>>>
>>> It's hard to troubleshoot blindly like this, but have you tried manually
>>> comparing the contents of the index dir on the master and on the slave(s)?
>>> If they are out of sync, have you tried forcing of replication to see if one
>>>       
>> of
>>     
>>> the subsequent replication attempts gets the dirs in sync?
>>> Do you have more than 1 slave and do they all start having this problem at the
>>> same time?
>>> Any errors in the logs for any of the scripts involved in replication in 1.3?
>>>
>>> Otis
>>> --
>>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>>
>>>
>>>
>>> ----- Original Message ----
>>>       
>>>> From: Maduranga Kannangara
>>>> To: "solr-user@lucene.apache.org"
>>>> Sent: Sun, November 8, 2009 10:30:44 PM
>>>> Subject: Segment file not found error - after replicating
>>>>
>>>> Hi guys,
>>>>
>>>> We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
>>>> environment and use the replication scripts to make replicas those live in
>>>>         
>>> load
>>>       
>>>> balancing slaves.
>>>>
>>>> The issue we face quite often (only in Linux servers) is that they tend to
>>>>         
>> not
>>     
>>>> been able to find the segment file (segment_x etc) after the replicating
>>>> completed. As this has become quite common, we started hitting a serious
>>>>         
>>> issue.
>>>       
>>>> Below is a stack trace, if that helps and any help on this matter is greatly
>>>> appreciated.
>>>>
>>>> --------------------------------
>>>>
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created gap: org.apache.solr.highlight.GapFragmenter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created regex: org.apache.solr.highlight.RegexFragmenter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
>>>>         
>> load
>>     
>>>> INFO: created html: org.apache.solr.highlight.HtmlFormatter
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
>>>> SEVERE: Could not start SOLR. Check solr/home property
>>>> java.lang.RuntimeException: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>>>>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>>>>         at
>>>>
>>>>         
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>>     
>>>>         at
>>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>>     
>>>>         at
>>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>>>>         at
>>>> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>>>>         at
>>>> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>>     
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>     
>>>>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>     
>>>>         at
>>>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>>>>         at
>>>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>>>>         at
>>>>
>>>>         
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>     
>>>>         at
>>>> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>>>>         at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at java.io.RandomAccessFile.open(Native Method)
>>>>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>>     
>>>>         at
>>>> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>>>>         at
>>>>         
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>>     
>>>>         at
>>>>         
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>>     
>>>>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>>     
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>>>>         ... 30 more
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
>>>> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>>>>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>>>>         at
>>>>
>>>>         
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>>     
>>>>         at
>>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>>     
>>>>         at
>>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>>>>         at
>>>> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>>>>         at
>>>> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>>     
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>     
>>>>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>     
>>>>         at
>>>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>>>>         at
>>>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>>>>         at
>>>>
>>>>         
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>     
>>>>         at
>>>> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>>>>         at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.io.FileNotFoundException:
>>>> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>>>>         at java.io.RandomAccessFile.open(Native Method)
>>>>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>>     
>>>>         at
>>>> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>>>>         at
>>>>         
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>>     
>>>>         at
>>>>         
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>>     
>>>>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>>     
>>>>         at
>>>>
>>>>         
>> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>>     
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>>>>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>>>>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>>>>         ... 30 more
>>>>
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
>>>> INFO: SolrDispatchFilter.init() done
>>>> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
>>>> INFO: SolrServlet.init()
>>>>
>>>> --------------------------------
>>>>
>>>> Steps to re-produce the error (However, for me did not work in my local box.
>>>> Also remote server is too far away to remote-debug!).
>>>>
>>>> -  Post some new data to the master server (Usually about 1Gb worth text
>>>>         
>>> files)
>>>       
>>>> -  Run the replicate script in slave Solr instance
>>>> -  Try to login to admin in slave Solr instance
>>>>
>>>> And you should see above stack trace even in the Tomcat output.
>>>>
>>>>
>>>> Thanks in advance.
>>>> Madu
>>>>         
>
>   


-- 
- Mark

http://www.lucidimagination.com




RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Just found out the root cause:

* The segments.gen file does not always get replicated to the slave.

For some reason, this small (20-byte) file lives in memory and does not get written out to the master's hard disk, so obviously it is not transferred to the slaves.

The solution was to shut down the master web app (it must be a clean shutdown, not a kill of Tomcat) and then do the replication.

Also, if the timestamp/size is unchanged (the size won't change anyway!), rsync does not seem to copy this file over either. So enforcing the copy in the replication scripts solved the problem.
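
The enforcement amounts to copying segments.gen across unconditionally after the rsync step. A rough sketch of that idea in Java (illustrative only, not the actual script; the snapshot and index paths are just examples):

  // Sketch only: copy segments.gen from the pulled snapshot into the
  // slave's live index unconditionally, so a timestamp/size check cannot
  // skip it. Paths below are examples.
  import java.io.File;
  import java.io.FileInputStream;
  import java.io.FileOutputStream;

  public class ForceCopySegmentsGen {
    public static void main(String[] args) throws Exception {
      File src = new File("/solrinstances/solrhome01/data/snapshot.latest/segments.gen");
      File dst = new File("/solrinstances/solrhome01/data/index/segments.gen");

      FileInputStream in = new FileInputStream(src);
      FileOutputStream out = new FileOutputStream(dst);   // overwrite, no timestamp check
      try {
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
          out.write(buf, 0, n);
        }
        out.getFD().sync();   // make sure it actually reaches the disk
      } finally {
        in.close();
        out.close();
      }
      System.out.println("Copied " + src + " -> " + dst);
    }
  }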

Thanks Otis and everyone for all your support!

Madu


-----Original Message-----
From: Maduranga Kannangara
Sent: Monday, 16 November 2009 12:37 PM
To: solr-user@lucene.apache.org
Subject: RE: Segment file not found error - after replicating

Yes. We have tried Solr 1.4 and so far its been great success.

Still I am investigating why Solr 1.3 gave an issue like before.

Currently seems to me org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to figure out correct segment file name. (May be index replication issue -- leading to "not fully replicated".. but its so hard to believe as both master and slave are having 100% same data now!)

Anyway.. will keep on trying till I find something useful.. and will let you know.


Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Tue, November 10, 2009 5:42:44 PM
> Subject: RE: Segment file not found error - after replicating
>
> Thanks Otis,
>
> I did the du -s for all three index directories as you said right after
> replicating and when I find errors.
>
> All three gave me the exact same value. This time I found the error in a rather
> small index too (31Mb).
>
> BTW, if I copy the segment_x file to what Solr is looking for, and restart the
> Solr web-app from Tomcat manager, this resolves. But it's just a work around,
> never good enough for the production deployments.
>
> My next plan is to do a remote debug to see what exactly happening in the code.
>
> Any other things I should looking at?
> Any help is really appreciated on this matter.
>
> Thanks
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, 10 November 2009 1:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> Madu,
>
> So are you saying that all slaves have the exact same index, and that index is
> exactly the same as the one on the master, yet only some of those slaves exhibit
> this error, while others do not?  Mind listing index directories of 1) master 2)
> slave without errors, 3) slave with errors and doing:
> du -s /path/to/index/on/master
> du -s /path/to/index/on/slave/without/errors
> du -s /path/to/index/on/slave/with/errors
>
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
> > From: Maduranga Kannangara
> > To: "solr-user@lucene.apache.org"
> > Sent: Mon, November 9, 2009 7:47:04 PM
> > Subject: RE: Segment file not found error - after replicating
> >
> > Thanks Otis!
> >
> > Yes, I checked the index directories and they are 100% same, both timestamp
> and
> > size wise.
> >
> > Not all the slaves face this issue. I would say roughly 50% has this trouble.
> >
> > Logs do not have any errors too :-(
> >
> > Any other things I should do/look at?
> >
> > Cheers
> > Madu
> >
> >
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> > Sent: Tuesday, 10 November 2009 9:26 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Segment file not found error - after replicating
> >
> > It's hard to troubleshoot blindly like this, but have you tried manually
> > comparing the contents of the index dir on the master and on the slave(s)?
> > If they are out of sync, have you tried forcing of replication to see if one
> of
> > the subsequent replication attempts gets the dirs in sync?
> > Do you have more than 1 slave and do they all start having this problem at the
> > same time?
> > Any errors in the logs for any of the scripts involved in replication in 1.3?
> >
> > Otis
> > --
> > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >
> >
> >
> > ----- Original Message ----
> > > From: Maduranga Kannangara
> > > To: "solr-user@lucene.apache.org"
> > > Sent: Sun, November 8, 2009 10:30:44 PM
> > > Subject: Segment file not found error - after replicating
> > >
> > > Hi guys,
> > >
> > > We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
> > > environment and use the replication scripts to make replicas those live in
> > load
> > > balancing slaves.
> > >
> > > The issue we face quite often (only in Linux servers) is that they tend to
> not
> >
> > > been able to find the segment file (segment_x etc) after the replicating
> > > completed. As this has become quite common, we started hitting a serious
> > issue.
> > >
> > > Below is a stack trace, if that helps and any help on this matter is greatly
> > > appreciated.
> > >
> > > --------------------------------
> > >
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created gap: org.apache.solr.highlight.GapFragmenter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created regex: org.apache.solr.highlight.RegexFragmenter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
> load
> > > INFO: created html: org.apache.solr.highlight.HtmlFormatter
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > SEVERE: Could not start SOLR. Check solr/home property
> > > java.lang.RuntimeException: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > >         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
> > >         at
> > >
> >
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > >         at
> > > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > >         at
> > > org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > >         at
> > > org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > >         at
> > > org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > >         at
> > >
> >
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > >         at
> > >
> >
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > >         at
> > >
> >
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > >         at
> > >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > >         at
> > >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > >         at
> > > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > >         at
> > > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > >         at
> > >
> >
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > >         at
> > > org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > >         at java.lang.Thread.run(Thread.java:619)
> > > Caused by: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at java.io.RandomAccessFile.open(Native Method)
> > >         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
> > >         at
> > >
> >
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
> > >         at
> > > org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
> > >         at
> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > >         at
> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > >         at
> > >
> >
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > >         at
> > >
> >
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > >         at
> > >
> >
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > >         ... 30 more
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
> > > SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > >         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
> > >         at
> > >
> >
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > >         at
> > > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > >         at
> > > org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > >         at
> > > org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > >         at
> > > org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > >         at
> > >
> >
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > >         at
> > >
> >
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > >         at
> > >
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > >         at
> > >
> >
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > >         at
> > >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > >         at
> > >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > >         at
> > >
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > >         at
> > > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > >         at
> > > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > >         at
> > >
> >
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > >         at
> > > org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > >         at java.lang.Thread.run(Thread.java:619)
> > > Caused by: java.io.FileNotFoundException:
> > > /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > >         at java.io.RandomAccessFile.open(Native Method)
> > >         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
> > >         at
> > >
> >
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
> > >         at
> > > org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
> > >         at
> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > >         at
> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > >         at
> > >
> >
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > >         at
> > >
> >
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > >         at
> > >
> >
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > >         ... 30 more
> > >
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > INFO: SolrDispatchFilter.init() done
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
> > > INFO: SolrServlet.init()
> > >
> > > --------------------------------
> > >
> > > Steps to re-produce the error (However, for me did not work in my local box.
> > > Also remote server is too far away to remote-debug!).
> > >
> > > -  Post some new data to the master server (Usually about 1Gb worth text
> > files)
> > > -  Run the replicate script in slave Solr instance
> > > -  Try to login to admin in slave Solr instance
> > >
> > > And you should see above stack trace even in the Tomcat output.
> > >
> > >
> > > Thanks in advance.
> > > Madu


Re: Segment file not found error - after replicating

Posted by Otis Gospodnetic <ot...@yahoo.com>.
It sounds like your index is not being fully replicated.  I can't tell why, but I can suggest you try the new Solr 1.4 replication.
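For what it's worth, here is a rough sketch of how you could sanity-check and force the 1.4 Java-based replication from the command line, assuming the ReplicationHandler is registered at /replication as in the 1.4 example solrconfig.xml (the host names and ports below are just placeholders for your boxes):

    # what index version does the master advertise, and what does the slave think it has?
    curl 'http://master:8983/solr/replication?command=indexversion'
    curl 'http://slave:8983/solr/replication?command=details'

    # force the slave to pull the current index now instead of waiting for its poll interval
    curl 'http://slave:8983/solr/replication?command=fetchindex'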

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Tue, November 10, 2009 5:42:44 PM
> Subject: RE: Segment file not found error - after replicating
> 
> Thanks Otis,
> 
> I did the du -s for all three index directories as you said right after 
> replicating and when I find errors.
> 
> All three gave me the exact same value. This time I found the error in a rather 
> small index too (31Mb).
> 
> BTW, if I copy the segment_x file to what Solr is looking for, and restart the 
> Solr web-app from Tomcat manager, this resolves. But it's just a work around, 
> never good enough for the production deployments.
> 
> My next plan is to do a remote debug to see what exactly happening in the code.
> 
> Any other things I should looking at?
> Any help is really appreciated on this matter.
> 
> Thanks
> Madu
> 
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, 10 November 2009 1:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
> 
> Madu,
> 
> So are you saying that all slaves have the exact same index, and that index is 
> exactly the same as the one on the master, yet only some of those slaves exhibit 
> this error, while others do not?  Mind listing index directories of 1) master 2) 
> slave without errors, 3) slave with errors and doing:
> du -s /path/to/index/on/master
> du -s /path/to/index/on/slave/without/errors
> du -s /path/to/index/on/slave/with/errors
> 
> 
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> 
> 
> 
> ----- Original Message ----
> > From: Maduranga Kannangara 
> > To: "solr-user@lucene.apache.org" 
> > Sent: Mon, November 9, 2009 7:47:04 PM
> > Subject: RE: Segment file not found error - after replicating
> >
> > Thanks Otis!
> >
> > Yes, I checked the index directories and they are 100% same, both timestamp 
> and
> > size wise.
> >
> > Not all the slaves face this issue. I would say roughly 50% has this trouble.
> >
> > Logs do not have any errors too :-(
> >
> > Any other things I should do/look at?
> >
> > Cheers
> > Madu
> >
> >
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> > Sent: Tuesday, 10 November 2009 9:26 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Segment file not found error - after replicating
> >
> > It's hard to troubleshoot blindly like this, but have you tried manually
> > comparing the contents of the index dir on the master and on the slave(s)?
> > If they are out of sync, have you tried forcing of replication to see if one 
> of
> > the subsequent replication attempts gets the dirs in sync?
> > Do you have more than 1 slave and do they all start having this problem at the
> > same time?
> > Any errors in the logs for any of the scripts involved in replication in 1.3?
> >
> > Otis
> > --
> > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >
> >
> >
> > ----- Original Message ----
> > > From: Maduranga Kannangara
> > > To: "solr-user@lucene.apache.org"
> > > Sent: Sun, November 8, 2009 10:30:44 PM
> > > Subject: Segment file not found error - after replicating
> > >
> > > Hi guys,
> > >
> > > We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
> > > environment and use the replication scripts to make replicas those live in
> > load
> > > balancing slaves.
> > >
> > > The issue we face quite often (only in Linux servers) is that they tend to 
> not
> >
> > > been able to find the segment file (segment_x etc) after the replicating
> > > completed. As this has become quite common, we started hitting a serious
> > issue.
> > >
> > > Below is a stack trace, if that helps and any help on this matter is greatly
> > > appreciated.
> > >
> > > --------------------------------
> > >
> > > [startup log and full stack trace snipped - it is quoted in full further down the thread]
> > >
> > > --------------------------------
> > >
> > > Steps to re-produce the error (However, for me did not work in my local box.
> > > Also remote server is too far away to remote-debug!).
> > >
> > > -  Post some new data to the master server (Usually about 1Gb worth text
> > files)
> > > -  Run the replicate script in slave Solr instance
> > > -  Try to login to admin in slave Solr instance
> > >
> > > And you should see above stack trace even in the Tomcat output.
> > >
> > >
> > > Thanks in advance.
> > > Madu


RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Thanks Otis,

I ran du -s on all three index directories as you suggested, both right after replicating and again when I saw the errors.

All three gave me exactly the same value. This time I hit the error on a rather small index too (31 MB).

BTW, if I copy the segments_x file to where Solr is looking for it and restart the Solr web-app from the Tomcat manager, the problem goes away. But that is just a workaround, not good enough for production deployments.
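In case it helps anyone else, the workaround is roughly the following sketch. It assumes the segments file the slave is asking for can be copied from the master's index; the paths are from my setup (the slave path is the one in the stack trace) and the manager credentials and context path are placeholders:

    # copy the missing segments generation file across from the master
    scp master:/solrinstances/solrhome01/data/index/segments_v /solrinstances/solrhome01/data/index/

    # reload just the Solr web-app through the Tomcat 6 manager, no full Tomcat restart needed
    curl -u admin:password 'http://localhost:8080/manager/reload?path=/solr'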

My next plan is to do a remote debug to see what exactly is happening in the code.
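(For the record, the remote debug is just standard JPDA debugging against Tomcat, something along these lines, with 8000 as an arbitrary port:)

    # start the slave's Tomcat with the debug agent listening
    JPDA_ADDRESS=8000 JPDA_TRANSPORT=dt_socket $CATALINA_HOME/bin/catalina.sh jpda start

    # then attach the IDE debugger to slave-host:8000 and set a breakpoint in
    # org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run()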

Any other things I should be looking at?
Any help is really appreciated on this matter.

Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
Sent: Tuesday, 10 November 2009 1:14 PM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

Madu,

So are you saying that all slaves have the exact same index, and that index is exactly the same as the one on the master, yet only some of those slaves exhibit this error, while others do not?  Mind listing index directories of 1) master 2) slave without errors, 3) slave with errors and doing:
du -s /path/to/index/on/master
du -s /path/to/index/on/slave/without/errors
du -s /path/to/index/on/slave/with/errors


Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Mon, November 9, 2009 7:47:04 PM
> Subject: RE: Segment file not found error - after replicating
>
> Thanks Otis!
>
> Yes, I checked the index directories and they are 100% same, both timestamp and
> size wise.
>
> Not all the slaves face this issue. I would say roughly 50% has this trouble.
>
> Logs do not have any errors too :-(
>
> Any other things I should do/look at?
>
> Cheers
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, 10 November 2009 9:26 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> It's hard to troubleshoot blindly like this, but have you tried manually
> comparing the contents of the index dir on the master and on the slave(s)?
> If they are out of sync, have you tried forcing of replication to see if one of
> the subsequent replication attempts gets the dirs in sync?
> Do you have more than 1 slave and do they all start having this problem at the
> same time?
> Any errors in the logs for any of the scripts involved in replication in 1.3?
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
> > From: Maduranga Kannangara
> > To: "solr-user@lucene.apache.org"
> > Sent: Sun, November 8, 2009 10:30:44 PM
> > Subject: Segment file not found error - after replicating
> >
> > Hi guys,
> >
> > We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
> > environment and use the replication scripts to make replicas those live in
> load
> > balancing slaves.
> >
> > The issue we face quite often (only in Linux servers) is that they tend to not
>
> > been able to find the segment file (segment_x etc) after the replicating
> > completed. As this has become quite common, we started hitting a serious
> issue.
> >
> > Below is a stack trace, if that helps and any help on this matter is greatly
> > appreciated.
> >
> > --------------------------------
> >
> > [startup log and full stack trace snipped - it is quoted in full further down the thread]
> >
> > --------------------------------
> >
> > Steps to re-produce the error (However, for me did not work in my local box.
> > Also remote server is too far away to remote-debug!).
> >
> > -  Post some new data to the master server (Usually about 1Gb worth text
> files)
> > -  Run the replicate script in slave Solr instance
> > -  Try to login to admin in slave Solr instance
> >
> > And you should see above stack trace even in the Tomcat output.
> >
> >
> > Thanks in advance.
> > Madu


Re: Segment file not found error - after replicating

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Madu,

So are you saying that all slaves have the exact same index, and that index is exactly the same as the one on the master, yet only some of those slaves exhibit this error, while others do not?  Mind listing index directories of 1) master 2) slave without errors, 3) slave with errors and doing:
du -s /path/to/index/on/master
du -s /path/to/index/on/slave/without/errors
du -s /path/to/index/on/slave/with/errors
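Also, du -s alone can hide which files are actually present, so a listing of the segments* files on each box would be useful too - something like the following, with the same placeholder paths as above (if I remember right, segments.gen records the generation the reader expects to find, so it is worth comparing as well):

    ls -l /path/to/index/on/master | grep segments
    ls -l /path/to/index/on/slave/without/errors | grep segments
    ls -l /path/to/index/on/slave/with/errors | grep segments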


Otis 
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Mon, November 9, 2009 7:47:04 PM
> Subject: RE: Segment file not found error - after replicating
> 
> Thanks Otis!
> 
> Yes, I checked the index directories and they are 100% same, both timestamp and 
> size wise.
> 
> Not all the slaves face this issue. I would say roughly 50% has this trouble.
> 
> Logs do not have any errors too :-(
> 
> Any other things I should do/look at?
> 
> Cheers
> Madu
> 
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
> Sent: Tuesday, 10 November 2009 9:26 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
> 
> It's hard to troubleshoot blindly like this, but have you tried manually 
> comparing the contents of the index dir on the master and on the slave(s)?
> If they are out of sync, have you tried forcing of replication to see if one of 
> the subsequent replication attempts gets the dirs in sync?
> Do you have more than 1 slave and do they all start having this problem at the 
> same time?
> Any errors in the logs for any of the scripts involved in replication in 1.3?
> 
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> 
> 
> 
> ----- Original Message ----
> > From: Maduranga Kannangara 
> > To: "solr-user@lucene.apache.org" 
> > Sent: Sun, November 8, 2009 10:30:44 PM
> > Subject: Segment file not found error - after replicating
> > 
> > Hi guys,
> > 
> > We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux 
> > environment and use the replication scripts to make replicas those live in 
> load 
> > balancing slaves.
> > 
> > The issue we face quite often (only in Linux servers) is that they tend to not 
> 
> > been able to find the segment file (segment_x etc) after the replicating 
> > completed. As this has become quite common, we started hitting a serious 
> issue.
> > 
> > Below is a stack trace, if that helps and any help on this matter is greatly 
> > appreciated.
> > 
> > --------------------------------
> > 
> > [startup log and full stack trace snipped - it is quoted in full further down the thread]
> > 
> > --------------------------------
> > 
> > Steps to re-produce the error (However, for me did not work in my local box. 
> > Also remote server is too far away to remote-debug!).
> > 
> > -  Post some new data to the master server (Usually about 1Gb worth text 
> files)
> > -  Run the replicate script in slave Solr instance
> > -  Try to login to admin in slave Solr instance 
> > 
> > And you should see above stack trace even in the Tomcat output.
> > 
> > 
> > Thanks in advance.
> > Madu


RE: Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Thanks Otis!

Yes, I checked the index directories and they are 100% the same, both timestamp- and size-wise.
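(To rule out a silent content difference I will also checksum the two directories rather than trusting size and timestamp alone - roughly the following, where the output file names are made up:)

    cd /solrinstances/solrhome01/data/index && md5sum * > /tmp/slave.md5
    # run the same on the master's index directory, copy the result over, then
    diff /tmp/master.md5 /tmp/slave.md5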

Not all the slaves face this issue. I would say roughly 50% of them have this trouble.

The logs do not show any errors either :-(

Any other things I should do/look at?

Cheers
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: Tuesday, 10 November 2009 9:26 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It's hard to troubleshoot blindly like this, but have you tried manually comparing the contents of the index dir on the master and on the slave(s)?
If they are out of sync, have you tried forcing of replication to see if one of the subsequent replication attempts gets the dirs in sync?
Do you have more than 1 slave and do they all start having this problem at the same time?
Any errors in the logs for any of the scripts involved in replication in 1.3?

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Sun, November 8, 2009 10:30:44 PM
> Subject: Segment file not found error - after replicating
> 
> Hi guys,
> 
> We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux 
> environment and use the replication scripts to make replicas those live in load 
> balancing slaves.
> 
> The issue we face quite often (only in Linux servers) is that they tend to not 
> been able to find the segment file (segment_x etc) after the replicating 
> completed. As this has become quite common, we started hitting a serious issue.
> 
> Below is a stack trace, if that helps and any help on this matter is greatly 
> appreciated.
> 
> --------------------------------
> 
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created gap: org.apache.solr.highlight.GapFragmenter
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created regex: org.apache.solr.highlight.RegexFragmenter
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created html: org.apache.solr.highlight.HtmlFormatter
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> SEVERE: Could not start SOLR. Check solr/home property
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>         at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>         at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>         at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>         at 
> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>         at 
> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>         at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>         at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>         at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>         at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>         at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>         at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at java.io.RandomAccessFile.open(Native Method)
>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>         at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>         at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>         at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>         ... 30 more
> Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>         at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>         at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>         at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>         at 
> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>         at 
> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>         at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>         at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>         at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>         at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>         at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>         at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at java.io.RandomAccessFile.open(Native Method)
>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>         at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>         at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>         at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>         ... 30 more
> 
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init() done
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
> INFO: SolrServlet.init()
> 
> --------------------------------
> 
> Steps to re-produce the error (However, for me did not work in my local box. 
> Also remote server is too far away to remote-debug!).
> 
> -  Post some new data to the master server (Usually about 1Gb worth text files)
> -  Run the replicate script in slave Solr instance
> -  Try to login to admin in slave Solr instance 
> 
> And you should see above stack trace even in the Tomcat output.
> 
> 
> Thanks in advance.
> Madu


Re: Segment file not found error - after replicating

Posted by Otis Gospodnetic <ot...@yahoo.com>.
It's hard to troubleshoot blindly like this, but have you tried manually comparing the contents of the index dir on the master and on the slave(s)?
If they are out of sync, have you tried forcing a replication to see if one of the subsequent replication attempts gets the dirs back in sync?
Do you have more than 1 slave and do they all start having this problem at the same time?
Any errors in the logs for any of the scripts involved in replication in 1.3?
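If you want to watch one replication end to end, the 1.3 script-based replication can also be run by hand on a slave - roughly the following, assuming the stock distribution scripts under the Solr home's bin/ directory (flags are from memory, so check the usage notes in the scripts themselves; they normally write their logs under logs/ in the Solr home):

    bin/snappuller -v      # pull the latest snapshot from the master
    bin/snapinstaller -v   # install it and trigger a commit on the slave
    tail logs/snappuller.log logs/snapinstaller.log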

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mk...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Sun, November 8, 2009 10:30:44 PM
> Subject: Segment file not found error - after replicating
> 
> Hi guys,
> 
> We use Solr 1.3 for indexing large amounts of data (50G avg) in a Linux 
> environment, and we use the replication scripts to create replicas that live on 
> load-balancing slaves.
> 
> The issue we face quite often (only on Linux servers) is that they fail to find 
> the segment file (segments_x etc.) after replication completes. As this has 
> become quite common, it is now a serious issue for us.
> 
> Below is a stack trace, in case it helps; any help on this matter is greatly 
> appreciated.
> 
> --------------------------------
> 
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created gap: org.apache.solr.highlight.GapFragmenter
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created regex: org.apache.solr.highlight.RegexFragmenter
> Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created html: org.apache.solr.highlight.HtmlFormatter
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> SEVERE: Could not start SOLR. Check solr/home property
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>         at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>         at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>         at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>         at 
> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>         at 
> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>         at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>         at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>         at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>         at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>         at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>         at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at java.io.RandomAccessFile.open(Native Method)
>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>         at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>         at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>         at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>         ... 30 more
> Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>         at org.apache.solr.core.SolrCore.(SolrCore.java:470)
>         at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
>         at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
>         at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
>         at 
> org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
>         at 
> org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
>         at 
> org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
>         at 
> org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
>         at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>         at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>         at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>         at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
>         at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>         at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.FileNotFoundException: 
> /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
>         at java.io.RandomAccessFile.open(Native Method)
>         at java.io.RandomAccessFile.(RandomAccessFile.java:212)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
>         at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>         at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
>         at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>         at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>         ... 30 more
> 
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init() done
> Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
> INFO: SolrServlet.init()
> 
> --------------------------------
> 
> Steps to reproduce the error (however, this did not work on my local box, and 
> the remote server is too far away to remote-debug!):
> 
> -  Post some new data to the master server (usually about 1GB worth of text files)
> -  Run the replication script on the slave Solr instance
> -  Try to log in to the admin page on the slave Solr instance
> 
> You should then see the above stack trace even in the Tomcat output.
> 
> 
> Thanks in advance.
> Madu


Segment file not found error - after replicating

Posted by Maduranga Kannangara <mk...@infomedia.com.au>.
Hi guys,

We use Solr 1.3 for indexing large amounts of data (50G avg) in a Linux environment, and we use the replication scripts to create replicas that live on load-balancing slaves.

The issue we face quite often (only on Linux servers) is that they fail to find the segment file (segments_x etc.) after replication completes. As this has become quite common, it is now a serious issue for us.

Below is a stack trace, in case it helps; any help on this matter is greatly appreciated.

--------------------------------

Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created gap: org.apache.solr.highlight.GapFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created regex: org.apache.solr.highlight.RegexFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created html: org.apache.solr.highlight.HtmlFormatter
Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
        at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
        at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
        at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
        at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
        at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
        at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
        at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
        ... 30 more
Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
        at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
        at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
        at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
        at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
        at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
        at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
        at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
        ... 30 more

Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init() done
Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init()

--------------------------------

Steps to reproduce the error (however, this did not work on my local box, and the remote server is too far away to remote-debug!):

-  Post some new data to the master server (usually about 1GB worth of text files)
-  Run the replication script on the slave Solr instance
-  Try to log in to the admin page on the slave Solr instance

You should then see the above stack trace even in the Tomcat output.


Thanks in advance.
Madu



RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Sorry Mike, Mark, I am confused again...

Yes, I need some more memory for processing ("while FieldCache is being
loaded"), obviously, but that was not the main subject...

With StringIndexCache, I have 10 arrays (the cardinality of this field is 10)
storing (int) Lucene document IDs.

> Except: as Mark said, you'll also need transient memory = pointer (4
> or 8 bytes) * (1+maxdoc), while the FieldCache is being loaded.

Ok, I see it:
      final int[] retArray = new int[reader.maxDoc()];
      String[] mterms = new String[reader.maxDoc()+1];

I can't trace it right now (limited in time); I think mterms is a local variable
and will size down to 0...



So the correct formula is... a weird one... if you don't want unexpected OOMs
or an overloaded GC (WeakHashMaps...):

      [some heap] + [Non-Tokenized_Field_Count] x [maxdoc] x [4 bytes + 8 bytes]

(for 64-bit)


-Fuad
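
As a rough back-of-the-envelope check of that formula, a minimal sketch
(assuming maxdoc = 100 million, three non-tokenized fields and a 64-bit JVM;
the numbers are illustrative only, not a definitive sizing):

// Steady-state vs. loading-time estimate for the StringIndex FieldCache,
// following the discussion above: 4 bytes per doc per field once loaded,
// plus a transient String[maxDoc+1] of 8-byte references while loading.
public class FieldCacheEstimate {
  public static void main(String[] args) {
    long maxDoc = 100000000L;        // illustrative document count
    int nonTokenizedFields = 3;      // illustrative field count

    long steadyPerField = maxDoc * 4L;           // int[] order: one slot per doc
    long transientPerField = (maxDoc + 1) * 8L;  // String[maxDoc+1] of references

    long steady = nonTokenizedFields * steadyPerField;
    long peak   = steady + transientPerField;    // assumes one field loads at a time

    System.out.println("steady state : " + (steady / (1024 * 1024)) + " MB");
    System.out.println("loading peak : " + (peak / (1024 * 1024)) + " MB");
  }
}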


> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: November-03-09 5:00 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
> 
> On Mon, Nov 2, 2009 at 9:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > I believe this is correct estimate:
> >
> >> C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
> >>
> >>   same as
> >> [String1_Document_Count + ... + String10_Document_Count + ...]
> >> x [4 bytes per DocumentID]
> 
> That's right.
> 
> Except: as Mark said, you'll also need transient memory = pointer (4
> or 8 bytes) * (1+maxdoc), while the FieldCache is being loaded.  After
> it's done being loaded, this sizes down to the number of unique terms.
> 
> But, if Lucene did the basic int packing, which really we should do,
> since you only have 10 unique values, with a naive 4 bits per doc
> encoding, you'd only need 1/8th the memory usage.  We could do a bit
> better by encoding more than one document at a time...
> 
> Mike



Re: Lucene FieldCache memory requirements

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Mon, Nov 2, 2009 at 9:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> I believe this is correct estimate:
>
>> C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
>>
>>   same as
>> [String1_Document_Count + ... + String10_Document_Count + ...]
>> x [4 bytes per DocumentID]

That's right.

Except: as Mark said, you'll also need transient memory = pointer (4
or 8 bytes) * (1+maxdoc), while the FieldCache is being loaded.  After
it's done being loaded, this sizes down to the number of unique terms.

But, if Lucene did the basic int packing, which really we should do,
since you only have 10 unique values, with a naive 4 bits per doc
encoding, you'd only need 1/8th the memory usage.  We could do a bit
better by encoding more than one document at a time...

Mike
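
To make the bit-packing idea concrete, a minimal sketch of packing one 4-bit
term ordinal per document into a long[] (just an illustration of the idea, not
the LUCENE-1990 implementation):

// Illustrative 4-bits-per-document packing: enough for up to 16 distinct
// ordinals (e.g. 10 countries), i.e. roughly 1/8th of a plain int[] per doc.
public class PackedOrdinals {
  private static final int BITS = 4;
  private static final long MASK = (1L << BITS) - 1;
  private final long[] blocks;                 // 16 ordinals per long

  public PackedOrdinals(int maxDoc) {
    blocks = new long[(maxDoc * BITS + 63) / 64];
  }

  // Assumes the slot has not been written yet (the array starts zeroed).
  public void set(int docId, int ordinal) {
    int bit = docId * BITS;
    blocks[bit >>> 6] |= ((long) ordinal & MASK) << (bit & 63);
  }

  public int get(int docId) {
    int bit = docId * BITS;
    return (int) ((blocks[bit >>> 6] >>> (bit & 63)) & MASK);
  }

  public static void main(String[] args) {
    PackedOrdinals ords = new PackedOrdinals(100);
    ords.set(42, 7);                           // doc 42 -> term ordinal 7
    System.out.println(ords.get(42));          // prints 7
  }
}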

RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
I believe this is correct estimate:

> C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
>
>   same as 
> [String1_Document_Count + ... + String10_Document_Count + ...] 
> x [4 bytes per DocumentID]


So, for 100 million docs we need 400Mb for each(!) non-tokenized field.
Although FieldCacheImpl is based on a WeakHashMap (somewhere...), we can't
rely on "sizing down" with SOLR faceting features...


I think I finally found the answer...

  /** Expert: Stores term text values and document ordering data. */
  public static class StringIndex {
    ...	  
    /** All the term values, in natural order. */
    public final String[] lookup;

    /** For each document, an index into the lookup array. */
    public final int[] order;
    ...
  }



Another API:
  /** Checks the internal cache for an appropriate entry, and if none
   * is found, reads the term values in <code>field</code> and returns an array
   * of size <code>reader.maxDoc()</code> containing the value each document
   * has in the given field.
   * @param reader  Used to get field values.
   * @param field   Which field contains the strings.
   * @return The values in the given field for each document.
   * @throws IOException  If any error occurs.
   */
  public String[] getStrings (IndexReader reader, String field)
  throws IOException;


Looks similar; the cache size is [maxdoc]; however, the values stored are
8-byte pointers on a 64-bit JVM.


  private Map<Class<?>,Cache> caches;
  private synchronized void init() {
    caches = new HashMap<Class<?>,Cache>(7);
    ...
    caches.put(String.class, new StringCache(this));
    caches.put(StringIndex.class, new StringIndexCache(this));
    ...
  }


StringCache and StringIndexCache use a WeakHashMap internally... but the
objects won't ever be garbage collected in a "faceted" production system...

SOLR SimpleFacets doesn't use the "getStrings" API, so the hope is that memory
requirements are minimized.


However, Lucene may use it internally for some queries (or, for instance, to
get access to a non-tokenized cached field without reading the index)... to be
safe, use this in your basic memory estimates:


[512Mb ~ 1Gb] + [non_tokenized_fields_count] x [maxdoc] x [8 bytes]


-Fuad
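
For reference, a minimal sketch of counting facet values straight from the
StringIndex arrays described above (written against the FieldCache API quoted
in this thread; the field name "country" is only an example, and deleted
documents are not skipped):

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

// Sketch: per-term counts over all documents, read directly from the cached
// StringIndex (lookup = unique terms, order = per-doc term ordinal).
public class CountryCounts {
  public static int[] countAll(IndexReader reader) throws IOException {
    FieldCache.StringIndex si = FieldCache.DEFAULT.getStringIndex(reader, "country");
    int[] counts = new int[si.lookup.length];  // one slot per unique term (slot 0 = no value)
    for (int docId = 0; docId < reader.maxDoc(); docId++) {
      counts[si.order[docId]]++;               // touches only the in-memory arrays
    }
    return counts;
  }
}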



> -----Original Message-----
> From: Fuad Efendi [mailto:fuad@efendi.ca]
> Sent: November-02-09 7:37 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Lucene FieldCache memory requirements
> 
> 
> Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no
> difference between maxdoc and maxdoc + 1 for such estimate... difference
is
> between 0.4Gb and 1.2Gb...
> 
> 
> So, let's vote ;)
> 
> A. [maxdoc] x [8 bytes ~ pointer to String object]
> 
> B. [maxdoc] x [8 bytes ~ pointer to Document object]
> 
> C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
> - same as [String1_Document_Count + ... + String10_Document_Count] x [4
> bytes ~ DocumentID]
> 
> D. [maxdoc] x [4 bytes + 8 bytes ~ my initial naive thinking...]
> 
> 
> Please confirm that it is Pointer to Object and not Lucene Document ID...
I
> hope it is (int) Document ID...
> 
> 
> 
> 
> 
> > -----Original Message-----
> > From: Mark Miller [mailto:markrmiller@gmail.com]
> > Sent: November-02-09 6:52 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Lucene FieldCache memory requirements
> >
> > It also briefly requires more memory than just that - it allocates an
> > array the size of maxdoc+1 to hold the unique terms - and then sizes
down.
> >
> > Possibly we can use the getUniqueTermCount method in the flexible
> > indexing branch to get rid of that - which is why I was thinking it
> > might be a good idea to drop the unsupported exception in that method
> > for things like multi reader and just do the work to get the right
> > number (currently there is a comment that the user should do that work
> > if necessary, making the call unreliable for this).
> >
> > Fuad Efendi wrote:
> > > Thank you very much Mike,
> > >
> > > I found it:
> > > org.apache.solr.request.SimpleFacets
> > > ...
> > >         // TODO: future logic could use filters instead of the
> fieldcache if
> > >         // the number of terms in the field is small enough.
> > >         counts = getFieldCacheCounts(searcher, base, field,
> offset,limit,
> > > mincount, missing, sort, prefix);
> > > ...
> > >     FieldCache.StringIndex si =
> > > FieldCache.DEFAULT.getStringIndex(searcher.getReader(), fieldName);
> > >     final String[] terms = si.lookup;
> > >     final int[] termNum = si.order;
> > > ...
> > >
> > >
> > > So that 64-bit requires more memory :)
> > >
> > >
> > > Mike, am I right here?
> > > [(8 bytes pointer) + (4 bytes DocID)] x [Number of Documents
(100mlns)]
> > > (64-bit JVM)
> > > 1.2Gb RAM for this...
> > >
> > > Or, may be I am wrong:
> > >
> > >> For Lucene directly, simple strings would consume an pointer (4 or 8
> > >> bytes depending on whether your JRE is 64bit) per doc, and the string
> > >> index would consume an int (4 bytes) per doc.
> > >>
> > >
> > > [8 bytes (64bit)] x [number of documents (100mlns)]?
> > > 0.8Gb
> > >
> > > Kind of Map between String and DocSet, saving 4 bytes... "Key" is
> String,
> > > and "Value" is array of 64-bit pointers to Document. Why 64-bit (for
> 64-bit
> > > JVM)? I always thought it is (int) documentId...
> > >
> > > Am I right?
> > >
> > >
> > > Thanks for pointing to
http://issues.apache.org/jira/browse/LUCENE-1990!
> > >
> > >
> > >>> Note that for your use case, this is exceptionally wasteful.
> > >>>
> > > This is probably very common case... I think it should be confirmed by
> > > Lucene developers too... FieldCache is warmed anyway, even when we
don't
> use
> > > SOLR...
> > >
> > >
> > > -Fuad
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >> -----Original Message-----
> > >> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> > >> Sent: November-02-09 6:00 PM
> > >> To: solr-user@lucene.apache.org
> > >> Subject: Re: Lucene FieldCache memory requirements
> > >>
> > >> OK I think someone who knows how Solr uses the fieldCache for this
> > >> type of field will have to pipe up.
> > >>
> > >> For Lucene directly, simple strings would consume an pointer (4 or 8
> > >> bytes depending on whether your JRE is 64bit) per doc, and the string
> > >> index would consume an int (4 bytes) per doc.  (Each also consume
> > >> negligible (for your case) memory to hold the actual string values).
> > >>
> > >> Note that for your use case, this is exceptionally wasteful.  If
> > >> Lucene had simple bit-packed ints (I've opened LUCENE-1990 for this)
> > >> then it'd take much fewer bits to reference the values, since you
have
> > >> only 10 unique string values.
> > >>
> > >> Mike
> > >>
> > >> On Mon, Nov 2, 2009 at 3:57 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > >>
> > >>> I am not using Lucene API directly; I am using SOLR which uses
Lucene
> > >>> FieldCache for faceting on non-tokenized fields...
> > >>> I think this cache will be lazily loaded, until user executes sorted
> (by
> > >>> this field) SOLR query for all documents *:* - in this case it will
be
> > >>>
> > > fully
> > >
> > >>> populated...
> > >>>
> > >>>
> > >>>
> > >>>> Subject: Re: Lucene FieldCache memory requirements
> > >>>>
> > >>>> Which FieldCache API are you using?  getStrings?  or getStringIndex
> > >>>> (which is used, under the hood, if you sort by this field).
> > >>>>
> > >>>> Mike
> > >>>>
> > >>>> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > >>>>
> > >>>>> Any thoughts regarding the subject? I hope FieldCache doesn't use
> > >>>>>
> > > more
> > >
> > >>> than
> > >>>
> > >>>>> 6 bytes per document-field instance... I am too lazy to research
> > >>>>>
> > > Lucene
> > >
> > >>>>> source code, I hope someone can provide exact answer... Thanks
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> Subject: Lucene FieldCache memory requirements
> > >>>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>>
> > >>>>>> Can anyone confirm Lucene FieldCache memory requirements? I have
> 100
> > >>>>>> millions docs with non-tokenized field "country" (10 different
> > >>>>>>
> > >>> countries);
> > >>>
> > >>>>> I
> > >>>>>
> > >>>>>> expect it requires array of ("int", "long"), size of array
> > >>>>>>
> > > 100,000,000,
> > >
> > >>>>>> without any impact of "country" field length;
> > >>>>>>
> > >>>>>> it requires 600,000,000 bytes: "int" is pointer to document
(Lucene
> > >>>>>>
> > >>>>> document
> > >>>>>
> > >>>>>> ID),  and "long" is pointer to String value...
> > >>>>>>
> > >>>>>> Am I right, is it 600Mb just for this "country" (indexed,
> > >>>>>>
> > >>> non-tokenized,
> > >>>
> > >>>>>> non-boolean) field and 100 millions docs? I need to calculate
exact
> > >>>>>>
> > >>>>> minimum RAM
> > >>>>>
> > >>>>>> requirements...
> > >>>>>>
> > >>>>>> I believe it shouldn't depend on cardinality (distribution) of
> > >>>>>>
> > > field...
> > >
> > >>>>>> Thanks,
> > >>>>>> Fuad
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>
> > >>>
> > >
> > >
> > >
> >
> >
> > --
> > - Mark
> >
> > http://www.lucidimagination.com
> >
> >
> - Fuad
> 
> http://www.linkedin.com/in/liferay
> 




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Ok, my "naive" thinking about FieldCache: for each Term we can quickly
retrieve DocSet. What are memory requirements? Theoretically,
[maxdoc]x[4-bytes DocumentID], plus some (small) array to store terms
pointing to (large) arrays of DocumentIDs.

Mike suggested http://issues.apache.org/jira/browse/LUCENE-1990 to make this
memory requirement even lower... but please correct me if I am wrong with
formula, and I am unsure how it is currently implemented...


Thanks,
Fuad


> -----Original Message-----
> From: Fuad Efendi [mailto:fuad@efendi.ca]
> Sent: November-02-09 8:21 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Lucene FieldCache memory requirements
> 
> Mark,
> 
> I don't understand this:
> > so with a ton of docs and a few uniques, you get a temp boost in the RAM
> > reqs until it sizes it down.
> 
> Sizes down??? Why is it called Cache indeed? And how SOLR uses it if it is
> not cache?
> 
> 
> And this:
> > A pointer for each doc.
> 
> Why can't we use (int) DocumentID? For me, it is natural; 64-bit pointer
to
> an object in RAM is not natural (in Lucene world)...
> 
> 
> So, is it [maxdoc]x[4-bytes], or [maxdoc]x[8-bytes]?...
> -Fuad
> 
> 
> 
> 




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
FieldCache internally uses a WeakHashMap... nothing wrong with that, but... no
Garbage Collection tuning will help if the allocated RAM is not enough to
effectively treat those Weak** references as Strong**, especially for SOLR
faceting... 10%-15% of CPU time taken by GC has been reported...
-Fuad
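
A tiny sketch of why the weak keys don't help much here (the cache key is,
roughly, the open IndexReader; the plain Object below just stands in for it):

import java.util.WeakHashMap;

// Sketch of the WeakHashMap behaviour behind FieldCacheImpl: an entry only
// becomes collectable once its *key* (think: the IndexReader) is unreachable.
// While the reader stays open, the cached arrays stay in memory.
public class WeakCacheDemo {
  public static void main(String[] args) throws InterruptedException {
    WeakHashMap<Object, int[]> cache = new WeakHashMap<Object, int[]>();

    Object readerKey = new Object();           // stands in for an open IndexReader
    cache.put(readerKey, new int[10 * 1024 * 1024]);

    System.gc();
    Thread.sleep(100);
    System.out.println("while the reader is open: " + cache.size());  // 1, nothing freed

    readerKey = null;                          // "close" the reader
    System.gc();
    Thread.sleep(100);
    System.out.println("after the reader is gone: " + cache.size());  // usually 0
  }
}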




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Even in the simplistic scenario, when it is garbage collected, we still
_need_to_be_able_ to allocate enough RAM for the FieldCache on demand... a
linear dependency on document count...


> 
> Hi Mark,
> 
> Yes, I understand it now; however, how will StringIndexCache size down in
a
> production system faceting by Country on a homepage? This is SOLR
> specific...
> 
> 
> Lucene specific: Lucene doesn't read from disk if it can retrieve field
> value for a specific document ID from cache. How will it size down in
purely
> Lucene-based heavy-loaded production system? Especially if this cache is
> used for query optimizations.
> 



RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Hi Mark,

Yes, I understand it now; however, how will StringIndexCache size down in a
production system faceting by Country on a homepage? This is SOLR
specific...


Lucene specific: Lucene doesn't read from disk if it can retrieve field
value for a specific document ID from the cache. How will it size down in a
purely Lucene-based, heavily loaded production system? Especially if this cache is
used for query optimizations.



> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: November-02-09 8:53 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
> 
>  static final class StringIndexCache extends Cache {
>     StringIndexCache(FieldCache wrapper) {
>       super(wrapper);
>     }
> 
>     @Override
>     protected Object createValue(IndexReader reader, Entry entryKey)
>         throws IOException {
>       String field = StringHelper.intern(entryKey.field);
>       final int[] retArray = new int[reader.maxDoc()];
>       String[] mterms = new String[reader.maxDoc()+1];
>       TermDocs termDocs = reader.termDocs();
>       TermEnum termEnum = reader.terms (new Term (field));
>       int t = 0;  // current term number
> 
>       // an entry for documents that have no terms in this field
>       // should a document with no terms be at top or bottom?
>       // this puts them at the top - if it is changed,
> FieldDocSortedHitQueue
>       // needs to change as well.
>       mterms[t++] = null;
> 
>       try {
>         do {
>           Term term = termEnum.term();
>           if (term==null || term.field() != field) break;
> 
>           // store term text
>           // we expect that there is at most one term per document
>           if (t >= mterms.length) throw new RuntimeException ("there are
> more terms than " +
>                   "documents in field \"" + field + "\", but it's
> impossible to sort on " +
>                   "tokenized fields");
>           mterms[t] = term.text();
> 
>           termDocs.seek (termEnum);
>           while (termDocs.next()) {
>             retArray[termDocs.doc()] = t;
>           }
> 
>           t++;
>         } while (termEnum.next());
>       } finally {
>         termDocs.close();
>         termEnum.close();
>       }
> 
>       if (t == 0) {
>         // if there are no terms, make the term array
>         // have a single null entry
>         mterms = new String[1];
>       } else if (t < mterms.length) {
>         // if there are less terms than documents,
>         // trim off the dead array space
>         String[] terms = new String[t];
>         System.arraycopy (mterms, 0, terms, 0, t);
>         mterms = terms;
>       }
> 
>       StringIndex value = new StringIndex (retArray, mterms);
>       return value;
>     }
>   };
> 
> The formula for a String Index fieldcache is essentially the String
> array of unique terms (which does indeed "size down" at the bottom) and
> the int array indexing into the String array.
> 
> 
> Fuad Efendi wrote:
> > To be correct, I analyzed FieldCache awhile ago and I believed it never
> > "sizes down"...
> >
> > /**
> >  * Expert: The default cache implementation, storing all values in
memory.
> >  * A WeakHashMap is used for storage.
> >  *
> >  * <p>Created: May 19, 2004 4:40:36 PM
> >  *
> >  * @since   lucene 1.4
> >  */
> >
> >
> > Will it size down? Only if we are not faceting (as in SOLR v.1.3)...
> >
> > And I am still unsure, Document ID vs. Object Pointer.
> >
> >
> >
> >
> >
> >> I don't understand this:
> >>
> >>> so with a ton of docs and a few uniques, you get a temp boost in the
RAM
> >>> reqs until it sizes it down.
> >>>
> >> Sizes down??? Why is it called Cache indeed? And how SOLR uses it if it
is
> >> not cache?
> >>
> >>
> >
> >
> >
> 
> 
> --
> - Mark
> 
> http://www.lucidimagination.com
> 
> 




Re: Lucene FieldCache memory requirements

Posted by Mark Miller <ma...@gmail.com>.
 static final class StringIndexCache extends Cache {
    StringIndexCache(FieldCache wrapper) {
      super(wrapper);
    }

    @Override
    protected Object createValue(IndexReader reader, Entry entryKey)
        throws IOException {
      String field = StringHelper.intern(entryKey.field);
      final int[] retArray = new int[reader.maxDoc()];
      String[] mterms = new String[reader.maxDoc()+1];
      TermDocs termDocs = reader.termDocs();
      TermEnum termEnum = reader.terms (new Term (field));
      int t = 0;  // current term number

      // an entry for documents that have no terms in this field
      // should a document with no terms be at top or bottom?
      // this puts them at the top - if it is changed, FieldDocSortedHitQueue
      // needs to change as well.
      mterms[t++] = null;

      try {
        do {
          Term term = termEnum.term();
          if (term==null || term.field() != field) break;

          // store term text
          // we expect that there is at most one term per document
          if (t >= mterms.length) throw new RuntimeException ("there are more terms than " +
                  "documents in field \"" + field + "\", but it's impossible to sort on " +
                  "tokenized fields");
          mterms[t] = term.text();

          termDocs.seek (termEnum);
          while (termDocs.next()) {
            retArray[termDocs.doc()] = t;
          }

          t++;
        } while (termEnum.next());
      } finally {
        termDocs.close();
        termEnum.close();
      }

      if (t == 0) {
        // if there are no terms, make the term array
        // have a single null entry
        mterms = new String[1];
      } else if (t < mterms.length) {
        // if there are less terms than documents,
        // trim off the dead array space
        String[] terms = new String[t];
        System.arraycopy (mterms, 0, terms, 0, t);
        mterms = terms;
      }

      StringIndex value = new StringIndex (retArray, mterms);
      return value;
    }
  };

The formula for a String Index fieldcache is essentially the String
array of unique terms (which does indeed "size down" at the bottom) and
the int array indexing into the String array.


Fuad Efendi wrote:
> To be correct, I analyzed FieldCache awhile ago and I believed it never
> "sizes down"...
>
> /**
>  * Expert: The default cache implementation, storing all values in memory.
>  * A WeakHashMap is used for storage.
>  *
>  * <p>Created: May 19, 2004 4:40:36 PM
>  *
>  * @since   lucene 1.4
>  */
>
>
> Will it size down? Only if we are not faceting (as in SOLR v.1.3)...
>
> And I am still unsure, Document ID vs. Object Pointer.
>
>
>
>
>   
>> I don't understand this:
>>     
>>> so with a ton of docs and a few uniques, you get a temp boost in the RAM
>>> reqs until it sizes it down.
>>>       
>> Sizes down??? Why is it called Cache indeed? And how SOLR uses it if it is
>> not cache?
>>
>>     
>
>
>   


-- 
- Mark

http://www.lucidimagination.com




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
To be precise, I analyzed FieldCache a while ago and I believe it never
"sizes down"...

/**
 * Expert: The default cache implementation, storing all values in memory.
 * A WeakHashMap is used for storage.
 *
 * <p>Created: May 19, 2004 4:40:36 PM
 *
 * @since   lucene 1.4
 */


Will it size down? Only if we are not faceting (as in SOLR v.1.3)...

And I am still unsure, Document ID vs. Object Pointer.




> 
> I don't understand this:
> > so with a ton of docs and a few uniques, you get a temp boost in the RAM
> > reqs until it sizes it down.
> 
> Sizes down??? Why is it called Cache indeed? And how SOLR uses it if it is
> not cache?
> 



RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Mark,

I don't understand this: 
> so with a ton of docs and a few uniques, you get a temp boost in the RAM
> reqs until it sizes it down.

Sizes down??? Why is it called a Cache, then? And how does SOLR use it if it
is not a cache?


And this:
> A pointer for each doc.

Why can't we use an (int) DocumentID? For me, it is natural; a 64-bit pointer
to an object in RAM is not natural (in the Lucene world)...


So, is it [maxdoc]x[4-bytes], or [maxdoc]x[8-bytes]?... 
-Fuad






Re: Lucene FieldCache memory requirements

Posted by Mark Miller <ma...@gmail.com>.
Fuad Efendi wrote:
> Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no
> difference between maxdoc and maxdoc + 1 for such estimate... difference is
> between 0.4Gb and 1.2Gb...
>
>   
I'm not sure I understand - but I didn't mean to imply the +1 on maxdoc
meant anything. The issue is that in the end, it only needs a String
array the size of String[UniqueTerms] - but because it can't easily
figure out that number, it first creates an array of String[MaxDoc+1] -
so with a ton of docs and a few uniques, you get a temp boost in the RAM
reqs until it sizes it down. A pointer for each doc.

-- 
- Mark

http://www.lucidimagination.com




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
I just did some tests on a completely new index (a slave): a sort by a
non-tokenized field with a low distribution of values (such as Country) takes
milliseconds, but a sort (ascending) on a tokenized field with a heavy
distribution took 30 seconds (initially). A second sort (descending) took
milliseconds. Generic query *:*; FieldCache is not used for tokenized
fields... so how is it sorted? :)
Fortunately, no OOM.
-Fuad
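
For completeness, a minimal sketch of the kind of sorted search that triggers
that initial FieldCache load (written against the Lucene 2.x API used elsewhere
in this thread; "country" is only an example field name):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;

// The first sorted search on a non-tokenized field forces the StringIndex
// FieldCache entry to load (the slow, "initial" query); later sorts on the
// same field reuse the cached arrays and return in milliseconds.
public class SortedSearchDemo {
  public static TopDocs sortByCountry(IndexReader reader) throws Exception {
    IndexSearcher searcher = new IndexSearcher(reader);
    Sort sort = new Sort(new SortField("country", SortField.STRING));
    return searcher.search(new MatchAllDocsQuery(), null, 10, sort);
  }
}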



RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no
difference between maxdoc and maxdoc + 1 for such estimate... difference is
between 0.4Gb and 1.2Gb...


So, let's vote ;)

A. [maxdoc] x [8 bytes ~ pointer to String object]

B. [maxdoc] x [8 bytes ~ pointer to Document object]

C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID] 
- same as [String1_Document_Count + ... + String10_Document_Count] x [4
bytes ~ DocumentID]

D. [maxdoc] x [4 bytes + 8 bytes ~ my initial naive thinking...]


Please confirm that it is Pointer to Object and not Lucene Document ID... I
hope it is (int) Document ID...





> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: November-02-09 6:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
> 
> It also briefly requires more memory than just that - it allocates an
> array the size of maxdoc+1 to hold the unique terms - and then sizes down.
> 
> Possibly we can use the getUniqueTermCount method in the flexible
> indexing branch to get rid of that - which is why I was thinking it
> might be a good idea to drop the unsupported exception in that method
> for things like multi reader and just do the work to get the right
> number (currently there is a comment that the user should do that work
> if necessary, making the call unreliable for this).
> 
> Fuad Efendi wrote:
> > Thank you very much Mike,
> >
> > I found it:
> > org.apache.solr.request.SimpleFacets
> > ...
> >         // TODO: future logic could use filters instead of the
fieldcache if
> >         // the number of terms in the field is small enough.
> >         counts = getFieldCacheCounts(searcher, base, field,
offset,limit,
> > mincount, missing, sort, prefix);
> > ...
> >     FieldCache.StringIndex si =
> > FieldCache.DEFAULT.getStringIndex(searcher.getReader(), fieldName);
> >     final String[] terms = si.lookup;
> >     final int[] termNum = si.order;
> > ...
> >
> >
> > So that 64-bit requires more memory :)
> >
> >
> > Mike, am I right here?
> > [(8 bytes pointer) + (4 bytes DocID)] x [Number of Documents (100mlns)]
> > (64-bit JVM)
> > 1.2Gb RAM for this...
> >
> > Or, may be I am wrong:
> >
> >> For Lucene directly, simple strings would consume an pointer (4 or 8
> >> bytes depending on whether your JRE is 64bit) per doc, and the string
> >> index would consume an int (4 bytes) per doc.
> >>
> >
> > [8 bytes (64bit)] x [number of documents (100mlns)]?
> > 0.8Gb
> >
> > Kind of Map between String and DocSet, saving 4 bytes... "Key" is
String,
> > and "Value" is array of 64-bit pointers to Document. Why 64-bit (for
64-bit
> > JVM)? I always thought it is (int) documentId...
> >
> > Am I right?
> >
> >
> > Thanks for pointing to http://issues.apache.org/jira/browse/LUCENE-1990!
> >
> >
> >>> Note that for your use case, this is exceptionally wasteful.
> >>>
> > This is probably very common case... I think it should be confirmed by
> > Lucene developers too... FieldCache is warmed anyway, even when we don't
use
> > SOLR...
> >
> >
> > -Fuad
> >
> >
> >
> >
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> >> Sent: November-02-09 6:00 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Lucene FieldCache memory requirements
> >>
> >> OK I think someone who knows how Solr uses the fieldCache for this
> >> type of field will have to pipe up.
> >>
> >> For Lucene directly, simple strings would consume an pointer (4 or 8
> >> bytes depending on whether your JRE is 64bit) per doc, and the string
> >> index would consume an int (4 bytes) per doc.  (Each also consume
> >> negligible (for your case) memory to hold the actual string values).
> >>
> >> Note that for your use case, this is exceptionally wasteful.  If
> >> Lucene had simple bit-packed ints (I've opened LUCENE-1990 for this)
> >> then it'd take much fewer bits to reference the values, since you have
> >> only 10 unique string values.
> >>
> >> Mike
> >>
> >> On Mon, Nov 2, 2009 at 3:57 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> >>
> >>> I am not using Lucene API directly; I am using SOLR which uses Lucene
> >>> FieldCache for faceting on non-tokenized fields...
> >>> I think this cache will be lazily loaded, until user executes sorted
(by
> >>> this field) SOLR query for all documents *:* - in this case it will be
> >>>
> > fully
> >
> >>> populated...
> >>>
> >>>
> >>>
> >>>> Subject: Re: Lucene FieldCache memory requirements
> >>>>
> >>>> Which FieldCache API are you using?  getStrings?  or getStringIndex
> >>>> (which is used, under the hood, if you sort by this field).
> >>>>
> >>>> Mike
> >>>>
> >>>> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> >>>>
> >>>>> Any thoughts regarding the subject? I hope FieldCache doesn't use
> >>>>>
> > more
> >
> >>> than
> >>>
> >>>>> 6 bytes per document-field instance... I am too lazy to research
> >>>>>
> > Lucene
> >
> >>>>> source code, I hope someone can provide exact answer... Thanks
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Subject: Lucene FieldCache memory requirements
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>>
> >>>>>> Can anyone confirm Lucene FieldCache memory requirements? I have
100
> >>>>>> millions docs with non-tokenized field "country" (10 different
> >>>>>>
> >>> countries);
> >>>
> >>>>> I
> >>>>>
> >>>>>> expect it requires array of ("int", "long"), size of array
> >>>>>>
> > 100,000,000,
> >
> >>>>>> without any impact of "country" field length;
> >>>>>>
> >>>>>> it requires 600,000,000 bytes: "int" is pointer to document (Lucene
> >>>>>>
> >>>>> document
> >>>>>
> >>>>>> ID),  and "long" is pointer to String value...
> >>>>>>
> >>>>>> Am I right, is it 600Mb just for this "country" (indexed,
> >>>>>>
> >>> non-tokenized,
> >>>
> >>>>>> non-boolean) field and 100 millions docs? I need to calculate exact
> >>>>>>
> >>>>> minimum RAM
> >>>>>
> >>>>>> requirements...
> >>>>>>
> >>>>>> I believe it shouldn't depend on cardinality (distribution) of
> >>>>>>
> > field...
> >
> >>>>>> Thanks,
> >>>>>> Fuad
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >
> >
> >
> 
> 
> --
> - Mark
> 
> http://www.lucidimagination.com
> 
> 
- Fuad

http://www.linkedin.com/in/liferay



Re: Lucene FieldCache memory requirements

Posted by Mark Miller <ma...@gmail.com>.
It also briefly requires more memory than just that - it allocates an
array the size of maxdoc+1 to hold the unique terms - and then sizes down.

Possibly we can use the getUniqueTermCount method in the flexible
indexing branch to get rid of that - which is why I was thinking it
might be a good idea to drop the unsupported exception in that method
for things like multi reader and just do the work to get the right
number (currently there is a comment that the user should do that work
if necessary, making the call unreliable for this).

Fuad Efendi wrote:
> Thank you very much Mike,
>
> I found it:
> org.apache.solr.request.SimpleFacets
> ...
>         // TODO: future logic could use filters instead of the fieldcache if
>         // the number of terms in the field is small enough.
>         counts = getFieldCacheCounts(searcher, base, field, offset,limit,
> mincount, missing, sort, prefix);
> ...
>     FieldCache.StringIndex si =
> FieldCache.DEFAULT.getStringIndex(searcher.getReader(), fieldName);
>     final String[] terms = si.lookup;
>     final int[] termNum = si.order;
> ...
>
>
> So that 64-bit requires more memory :)
>
>
> Mike, am I right here?
> [(8 bytes pointer) + (4 bytes DocID)] x [Number of Documents (100mlns)]
> (64-bit JVM)
> 1.2Gb RAM for this...
>
> Or, may be I am wrong:
>   
>> For Lucene directly, simple strings would consume an pointer (4 or 8
>> bytes depending on whether your JRE is 64bit) per doc, and the string
>> index would consume an int (4 bytes) per doc.
>>     
>
> [8 bytes (64bit)] x [number of documents (100mlns)]?
> 0.8Gb
>
> Kind of Map between String and DocSet, saving 4 bytes... "Key" is String,
> and "Value" is array of 64-bit pointers to Document. Why 64-bit (for 64-bit
> JVM)? I always thought it is (int) documentId...
>
> Am I right?
>
> Thanks for pointing to http://issues.apache.org/jira/browse/LUCENE-1990!
>
>>> Note that for your use case, this is exceptionally wasteful.
>
> This is probably very common case... I think it should be confirmed by
> Lucene developers too... FieldCache is warmed anyway, even when we don't use
> SOLR...
>
> -Fuad
>
>
>> -----Original Message-----
>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
>> Sent: November-02-09 6:00 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Lucene FieldCache memory requirements
>>
>> OK I think someone who knows how Solr uses the fieldCache for this
>> type of field will have to pipe up.
>>
>> For Lucene directly, simple strings would consume an pointer (4 or 8
>> bytes depending on whether your JRE is 64bit) per doc, and the string
>> index would consume an int (4 bytes) per doc.  (Each also consume
>> negligible (for your case) memory to hold the actual string values).
>>
>> Note that for your use case, this is exceptionally wasteful.  If
>> Lucene had simple bit-packed ints (I've opened LUCENE-1990 for this)
>> then it'd take much fewer bits to reference the values, since you have
>> only 10 unique string values.
>>
>> Mike
>>
>> On Mon, Nov 2, 2009 at 3:57 PM, Fuad Efendi <fu...@efendi.ca> wrote:
>>> I am not using Lucene API directly; I am using SOLR which uses Lucene
>>> FieldCache for faceting on non-tokenized fields...
>>> I think this cache will be lazily loaded, until user executes sorted (by
>>> this field) SOLR query for all documents *:* - in this case it will be fully
>>> populated...
>>>
>>>
>>>> Subject: Re: Lucene FieldCache memory requirements
>>>>
>>>> Which FieldCache API are you using?  getStrings?  or getStringIndex
>>>> (which is used, under the hood, if you sort by this field).
>>>>
>>>> Mike
>>>>
>>>> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
>>>>> Any thoughts regarding the subject? I hope FieldCache doesn't use more than
>>>>> 6 bytes per document-field instance... I am too lazy to research Lucene
>>>>> source code, I hope someone can provide exact answer... Thanks
>>>>>
>>>>>
>>>>>> Subject: Lucene FieldCache memory requirements
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>> Can anyone confirm Lucene FieldCache memory requirements? I have 100
>>>>>> millions docs with non-tokenized field "country" (10 different countries); I
>>>>>> expect it requires array of ("int", "long"), size of array 100,000,000,
>>>>>> without any impact of "country" field length;
>>>>>>
>>>>>> it requires 600,000,000 bytes: "int" is pointer to document (Lucene document
>>>>>> ID),  and "long" is pointer to String value...
>>>>>>
>>>>>> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
>>>>>> non-boolean) field and 100 millions docs? I need to calculate exact minimum RAM
>>>>>> requirements...
>>>>>>
>>>>>> I believe it shouldn't depend on cardinality (distribution) of field...
>>>>>>
>>>>>> Thanks,
>>>>>> Fuad


-- 
- Mark

http://www.lucidimagination.com




RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
Thank you very much Mike,

I found it:
org.apache.solr.request.SimpleFacets
...
        // TODO: future logic could use filters instead of the fieldcache if
        // the number of terms in the field is small enough.
        counts = getFieldCacheCounts(searcher, base, field, offset,limit,
                                     mincount, missing, sort, prefix);
...
    FieldCache.StringIndex si =
        FieldCache.DEFAULT.getStringIndex(searcher.getReader(), fieldName);
    final String[] terms = si.lookup;   // sorted unique terms of the field
    final int[] termNum = si.order;     // for each Lucene docID, an index into lookup
...


So a 64-bit JVM requires more memory :)


Mike, am I right here?
[(8 bytes pointer) + (4 bytes DocID)] x [Number of Documents (100mlns)]
(64-bit JVM)
1.2Gb RAM for this...

Or maybe I am wrong:
> For Lucene directly, simple strings would consume an pointer (4 or 8
> bytes depending on whether your JRE is 64bit) per doc, and the string
> index would consume an int (4 bytes) per doc.

[8 bytes (64bit)] x [number of documents (100mlns)]? 
0.8Gb

Kind of a Map between String and DocSet, saving 4 bytes... the "Key" is a String,
and the "Value" is an array of 64-bit pointers to Documents. Why 64-bit (for a
64-bit JVM)? I always thought it was an (int) documentId...

Am I right?
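
If I read FieldCache.StringIndex correctly, here is a minimal sketch of what
getStringIndex() would allocate for this case (the reader variable and field
name below are just placeholders, not my real setup):

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.FieldCache;

  // Sketch only -- assumes the Lucene 2.9-era StringIndex layout:
  // order[] holds one int per document, lookup[] roughly one String per unique value.
  public class StringIndexSizeSketch {
    static void estimate(IndexReader reader) throws java.io.IOException {
      FieldCache.StringIndex si =
          FieldCache.DEFAULT.getStringIndex(reader, "country"); // 10 unique values
      int[] order = si.order;       // length == maxDoc (100 mln) -> ~400 MB of ints
      String[] lookup = si.lookup;  // ~11 small Strings          -> negligible
      System.out.println((order.length * 4L) + " bytes for order[]");
    }
  }

So it looks like ~4 bytes per document for this field, regardless of 32-bit vs
64-bit JVM, plus whatever the small lookup array costs?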


Thanks for pointing to http://issues.apache.org/jira/browse/LUCENE-1990!

>> Note that for your use case, this is exceptionally wasteful.  
This is probably a very common case... I think it should be confirmed by
Lucene developers too... FieldCache is warmed anyway, even when we don't use
SOLR...

 
-Fuad







> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: November-02-09 6:00 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
> 
> OK I think someone who knows how Solr uses the fieldCache for this
> type of field will have to pipe up.
> 
> For Lucene directly, simple strings would consume an pointer (4 or 8
> bytes depending on whether your JRE is 64bit) per doc, and the string
> index would consume an int (4 bytes) per doc.  (Each also consume
> negligible (for your case) memory to hold the actual string values).
> 
> Note that for your use case, this is exceptionally wasteful.  If
> Lucene had simple bit-packed ints (I've opened LUCENE-1990 for this)
> then it'd take much fewer bits to reference the values, since you have
> only 10 unique string values.
> 
> Mike
> 
> On Mon, Nov 2, 2009 at 3:57 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > I am not using Lucene API directly; I am using SOLR which uses Lucene
> > FieldCache for faceting on non-tokenized fields...
> > I think this cache will be lazily loaded, until user executes sorted (by
> > this field) SOLR query for all documents *:* - in this case it will be fully
> > populated...
> >
> >
> >> Subject: Re: Lucene FieldCache memory requirements
> >>
> >> Which FieldCache API are you using?  getStrings?  or getStringIndex
> >> (which is used, under the hood, if you sort by this field).
> >>
> >> Mike
> >>
> >> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> >> > Any thoughts regarding the subject? I hope FieldCache doesn't use more than
> >> > 6 bytes per document-field instance... I am too lazy to research Lucene
> >> > source code, I hope someone can provide exact answer... Thanks
> >> >
> >> >
> >> >> Subject: Lucene FieldCache memory requirements
> >> >>
> >> >> Hi,
> >> >>
> >> >>
> >> >> Can anyone confirm Lucene FieldCache memory requirements? I have 100
> >> >> millions docs with non-tokenized field "country" (10 different countries); I
> >> >> expect it requires array of ("int", "long"), size of array 100,000,000,
> >> >> without any impact of "country" field length;
> >> >>
> >> >> it requires 600,000,000 bytes: "int" is pointer to document (Lucene document
> >> >> ID),  and "long" is pointer to String value...
> >> >>
> >> >> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
> >> >> non-boolean) field and 100 millions docs? I need to calculate exact minimum RAM
> >> >> requirements...
> >> >>
> >> >> I believe it shouldn't depend on cardinality (distribution) of field...
> >> >>
> >> >> Thanks,
> >> >> Fuad
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >
> >
> >



Re: Lucene FieldCache memory requirements

Posted by Michael McCandless <lu...@mikemccandless.com>.
OK I think someone who knows how Solr uses the fieldCache for this
type of field will have to pipe up.

For Lucene directly, simple strings would consume a pointer (4 or 8
bytes depending on whether your JRE is 64bit) per doc, and the string
index would consume an int (4 bytes) per doc.  (Each also consumes
negligible (for your case) memory to hold the actual string values.)

Note that for your use case, this is exceptionally wasteful.  If
Lucene had simple bit-packed ints (I've opened LUCENE-1990 for this)
then it'd take much fewer bits to reference the values, since you have
only 10 unique string values.
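
For example, very rough back-of-the-envelope numbers for 100M docs and 10
unique values (illustrative only; assumes 8-byte object pointers, i.e. a
64-bit JVM without compressed oops):

  long docs = 100000000L;
  long getStringsBytes     = docs * 8;      // ~800 MB: one String reference per doc
  long getStringIndexBytes = docs * 4;      // ~400 MB: one int per doc into the lookup array
  long bitPackedBytes      = docs * 4 / 8;  // ~50 MB: 4 bits per doc would cover 10 values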

Mike

On Mon, Nov 2, 2009 at 3:57 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> I am not using Lucene API directly; I am using SOLR which uses Lucene
> FieldCache for faceting on non-tokenized fields...
> I think this cache will be lazily loaded, until user executes sorted (by
> this field) SOLR query for all documents *:* - in this case it will be fully
> populated...
>
>
>> Subject: Re: Lucene FieldCache memory requirements
>>
>> Which FieldCache API are you using?  getStrings?  or getStringIndex
>> (which is used, under the hood, if you sort by this field).
>>
>> Mike
>>
>> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
>> > Any thoughts regarding the subject? I hope FieldCache doesn't use more than
>> > 6 bytes per document-field instance... I am too lazy to research Lucene
>> > source code, I hope someone can provide exact answer... Thanks
>> >
>> >
>> >> Subject: Lucene FieldCache memory requirements
>> >>
>> >> Hi,
>> >>
>> >>
>> >> Can anyone confirm Lucene FieldCache memory requirements? I have 100
>> >> millions docs with non-tokenized field "country" (10 different countries); I
>> >> expect it requires array of ("int", "long"), size of array 100,000,000,
>> >> without any impact of "country" field length;
>> >>
>> >> it requires 600,000,000 bytes: "int" is pointer to document (Lucene document
>> >> ID),  and "long" is pointer to String value...
>> >>
>> >> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
>> >> non-boolean) field and 100 millions docs? I need to calculate exact minimum RAM
>> >> requirements...
>> >>
>> >> I believe it shouldn't depend on cardinality (distribution) of field...
>> >>
>> >> Thanks,
>> >> Fuad
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>
>
>

RE: Lucene FieldCache memory requirements

Posted by Fuad Efendi <fu...@efendi.ca>.
I am not using the Lucene API directly; I am using SOLR, which uses the Lucene
FieldCache for faceting on non-tokenized fields...
I think this cache is lazily loaded: it isn't populated until a user executes a
sorted (by this field) SOLR query for all documents *:* - in that case it will be
fully populated...
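
For example (hypothetical local instance and field name, just to illustrate the
kind of request that would warm the cache), either of these should do it:

  http://localhost:8983/solr/select?q=*:*&sort=country+asc
  http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=country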


> Subject: Re: Lucene FieldCache memory requirements
> 
> Which FieldCache API are you using?  getStrings?  or getStringIndex
> (which is used, under the hood, if you sort by this field).
> 
> Mike
> 
> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> > Any thoughts regarding the subject? I hope FieldCache doesn't use more than
> > 6 bytes per document-field instance... I am too lazy to research Lucene
> > source code, I hope someone can provide exact answer... Thanks
> >
> >
> >> Subject: Lucene FieldCache memory requirements
> >>
> >> Hi,
> >>
> >>
> >> Can anyone confirm Lucene FieldCache memory requirements? I have 100
> >> millions docs with non-tokenized field "country" (10 different countries); I
> >> expect it requires array of ("int", "long"), size of array 100,000,000,
> >> without any impact of "country" field length;
> >>
> >> it requires 600,000,000 bytes: "int" is pointer to document (Lucene document
> >> ID),  and "long" is pointer to String value...
> >>
> >> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
> >> non-boolean) field and 100 millions docs? I need to calculate exact minimum RAM
> >> requirements...
> >>
> >> I believe it shouldn't depend on cardinality (distribution) of field...
> >>
> >> Thanks,
> >> Fuad
> >>
> >>
> >>
> >>
> >
> >
> >
> >



Re: Lucene FieldCache memory requirements

Posted by Michael McCandless <lu...@mikemccandless.com>.
Which FieldCache API are you using?  getStrings?  or getStringIndex
(which is used, under the hood, if you sort by this field).
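
For example (assuming the Lucene 2.9-era FieldCache API; "reader" and the field
name are placeholders):

  String[] countries = FieldCache.DEFAULT.getStrings(reader, "country");
  FieldCache.StringIndex idx = FieldCache.DEFAULT.getStringIndex(reader, "country");

The first holds one String reference per doc; the second holds one int per doc
plus the array of unique terms.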

Mike

On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <fu...@efendi.ca> wrote:
> Any thoughts regarding the subject? I hope FieldCache doesn't use more than
> 6 bytes per document-field instance... I am too lazy to research Lucene
> source code, I hope someone can provide exact answer... Thanks
>
>
>> Subject: Lucene FieldCache memory requirements
>>
>> Hi,
>>
>>
>> Can anyone confirm Lucene FieldCache memory requirements? I have 100
>> millions docs with non-tokenized field "country" (10 different countries); I
>> expect it requires array of ("int", "long"), size of array 100,000,000,
>> without any impact of "country" field length;
>>
>> it requires 600,000,000 bytes: "int" is pointer to document (Lucene document
>> ID),  and "long" is pointer to String value...
>>
>> Am I right, is it 600Mb just for this "country" (indexed, non-tokenized,
>> non-boolean) field and 100 millions docs? I need to calculate exact minimum RAM
>> requirements...
>>
>> I believe it shouldn't depend on cardinality (distribution) of field...
>>
>> Thanks,
>> Fuad
>>
>>
>>
>>
>
>
>
>