Posted to solr-user@lucene.apache.org by amamare <la...@gmail.com> on 2008/01/03 11:06:57 UTC

Re: SolrIndexWriter holding reference to deleted file?

I haven't been able to get a profiler at the server yet, but I thought I
might show how my code works, because it's quite different from the example
in the link you provided...


public synchronized ResultItem[] search(String query)
    throws CorruptIndexException, IOException {
  SolrIndexSearcher searcher = new SolrIndexSearcher(
      solrCore.getSchema(), "MySearcher", solrCore.getIndexDir(), true);
  Hits hits = search(searcher, query);
  for (int i = 0; i < hits.length(); i++) {
    parse(hits.doc(i));
    // add to result-array
  }
  searcher.close();
  // return result-array
}

private Hits search(SolrIndexSearcher searcher, String pQuery) {
  try {
    // default search field is called "text"
    SolrQueryParser parser = new SolrQueryParser(solrCore.getSchema(), "text");
    Query query = parser.parse(pQuery);
    return searcher.search(query);
  }
  // catch exceptions
}


This is the code that does the searching. The searcher is passed as a
parameter to the search method because it needs to stay open while I'm
parsing the documents in the hits. I know I should move the closing of the
searcher to a finally block, and I will do that in any case, but I doubt
it will solve the problem because I've never had any exceptions in this
code. Might the problem be that I'm not using SolrQueryRequest objects?
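
A minimal sketch of the finally-block change mentioned above. FakeSearcher
is a made-up stand-in, not the real SolrIndexSearcher, and readAll is a
hypothetical helper; the only point is that close() runs whether or not
reading a document throws.

```java
import java.io.Closeable;

// Hypothetical stand-in for a searcher; it is NOT a Solr class.
class FakeSearcher implements Closeable {
    boolean closed = false;

    // Simulates loading a document; fails on index 2 to exercise the error path.
    String doc(int i) {
        if (i == 2) {
            throw new IllegalStateException("simulated failure on doc " + i);
        }
        return "doc-" + i;
    }

    @Override
    public void close() {
        closed = true;
    }
}

class FinallyCloseSketch {
    // Reads `count` documents and always closes the searcher, even on failure.
    static int readAll(FakeSearcher searcher, int count) {
        int read = 0;
        try {
            for (int i = 0; i < count; i++) {
                searcher.doc(i);
                read++;
            }
            return read;
        } finally {
            searcher.close(); // runs on normal return and on exception
        }
    }

    public static void main(String[] args) {
        FakeSearcher ok = new FakeSearcher();
        System.out.println(readAll(ok, 2) + " docs read, closed=" + ok.closed);

        FakeSearcher failing = new FakeSearcher();
        try {
            readAll(failing, 5);
        } catch (IllegalStateException e) {
            System.out.println("failed, closed=" + failing.closed);
        }
    }
}
```

If no exception is ever thrown, this behaves the same as closing at the end
of the method, which is why it may not explain the leaked descriptors.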

Best regards, 


Yonik Seeley wrote:
> 
> This is probably related to "using Solr/Lucene embeddedly"
> See the warning at the top of http://wiki.apache.org/solr/EmbeddedSolr
> 
> It does sound like your SolrIndexSearcher objects aren't being closed.
> Solr (via SolrCore) doesn't rely on garbage collection to close the
> searchers (since gc unfortunately can't be triggered by low
> descriptors).  SolrIndexSearcher objects are reference counted and
> closed when no longer in use.  This means that SolrQueryRequest
> objects must always be closed or the refcount will be off.
> 
> Not sure where you could start except perhaps trying to verify the
> number of live SolrIndexSearcher objects.
> 
> -Yonik
> 
> On Dec 20, 2007 8:20 AM, amamare <la...@gmail.com> wrote:
>>
>> I have an application consisting of three web applications running on JBoss
>> 1.4.2 on a Red Hat Linux server. I'm using Solr/Lucene embeddedly to create
>> and maintain a frequently updated index. Once updated, the index is copied
>> to another directory used for searching. Old index files in the search
>> directory are then deleted. The streams used to copy the files are closed in
>> finally-blocks. After a few days an IOException occurs because of "too many
>> open files". When I run the Linux command
>>
>> ls -l /proc/26788/fd/
>>
>> where 26788 is JBoss's process id, it gives me a seemingly ever-increasing
>> list of deleted files (one per update, since I optimize on every update and
>> use the compound file format), marked with 'deleted' in parentheses. They
>> are all located in the search directory. From what I understand, this means
>> that something still holds a reference to each file, and that the file will
>> be permanently deleted once this something loses its reference to it.
>>
>> Only SolrIndexSearcher objects are in direct contact with these files in the
>> search application. The searchers are local objects in search-methods, and
>> are closed after every search operation. In theory, the garbage collector
>> should collect these objects later (though while profiling other
>> applications I've noticed that it often doesn't garbage collect until the
>> allocated memory starts running out).
>>
>> The other objects in contact with the files are the FileOutputStreams used
>> to copy them, but as stated above, these are closed in finally-blocks and
>> thus should hold no reference to the files.
>>
>> I need to get rid of the "too many open files" problem. I suspect that it is
>> related to the almost-deleted files in the proc-dir, but I know too little
>> about Linux to be sure. Does the problem ring a bell for anyone, or do you
>> have any ideas as to how I can get rid of the problem?
>>
>> All help is greatly appreciated.
>> --
>> View this message in context:
>> http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14436326.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
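
The reference counting described in the quoted reply can be sketched roughly
as follows. CountedHandle and its method names are invented for this
illustration and are not Solr's API; the sketch only shows why one request
that is never closed keeps the underlying resource (and its open files)
alive indefinitely.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted handle; names are illustrative, not Solr's API.
class CountedHandle {
    // The creator of the handle holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed = false;

    // Called when a request starts using the underlying resource.
    CountedHandle acquire() {
        refs.incrementAndGet();
        return this;
    }

    // Called when a user is done; the last release closes the resource.
    void release() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // a real searcher would give up its open files here
        }
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        CountedHandle balanced = new CountedHandle();
        balanced.acquire().release(); // request opened and closed
        balanced.release();           // owner lets go
        System.out.println("balanced closed: " + balanced.isClosed());

        CountedHandle leaked = new CountedHandle();
        leaked.acquire();             // request opened but never closed
        leaked.release();             // owner lets go
        System.out.println("leaked closed: " + leaked.isClosed());
    }
}
```

In the leaked case the count never reaches zero, so the close step never
runs, however often the garbage collector does.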


Re: SolrIndexWriter holding reference to deleted file?

Posted by amamare <la...@gmail.com>.

Hossman, thank you for clearing that up. The reason I create a new searcher
for every search is that the index is frequently updated, and as far as I
could tell from the documentation, a searcher will not detect changes in the
index that occurred after it was opened. I tried using just one searcher, but
that did not work. As for the rest of the code, SolrJ is not available until
Solr 1.3, and I actually never found the example provided at
http://wiki.apache.org/solr/EmbeddedSolr (I can't see that it's linked to
from the main wiki); I only found this: http://wiki.apache.org/solr/SolJava.

Anyway, I'll see if I can convert my code to the logic of the example at
EmbeddedSolr.
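
One way to reconcile a frequently updated index with searcher reuse is to
keep a single shared searcher and reopen it only when the index has changed.
A rough sketch under stated assumptions: VersionedSearcher and
SharedSearcherHolder are made-up stand-ins, not Solr classes, the index
version is simply a number passed in (how the application learns it is left
out), and searches are assumed to be serialized so that nothing is still
using the old searcher when it is closed.

```java
// Hypothetical searcher stand-in that remembers the index version it was opened on.
class VersionedSearcher {
    final long version;
    boolean closed = false;

    VersionedSearcher(long version) {
        this.version = version;
    }

    void close() {
        closed = true;
    }
}

// Hypothetical holder that hands out one shared searcher per index version.
class SharedSearcherHolder {
    private VersionedSearcher current;
    int opened = 0; // how many searchers have been opened so far

    // Returns the shared searcher, reopening only if the index version changed.
    synchronized VersionedSearcher get(long indexVersion) {
        if (current == null || current.version != indexVersion) {
            VersionedSearcher old = current;
            current = new VersionedSearcher(indexVersion);
            opened++;
            if (old != null) {
                old.close(); // safe only if no search is still using `old`
            }
        }
        return current;
    }

    public static void main(String[] args) {
        SharedSearcherHolder holder = new SharedSearcherHolder();
        VersionedSearcher first = holder.get(1L);
        holder.get(1L);                           // same version: reused
        VersionedSearcher second = holder.get(2L); // index updated: reopened
        System.out.println("opened=" + holder.opened
            + ", first closed=" + first.closed
            + ", second closed=" + second.closed);
    }
}
```

If searches can run concurrently, the old searcher would have to stay open
until its last user is done, which is the situation reference counting
addresses.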



hossman wrote:
> 
> 
> : I haven't been able to get a profiler at the server yet, but I thought I
> : might show how my code works, because it's quite different from the example
> : in the link you provided...
>
> i'm not even sure i really understand the origins of this thread, but
> regardless of what the "main" topic is, regarding the specific topic of
> the code you posted: this is all a very bad idea.
>
> Creating a new searcher for every query is a bad idea.  Using the Hits
> class for any reason is a bad idea.  I say all of this without
> having any idea what "ResultItem" looks like, or what the code in the
> "parse" method does ... they may also be bad ideas.
>
> If you must do "Embedded Solr" then please follow the examples from the
> wiki (as i recall there is even some solrj.embedded code to make it even
> easier than that), and bear in mind that this is seriously "expert" level
> stuff using very low level APIs that were really never meant for most
> people to see ... it is very easy to simultaneously shoot yourself in the
> foot while tripping over all the rope Embedded Solr gives you to hang
> yourself with.
> 
> : public synchronized ResultItem[] search(String query) throws
> : CorruptIndexException, IOException{
> :   SolrIndexSearcher searcher = new SolrIndexSearcher(solrCore.getSchema(),
> : "MySearcher", solrCore.getIndexDir(), true);
> :   Hits hits = search(searcher, query);
> :   for(int i =0; i < hits.length(); i++){
> :        parse(hits.doc(i));
> :        //add to result-array
> :   }
> :   searcher.close();
> :   //return result-array
> : }
> : 
> : private Hits search(SolrIndexSearcher searcher, String pQuery){
> :   try {
> :     SolrQueryParser parser = new SolrQueryParser(solrCore.getSchema(), "text");
> : //default search field is called "text"
> :         Query query = parser.parse(pQuery);
> :   	return searcher.search(query);
> :   }
> :   //catch exceptions
> : }
> 
> -Hoss
> 
> 


Re: SolrIndexWriter holding reference to deleted file?

Posted by Erick Erickson <er...@gmail.com>.

"it is very easy to simultaneously shoot yourself in the
foot while tripping over all the rope <insert package here>
gives you to hang yourself with"

I'm writing that one down and posting it on my wall <G>.

Erick

On Jan 6, 2008 3:25 AM, Chris Hostetter <ho...@fucit.org> wrote:

>
> : I haven't been able to get a profiler at the server yet, but I thought I
> : might show how my code works, because it's quite different from the example
> : in the link you provided...
>
> i'm not even sure i really understand the origins of this thread, but
> regardless of what the "main" topic is, regarding the specific topic of
> the code you posted: this is all a very bad idea.
>
> Creating a new searcher for every query is a bad idea.  Using the Hits
> class for any reason is a bad idea.  I say all of this without
> having any idea what "ResultItem" looks like, or what the code in the
> "parse" method does ... they may also be bad ideas.
>
> If you must do "Embedded Solr" then please follow the examples from the
> wiki (as i recall there is even some solrj.embedded code to make it even
> easier than that), and bear in mind that this is seriously "expert" level
> stuff using very low level APIs that were really never meant for most
> people to see ... it is very easy to simultaneously shoot yourself in the
> foot while tripping over all the rope Embedded Solr gives you to hang
> yourself with.
>
> : public synchronized ResultItem[] search(String query) throws
> : CorruptIndexException, IOException{
> :   SolrIndexSearcher searcher = new SolrIndexSearcher(solrCore.getSchema(),
> : "MySearcher", solrCore.getIndexDir(), true);
> :   Hits hits = search(searcher, query);
> :   for(int i =0; i < hits.length(); i++){
> :        parse(hits.doc(i));
> :        //add to result-array
> :   }
> :   searcher.close();
> :   //return result-array
> : }
> :
> : private Hits search(SolrIndexSearcher searcher, String pQuery){
> :   try {
> :       SolrQueryParser parser = new SolrQueryParser(solrCore.getSchema(), "text");
> : //default search field is called "text"
> :         Query query = parser.parse(pQuery);
> :       return searcher.search(query);
> :   }
> :   //catch exceptions
> : }
>
> -Hoss
>

Re: SolrIndexWriter holding reference to deleted file?

Posted by Chris Hostetter <ho...@fucit.org>.

: I haven't been able to get a profiler at the server yet, but I thought I
: might show how my code works, because it's quite different from the example
: in the link you provided...

i'm not even sure i really understand the origins of this thread, but
regardless of what the "main" topic is, regarding the specific topic of
the code you posted: this is all a very bad idea.

Creating a new searcher for every query is a bad idea.  Using the Hits
class for any reason is a bad idea.  I say all of this without
having any idea what "ResultItem" looks like, or what the code in the
"parse" method does ... they may also be bad ideas.

If you must do "Embedded Solr" then please follow the examples from the
wiki (as i recall there is even some solrj.embedded code to make it even
easier than that), and bear in mind that this is seriously "expert" level
stuff using very low level APIs that were really never meant for most
people to see ... it is very easy to simultaneously shoot yourself in the
foot while tripping over all the rope Embedded Solr gives you to hang
yourself with.

: public synchronized ResultItem[] search(String query) throws
: CorruptIndexException, IOException{
:   SolrIndexSearcher searcher = new SolrIndexSearcher(solrCore.getSchema(),
: "MySearcher", solrCore.getIndexDir(), true);
:   Hits hits = search(searcher, query);
:   for(int i =0; i < hits.length(); i++){
:        parse(hits.doc(i));
:        //add to result-array
:   }
:   searcher.close();
:   //return result-array
: }
: 
: private Hits search(SolrIndexSearcher searcher, String pQuery){
:   try {
: 	SolrQueryParser parser = new SolrQueryParser(solrCore.getSchema(), "text");
: //default search field is called "text"
:         Query query = parser.parse(pQuery);
:   	return searcher.search(query);
:   }
:   //catch exceptions
: }

-Hoss