Posted to java-user@lucene.apache.org by Chetan Shah <ch...@gmail.com> on 2009/03/23 18:19:08 UTC

Memory Leak?

I am initiating a simple search, and after profiling my application with
NetBeans I see constant heap consumption and eventually a server (Tomcat)
crash due to an "out of memory" error. The thread count also keeps
increasing, with most of the threads in the "wait" state.

Please let me know what I am doing wrong here so that I can avoid the
server crash. I am using Lucene 2.4.0.


    IndexSearcher indexSearcher = IndexSearcherFactory.getInstance().getIndexSearcher();

    // Create the query and search
    QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer());
    Query query = queryParser.parse(searchCriteria);

    TermsFilter categoryFilter = null;

    // Create the filter if it is needed.
    if (filter != null) {
        Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
        categoryFilter = new TermsFilter();
        for (int i = 0; i < filter.length; i++) {
            aTerm = aTerm.createTerm(filter[i]);
            categoryFilter.addTerm(aTerm);
        }
    }

    // Create sort criteria
    SortField[] sortFields = new SortField[2];
    SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM, SortField.STRING);
    SortField score = SortField.FIELD_SCORE;
    if (sortByWatchList) {
        sortFields[0] = watchList;
        sortFields[1] = score;
    } else {
        sortFields[0] = score;
        sortFields[1] = watchList;
    }
    Sort sort = new Sort(sortFields);

    // Collect results
    TopDocs topDocs = indexSearcher.search(query, categoryFilter, Constants.MAX_HITS, sort);
    ScoreDoc[] scoreDoc = topDocs.scoreDocs;
    int numDocs = scoreDoc.length;
    if (numDocs > 0) results = scoreDoc;

-- 
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22663917.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Is there anything else in this JRE?

65 MB ought to be plenty for what you are trying to do w/ just Lucene,
I think.

Though to differentiate "you are not giving enough RAM to Lucene" from
"you truly have a memory leak", you should try increasing the heap size
to something absurdly big (256 MB?), then see if you can get the OOME
again.  If you do get OOME then it's a real leak, and I think the next
step after that is to get a heap dump to see what's using all the RAM.
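As a quick sanity check before hunting for a leak, you can print the heap the JVM actually received, to confirm that any -Xmx setting reached the Tomcat process. This is a minimal stdlib sketch (the class name is made up, nothing here is Lucene- or Tomcat-specific):

```java
// Prints the maximum heap the JVM was granted and the heap currently in
// use. Useful for confirming that an -Xmx flag passed to Tomcat (for
// example via CATALINA_OPTS) actually reached the JVM running the webapp.
public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxMb = rt.maxMemory() / (1024 * 1024);
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        System.out.println("max heap:  " + maxMb + " MB");
        System.out.println("used heap: " + usedMb + " MB");
    }
}
```

If the printed max heap is ~64 MB despite a larger -Xmx, the flag is not reaching the right JVM.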

Mike



Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
I am using the default heap size, which according to NetBeans is around 65 MB.

If the RAM directory was not initialized correctly, how am I getting valid
search results? I am able to execute searches for quite some time before I
get OOME. 

Does that make sense? Or maybe I am missing something; please let me know.





Re: Memory Leak?

Posted by Matthew Hall <mh...@informatics.jax.org>.
Perhaps this is a simple question, but looking at your stack trace I'm
not seeing where it was set during the Tomcat initialization, so here goes:

Are you setting the JVM's heap size during your Tomcat initialization
somewhere?

If not, that could very well be part of your issue: the default JVM heap
size varies from platform to platform, so your Windows-based installation
of Tomcat simply might not have enough JVM heap available to completely
instantiate your RAMDirectory.

So, to start: what is your heap currently set at for Tomcat?

Secondly, if you increase it to a more reasonable value (say 512M or 1G),
do you still run into this issue?

Matt

Chetan Shah wrote:
> The stack trace is attached.
> http://www.nabble.com/file/p22667542/dump dump 
>
>
> The file size of 
> _30.cfx - 1462KB
> _32.cfs - 3432KB
> _30.cfs - 645KB
>
>
> The source code of WatchListHTMLUtilities.getHTMLTitle is as follows :
>
> 		File f = new File(htmlFileName);
> 		FileInputStream fis = new FileInputStream(f);
> 		org.apache.lucene.demo.html.HTMLParser parser = new HTMLParser(fis);		
> 		String title = parser.getTitle();
> 		fis.close();
> 		fis = null;
> 		f = null;
> 		return title;
>


Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
The stack trace is attached.
http://www.nabble.com/file/p22667542/dump dump 


The file size of 
_30.cfx - 1462KB
_32.cfs - 3432KB
_30.cfs - 645KB








Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Hmm... after how many queries do you see the crash?

Can you post the full OOME stack trace?

You're using a RAMDirectory to hold the entire index... how large is  
your index?

Mike



Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
After reading this forum post:
http://www.nabble.com/Lucene-Memory-Leak-tt19276999.html#a19364866

I created a singleton for StandardAnalyzer too, but the problem still
persists.

I have two singletons now: one for StandardAnalyzer and the other for
IndexSearcher.

The code is as follows:

package watchlistsearch.core;

import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import watchlistsearch.utils.Constants;

public class IndexSearcherFactory {
	
	private static IndexSearcherFactory instance = null;
	
	private IndexSearcher indexSearcher;
	
	private IndexSearcherFactory() {
		
	}
	
	public static IndexSearcherFactory getInstance() {
		
		if (IndexSearcherFactory.instance == null) {			
			IndexSearcherFactory.instance = new IndexSearcherFactory();		
		}
		
		return IndexSearcherFactory.instance;	
		
	}
	
	public IndexSearcher getIndexSearcher() throws IOException {
		
		if (this.indexSearcher == null) {			
			Directory directory = new RAMDirectory(Constants.INDEX_DIRECTORY);
			indexSearcher = new IndexSearcher(directory);						
		}
		
		return this.indexSearcher;		
	}
			
}



package watchlistsearch.core;

import java.io.IOException;

import org.apache.log4j.Logger;
import org.apache.lucene.analysis.standard.StandardAnalyzer;


---------------------------------------------------------------

public class AnalyzerFactory {
	
	private static AnalyzerFactory instance = null;
	
	private StandardAnalyzer standardAnalyzer;
	
	Logger logger = Logger.getLogger(AnalyzerFactory.class);
	
	private AnalyzerFactory() {
		
	}
	
	public static AnalyzerFactory getInstance() {
		
		if (AnalyzerFactory.instance == null) {			
			AnalyzerFactory.instance = new AnalyzerFactory();		
		}
		
		return AnalyzerFactory.instance;	
		
	}
	
	public StandardAnalyzer getStandardAnalyzer() throws IOException {
		
		if (this.standardAnalyzer == null) {
			this.standardAnalyzer = new StandardAnalyzer();
			logger.debug("StandardAnalyzer Initialized..");
			
		}
		
		return this.standardAnalyzer;		
	}
			
}
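One caveat with the lazy null-checks in both factories above: two concurrent servlet requests can race through "if (instance == null)" and construct two instances. A thread-safe alternative, sketched below with a hypothetical class name (the same shape applies to IndexSearcherFactory and AnalyzerFactory), is the initialization-on-demand holder idiom:

```java
// Sketch: thread-safe lazy singleton via the initialization-on-demand
// holder idiom. The JVM guarantees a class is initialized exactly once,
// on first use, so no explicit synchronization is needed.
public class LazySingletonExample {

    private LazySingletonExample() {
    }

    // The nested class is not loaded until getInstance() is first called,
    // so construction stays lazy.
    private static class Holder {
        static final LazySingletonExample INSTANCE = new LazySingletonExample();
    }

    public static LazySingletonExample getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every caller sees the same instance.
        System.out.println(getInstance() == getInstance());
    }
}
```

This avoids both the race and the cost of synchronizing every getInstance() call.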



Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Actually, I was hoping you could try leaving the getHTML calls in, but
increase the heap size of your Tomcat instance.

Ie, to be sure there really is a leak vs you're just not giving the
JRE enough memory.

I do like your hypothesis, but looking at HTMLParser it seems like the
thread should exit after parsing the HTML.  Or, maybe there's
something about the particular HTML documents you're parsing?  I just
tested this test case:

  public void testHTMLParserLeak() throws Exception {
    for (int i = 0; i < 100000; i++) {
      InputStream is = new ByteArrayInputStream("<title>Here</title>".getBytes());
      HTMLParser parser = new HTMLParser(is);
      String title = parser.getTitle();
      assertEquals("Here", title);
      is.close();
    }
  }

And it runs fine and memory seems stable.  Can you try that test case,
but swap in some of your own HTML docs?

Also: can you run "kill -QUIT" on your app to get a full thread dump?
(Hmm I think you may be on windows; I'm not sure what the equivalent
operation is).
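(On Windows, one stdlib alternative to kill -QUIT is a programmatic dump via Thread.getAllStackTraces(); a rough sketch, class name made up:)

```java
import java.util.Map;

// Prints every live thread with its state -- roughly the summary a
// kill -QUIT thread dump gives -- so parser threads stuck in WAITING
// become visible. Could also be dropped into a servlet for a live app.
public class ThreadDumpExample {
    public static void main(String[] args) {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        System.out.println("live threads: " + traces.size());
        for (Map.Entry<Thread, StackTraceElement[]> e : traces.entrySet()) {
            Thread t = e.getKey();
            System.out.println(t.getName() + " [" + t.getState() + "]");
        }
    }
}
```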

Mike

Chetan Shah <ch...@gmail.com> wrote:
>
> Highly appreciate your replies Michael.
>
> No, I don't hit OOME if I comment out the call to getHTMLTitle. The heap
> behaves perfectly.
>
> I completely agree with you; the thread count goes haywire the moment I call
> HTMLParser.getTitle(). I have seen a thread count of around 600 before I hit
> OOME (with the getTitle() call on), and 90% of those threads are in the wait
> state. They are not doing anything, just sitting there forever; I am sure
> they are consuming the heap and never giving it back.
>
> Does my hypothesis make sense?
>
>
>
>
>
>
>
>
> Michael McCandless-2 wrote:
>>
>> Odd.  I don't know of any memory leaks w/ the demo HTMLParser, hmm
>> though it's doing some fairly scary stuff in its getReader() method.
>> EG it spawns a new thread every time you run it.  And, it's parsing
>> the entire HTML document even though you only want the title.
>>
>> You may want to switch to better supported HTMLParsers, eg NekoHTML.
>>
>> Plus, it would be better if you extracted the title during indexing,
>> and stored in the document, than doing all this work at search time.
>> You want CPU at search time to be minimized (think of all the
>> electricity...).
>>
>> But: if you increase the HEAP do you still eventually hit OOME?
>>
>> Mike
>>
>> Chetan Shah <ch...@gmail.com> wrote:
>>>
>>> After some more research I discovered that the following code snippet
>>> seems to be the culprit. I have to call this to get the "title" of the
>>> indexed HTML page. And this is called 10 times, as I display 10 results
>>> on a page.
>>>
>>> Any Suggestions on how to achieve this without the OOME issue.
>>>
>>>
>>>                File f = new File(htmlFileName);
>>>                FileInputStream fis = new FileInputStream(f);
>>>                HTMLParser parser = new HTMLParser(fis);
>>>                String title = parser.getTitle();
>>>                /* the following was added for my sanity :) */
>>>                parser = null;
>>>                fis.close();
>>>                fis = null;
>>>                f = null;
>>>                /* till here */
>>>                return title;
>>>
>>>
>>> Chetan Shah wrote:
>>>>
>>>> I am initiating a simple search and after profiling my application
>>>> using NetBeans. I see a constant heap consumption and eventually a
>>>> server
>>>> (tomcat) crash due to "out of memory" error. The thread count also keeps
>>>> on increasing and most of the threads in "wait" state.
>>>>
>>>> Please let me know what am I doing wrong here so that I can avoid server
>>>> crash. I am using Lucene 2.4.0.
>>>>
>>>>
>>>>                       IndexSearcher indexSearcher =
>>>> IndexSearcherFactory.getInstance().getIndexSearcher();
>>>>
>>>>                       //Create the query and search
>>>>                       QueryParser queryParser = new
>>>> QueryParser("contents", new
>>>> StandardAnalyzer());
>>>>                       Query query = queryParser.parse(searchCriteria);
>>>>
>>>>
>>>>                       TermsFilter categoryFilter = null;
>>>>
>>>>                       // Create the filter if it is needed.
>>>>                       if (filter != null) {
>>>>                               Term aTerm = new
>>>> Term(Constants.WATCH_LIST_TYPE_TERM);
>>>>                               categoryFilter = new TermsFilter();
>>>>                               for (int i = 0; i < filter.length; i++) {
>>>>                                       aTerm =
>>>> aTerm.createTerm(filter[i]);
>>>>                                       categoryFilter.addTerm(aTerm);
>>>>                               }
>>>>                       }
>>>>
>>>>                       // Create sort criteria
>>>>                       SortField [] sortFields = new SortField[2];
>>>>                       SortField watchList = new
>>>> SortField(Constants.WATCH_LIST_TYPE_TERM,
>>>> SortField.STRING);
>>>>                       SortField score = SortField.FIELD_SCORE;
>>>>                       if (sortByWatchList) {
>>>>                               sortFields[0] = watchList;
>>>>                               sortFields[1] = score;
>>>>                       } else {
>>>>                               sortFields[1] = watchList;
>>>>                               sortFields[0] = score;
>>>>
>>>>                       }
>>>>                       Sort sort = new Sort(sortFields);
>>>>
>>>>                       // Collect results
>>>>                       TopDocs topDocs = indexSearcher.search(query,
>>>> categoryFilter,
>>>> Constants.MAX_HITS, sort);
>>>>                       ScoreDoc scoreDoc[] = topDocs.scoreDocs;
>>>>                       int numDocs = scoreDoc.length;
>>>>                       if (numDocs > 0) results = scoreDoc;
>>>>
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Memory-Leak--tp22663917p22685294.html
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22686500.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Paul Smith <ps...@aconex.com>.
>
> No, I don't hit OOME if I comment out the call to getHTMLTitle. The  
> heap
> behaves perfectly.
>
> I completely agree with you; the thread count goes haywire the
> moment I call
> HTMLParser.getTitle(). I have seen a thread count of around 600
> before
> I hit OOME (with the getTitle() call on), and 90% of those threads
> are in
> the wait state. They are not doing anything but just sitting there
> forever; I am
> sure they are consuming heap and never giving it back.


Just FYI, on Linux platforms (and I think Windows) the default stack
size for a thread is 1MB.  600 extra threads is 600MB of virtual
address space.  That's outside the heap, though, so it is unlikely to be
the cause of an actual OutOfMemoryError (if that is actually what you're
seeing; it's not a different sort of memory error, is it?).  Even if
you fix the OOM condition but still have 600 threads lying around,
you're on your way to a serious problem on a 32-bit operating system,
which usually causes a process a horrible death when its virtual size
reaches the magic 3GB mark.  It only takes 3000 threads (just 5x more
than you have), even without any _heap_ usage, to consume enough
virtual address space to reach that cliff with the jagged rocks of
process death below.
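If it helps, the live thread count (and a rough stack-space estimate under the 1 MB default above) can be watched from inside the JVM with the standard management API; a small JDK-only sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Print the current and peak live-thread counts, plus a rough estimate
// of stack address space assuming the ~1 MB-per-thread default (-Xss).
public class ThreadWatch {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int live = mx.getThreadCount();
        System.out.println(live + " live threads, ~" + live
                + " MB of stack address space (peak: "
                + mx.getPeakThreadCount() + " threads)");
    }
}
```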

Hope that helps too.

cheers,

Paul

Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
Highly appreciate your replies, Michael.

No, I don't hit OOME if I comment out the call to getHTMLTitle. The heap
behaves perfectly. 

I completely agree with you; the thread count goes haywire the moment I call
HTMLParser.getTitle(). I have seen a thread count of around 600 before
I hit OOME (with the getTitle() call on), and 90% of those threads are in
the wait state. They are not doing anything but just sitting there forever; I am
sure they are consuming heap and never giving it back.

Does my hypothesis make sense?








Michael McCandless-2 wrote:
> 
> Odd.  I don't know of any memory leaks w/ the demo HTMLParser, hmm
> though it's doing some fairly scary stuff in its getReader() method.
> EG it spawns a new thread every time you run it.  And, it's parsing
> the entire HTML document even though you only want the title.
> 
> You may want to switch to better supported HTMLParsers, eg NekoHTML.
> 
> Plus, it would be better if you extracted the title during indexing
> and stored it in the document, rather than doing all this work at search time.
> You want CPU at search time to be minimized (think of all the
> electricity...).
> 
> But: if you increase the HEAP do you still eventually hit OOME?
> 
> Mike
> 
> Chetan Shah <ch...@gmail.com> wrote:
>>
>> After some more research I discovered that the following code snippet
>> seems to be the culprit. I have to call this to get the "title" of the
>> indexed HTML page. And this is called 10 times, as I display 10 results
>> on a page.
>>
>> Any Suggestions on how to achieve this without the OOME issue.
>>
>>
>>                File f = new File(htmlFileName);
>>                FileInputStream fis = new FileInputStream(f);
>>                HTMLParser parser = new HTMLParser(fis);
>>                String title = parser.getTitle();
>>                /* the following was added for my sanity :) */
>>                parser = null;
>>                fis.close();
>>                fis = null;
>>                f = null;
>>                /* till here */
>>                return title;
>>
>>
>> Chetan Shah wrote:
>>>
>>> I am initiating a simple search and after profiling my application
>>> using NetBeans. I see a constant heap consumption and eventually a
>>> server
>>> (tomcat) crash due to "out of memory" error. The thread count also keeps
>>> on increasing and most of the threads in "wait" state.
>>>
>>> Please let me know what am I doing wrong here so that I can avoid server
>>> crash. I am using Lucene 2.4.0.
>>>
>>>
>>>                       IndexSearcher indexSearcher =
>>> IndexSearcherFactory.getInstance().getIndexSearcher();
>>>
>>>                       //Create the query and search
>>>                       QueryParser queryParser = new
>>> QueryParser("contents", new
>>> StandardAnalyzer());
>>>                       Query query = queryParser.parse(searchCriteria);
>>>
>>>
>>>                       TermsFilter categoryFilter = null;
>>>
>>>                       // Create the filter if it is needed.
>>>                       if (filter != null) {
>>>                               Term aTerm = new
>>> Term(Constants.WATCH_LIST_TYPE_TERM);
>>>                               categoryFilter = new TermsFilter();
>>>                               for (int i = 0; i < filter.length; i++) {
>>>                                       aTerm =
>>> aTerm.createTerm(filter[i]);
>>>                                       categoryFilter.addTerm(aTerm);
>>>                               }
>>>                       }
>>>
>>>                       // Create sort criteria
>>>                       SortField [] sortFields = new SortField[2];
>>>                       SortField watchList = new
>>> SortField(Constants.WATCH_LIST_TYPE_TERM,
>>> SortField.STRING);
>>>                       SortField score = SortField.FIELD_SCORE;
>>>                       if (sortByWatchList) {
>>>                               sortFields[0] = watchList;
>>>                               sortFields[1] = score;
>>>                       } else {
>>>                               sortFields[1] = watchList;
>>>                               sortFields[0] = score;
>>>
>>>                       }
>>>                       Sort sort = new Sort(sortFields);
>>>
>>>                       // Collect results
>>>                       TopDocs topDocs = indexSearcher.search(query,
>>> categoryFilter,
>>> Constants.MAX_HITS, sort);
>>>                       ScoreDoc scoreDoc[] = topDocs.scoreDocs;
>>>                       int numDocs = scoreDoc.length;
>>>                       if (numDocs > 0) results = scoreDoc;
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Memory-Leak--tp22663917p22685294.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22686500.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Odd.  I don't know of any memory leaks w/ the demo HTMLParser, hmm
though it's doing some fairly scary stuff in its getReader() method.
EG it spawns a new thread every time you run it.  And, it's parsing
the entire HTML document even though you only want the title.

You may want to switch to better supported HTMLParsers, eg NekoHTML.

Plus, it would be better if you extracted the title during indexing
and stored it in the document, rather than doing all this work at search time.
You want CPU at search time to be minimized (think of all the
electricity...).
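A sketch of that index-time approach against the Lucene 2.4 API (the field name "title" and the helper method shown are illustrative, not from the original code; the title string would come from whatever HTML parser you run during indexing):

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// Index one HTML file, storing its already-extracted title so that
// search-time display needs no re-parsing of the HTML.
public class TitleIndexer {
    static void indexHtmlFile(IndexWriter writer, File htmlFile, String title)
            throws IOException {
        Document doc = new Document();
        // body: tokenized and indexed from a Reader, not stored
        doc.add(new Field("contents", new FileReader(htmlFile)));
        // title: stored verbatim for display, not indexed
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));
        writer.addDocument(doc);
    }
}
```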

But: if you increase the HEAP do you still eventually hit OOME?

Mike

Chetan Shah <ch...@gmail.com> wrote:
>
> After some more research I discovered that the following code snippet
> seems to be the culprit. I have to call this to get the "title" of the
> indexed HTML page. And this is called 10 times, as I display 10 results on
> a page.
>
> Any Suggestions on how to achieve this without the OOME issue.
>
>
>                File f = new File(htmlFileName);
>                FileInputStream fis = new FileInputStream(f);
>                HTMLParser parser = new HTMLParser(fis);
>                String title = parser.getTitle();
>                /* the following was added for my sanity :) */
>                parser = null;
>                fis.close();
>                fis = null;
>                f = null;
>                /* till here */
>                return title;
>
>
> Chetan Shah wrote:
>>
>> I am initiating a simple search and after profiling my application
>> using NetBeans. I see a constant heap consumption and eventually a server
>> (tomcat) crash due to "out of memory" error. The thread count also keeps
>> on increasing and most of the threads in "wait" state.
>>
>> Please let me know what am I doing wrong here so that I can avoid server
>> crash. I am using Lucene 2.4.0.
>>
>>
>>                       IndexSearcher indexSearcher =
>> IndexSearcherFactory.getInstance().getIndexSearcher();
>>
>>                       //Create the query and search
>>                       QueryParser queryParser = new QueryParser("contents", new
>> StandardAnalyzer());
>>                       Query query = queryParser.parse(searchCriteria);
>>
>>
>>                       TermsFilter categoryFilter = null;
>>
>>                       // Create the filter if it is needed.
>>                       if (filter != null) {
>>                               Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
>>                               categoryFilter = new TermsFilter();
>>                               for (int i = 0; i < filter.length; i++) {
>>                                       aTerm = aTerm.createTerm(filter[i]);
>>                                       categoryFilter.addTerm(aTerm);
>>                               }
>>                       }
>>
>>                       // Create sort criteria
>>                       SortField [] sortFields = new SortField[2];
>>                       SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
>> SortField.STRING);
>>                       SortField score = SortField.FIELD_SCORE;
>>                       if (sortByWatchList) {
>>                               sortFields[0] = watchList;
>>                               sortFields[1] = score;
>>                       } else {
>>                               sortFields[1] = watchList;
>>                               sortFields[0] = score;
>>
>>                       }
>>                       Sort sort = new Sort(sortFields);
>>
>>                       // Collect results
>>                       TopDocs topDocs = indexSearcher.search(query, categoryFilter,
>> Constants.MAX_HITS, sort);
>>                       ScoreDoc scoreDoc[] = topDocs.scoreDocs;
>>                       int numDocs = scoreDoc.length;
>>                       if (numDocs > 0) results = scoreDoc;
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22685294.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
After some more research I discovered that the following code snippet
seems to be the culprit. I have to call this to get the "title" of the
indexed HTML page. And this is called 10 times, as I display 10 results
on a page.

Any Suggestions on how to achieve this without the OOME issue.


		File f = new File(htmlFileName);
		FileInputStream fis = new FileInputStream(f);
		HTMLParser parser = new HTMLParser(fis);
		String title = parser.getTitle();
		/* the following was added for my sanity :) */
		parser = null;
		fis.close();
		fis = null;
		f = null;
		/* till here */
		return title;
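One small hazard in the snippet above: if getTitle() throws, the stream is never closed. A try/finally guarantees the close (this is only a sketch, and it does not address the background thread the demo parser spawns):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.lucene.demo.html.HTMLParser;

public class TitleReader {
    // Read the <title> of an HTML file, closing the stream even on error.
    static String readTitle(String htmlFileName)
            throws IOException, InterruptedException {
        FileInputStream fis = new FileInputStream(new File(htmlFileName));
        try {
            return new HTMLParser(fis).getTitle();
        } finally {
            fis.close();  // runs whether or not getTitle() throws
        }
    }
}
```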


Chetan Shah wrote:
> 
> I am initiating a simple search and after profiling the my application
> using NetBeans. I see a constant heap consumption and eventually a server
> (tomcat) crash due to "out of memory" error. The thread count also keeps
> on increasing and most of the threads in "wait" state. 
> 
> Please let me know what am I doing wrong here so that I can avoid server
> crash. I am using Lucene 2.4.0.
> 
> 
> 			IndexSearcher indexSearcher =
> IndexSearcherFactory.getInstance().getIndexSearcher();										
> 			
> 			//Create the query and search
> 			QueryParser queryParser = new QueryParser("contents", new
> StandardAnalyzer());
> 			Query query = queryParser.parse(searchCriteria);
> 			
> 			
> 			TermsFilter categoryFilter = null;
> 			
> 			// Create the filter if it is needed.
> 			if (filter != null) {		
> 				Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
> 				categoryFilter = new TermsFilter();
> 				for (int i = 0; i < filter.length; i++) {				
> 					aTerm = aTerm.createTerm(filter[i]);
> 					categoryFilter.addTerm(aTerm);
> 				}
> 			}
> 			
> 			// Create sort criteria
> 			SortField [] sortFields = new SortField[2];
> 			SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
> SortField.STRING);
> 			SortField score = SortField.FIELD_SCORE;
> 			if (sortByWatchList) {
> 				sortFields[0] = watchList;
> 				sortFields[1] = score;
> 			} else {
> 				sortFields[1] = watchList;
> 				sortFields[0] = score;				
> 				
> 			}
> 			Sort sort = new Sort(sortFields);
> 			
> 			// Collect results
> 			TopDocs topDocs = indexSearcher.search(query, categoryFilter,
> Constants.MAX_HITS, sort);
> 			ScoreDoc scoreDoc[] = topDocs.scoreDocs;
> 			int numDocs = scoreDoc.length;
> 			if (numDocs > 0) results = scoreDoc;	
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22685294.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
OK thanks for bringing closure.

Mike

On Thu, Mar 26, 2009 at 8:37 AM, Chetan Shah <ch...@gmail.com> wrote:
>
> Ok. I was able to conclude that I am getting OOME due to my use of the HTML
> parser to get the HTML title and HTML text. I display 10 results per page
> and therefore end up calling org.apache.lucene.demo.html.HTMLParser 10
> times.
>
> I modified my code to store the title and html summary in the index itself
> and found out that the OOME problem is gone.
>
> I tested this with 256MB heap size.
>
> Thank you all for your valuable advice and help.
>
>
>
> Chetan Shah wrote:
>>
>> I am initiating a simple search and after profiling my application
>> using NetBeans. I see a constant heap consumption and eventually a server
>> (tomcat) crash due to "out of memory" error. The thread count also keeps
>> on increasing and most of the threads in "wait" state.
>>
>> Please let me know what am I doing wrong here so that I can avoid server
>> crash. I am using Lucene 2.4.0.
>>
>>
>>                       IndexSearcher indexSearcher =
>> IndexSearcherFactory.getInstance().getIndexSearcher();
>>
>>                       //Create the query and search
>>                       QueryParser queryParser = new QueryParser("contents", new
>> StandardAnalyzer());
>>                       Query query = queryParser.parse(searchCriteria);
>>
>>
>>                       TermsFilter categoryFilter = null;
>>
>>                       // Create the filter if it is needed.
>>                       if (filter != null) {
>>                               Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
>>                               categoryFilter = new TermsFilter();
>>                               for (int i = 0; i < filter.length; i++) {
>>                                       aTerm = aTerm.createTerm(filter[i]);
>>                                       categoryFilter.addTerm(aTerm);
>>                               }
>>                       }
>>
>>                       // Create sort criteria
>>                       SortField [] sortFields = new SortField[2];
>>                       SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
>> SortField.STRING);
>>                       SortField score = SortField.FIELD_SCORE;
>>                       if (sortByWatchList) {
>>                               sortFields[0] = watchList;
>>                               sortFields[1] = score;
>>                       } else {
>>                               sortFields[1] = watchList;
>>                               sortFields[0] = score;
>>
>>                       }
>>                       Sort sort = new Sort(sortFields);
>>
>>                       // Collect results
>>                       TopDocs topDocs = indexSearcher.search(query, categoryFilter,
>> Constants.MAX_HITS, sort);
>>                       ScoreDoc scoreDoc[] = topDocs.scoreDocs;
>>                       int numDocs = scoreDoc.length;
>>                       if (numDocs > 0) results = scoreDoc;
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22721161.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
Ok. I was able to conclude that I am getting OOME due to my use of the HTML
parser to get the HTML title and HTML text. I display 10 results per page
and therefore end up calling org.apache.lucene.demo.html.HTMLParser 10
times.

I modified my code to store the title and html summary in the index itself
and found out that the OOME problem is gone. 

I tested this with 256MB heap size.

Thank you all for your valuable advice and help.
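The search-time half of that fix looks roughly like this in the Lucene 2.4 API (a sketch; the "title" field name is illustrative and assumes the title was stored at index time with Field.Store.YES):

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;

public class TitleFetcher {
    // Read the stored "title" field for each hit instead of re-parsing
    // the HTML file on every result page.
    static String[] titles(IndexSearcher searcher, ScoreDoc[] hits)
            throws IOException {
        String[] out = new String[hits.length];
        for (int i = 0; i < hits.length; i++) {
            Document doc = searcher.doc(hits[i].doc);  // loads stored fields
            out[i] = doc.get("title");                 // null if not stored
        }
        return out;
    }
}
```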



Chetan Shah wrote:
> 
> I am initiating a simple search and after profiling my application
> using NetBeans. I see a constant heap consumption and eventually a server
> (tomcat) crash due to "out of memory" error. The thread count also keeps
> on increasing and most of the threads in "wait" state. 
> 
> Please let me know what am I doing wrong here so that I can avoid server
> crash. I am using Lucene 2.4.0.
> 
> 
> 			IndexSearcher indexSearcher =
> IndexSearcherFactory.getInstance().getIndexSearcher();										
> 			
> 			//Create the query and search
> 			QueryParser queryParser = new QueryParser("contents", new
> StandardAnalyzer());
> 			Query query = queryParser.parse(searchCriteria);
> 			
> 			
> 			TermsFilter categoryFilter = null;
> 			
> 			// Create the filter if it is needed.
> 			if (filter != null) {		
> 				Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
> 				categoryFilter = new TermsFilter();
> 				for (int i = 0; i < filter.length; i++) {				
> 					aTerm = aTerm.createTerm(filter[i]);
> 					categoryFilter.addTerm(aTerm);
> 				}
> 			}
> 			
> 			// Create sort criteria
> 			SortField [] sortFields = new SortField[2];
> 			SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
> SortField.STRING);
> 			SortField score = SortField.FIELD_SCORE;
> 			if (sortByWatchList) {
> 				sortFields[0] = watchList;
> 				sortFields[1] = score;
> 			} else {
> 				sortFields[1] = watchList;
> 				sortFields[0] = score;				
> 				
> 			}
> 			Sort sort = new Sort(sortFields);
> 			
> 			// Collect results
> 			TopDocs topDocs = indexSearcher.search(query, categoryFilter,
> Constants.MAX_HITS, sort);
> 			ScoreDoc scoreDoc[] = topDocs.scoreDocs;
> 			int numDocs = scoreDoc.length;
> 			if (numDocs > 0) results = scoreDoc;	
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22721161.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Chetan Shah <ch...@gmail.com>.
No, I have a singleton from which I get my searcher, and it is kept
throughout the application.
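For context, such a factory typically looks something like the following (a sketch only; the class and method names match the code quoted in this thread, but the body is a guessed typical implementation, not the poster's actual code):

```java
import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;

// A process-wide holder that opens one IndexSearcher lazily and keeps
// it for the life of the application.  The index path is illustrative.
public class IndexSearcherFactory {
    private static final IndexSearcherFactory INSTANCE = new IndexSearcherFactory();
    private IndexSearcher searcher;

    private IndexSearcherFactory() {}

    public static IndexSearcherFactory getInstance() {
        return INSTANCE;
    }

    public synchronized IndexSearcher getIndexSearcher() throws IOException {
        if (searcher == null) {
            searcher = new IndexSearcher("/path/to/index");  // illustrative path
        }
        return searcher;
    }
}
```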



Michael McCandless-2 wrote:
> 
> 
> Are you not closing the IndexSearcher?
> 
> Mike
> 
> Chetan Shah wrote:
> 
>>
>> I am initiating a simple search and after profiling my
>> application using
>> NetBeans. I see a constant heap consumption and eventually a server  
>> (tomcat)
>> crash due to "out of memory" error. The thread count also keeps on
>> increasing and most of the threads in "wait" state.
>>
>> Please let me know what am I doing wrong here so that I can avoid  
>> server
>> crash. I am using Lucene 2.4.0.
>>
>>
>> 			IndexSearcher indexSearcher =
>> IndexSearcherFactory.getInstance().getIndexSearcher();										
>> 			
>> 			//Create the query and search
>> 			QueryParser queryParser = new QueryParser("contents", new
>> StandardAnalyzer());
>> 			Query query = queryParser.parse(searchCriteria);
>> 			
>> 			
>> 			TermsFilter categoryFilter = null;
>> 			
>> 			// Create the filter if it is needed.
>> 			if (filter != null) {		
>> 				Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
>> 				categoryFilter = new TermsFilter();
>> 				for (int i = 0; i < filter.length; i++) {				
>> 					aTerm = aTerm.createTerm(filter[i]);
>> 					categoryFilter.addTerm(aTerm);
>> 				}
>> 			}
>> 			
>> 			// Create sort criteria
>> 			SortField [] sortFields = new SortField[2];
>> 			SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
>> SortField.STRING);
>> 			SortField score = SortField.FIELD_SCORE;
>> 			if (sortByWatchList) {
>> 				sortFields[0] = watchList;
>> 				sortFields[1] = score;
>> 			} else {
>> 				sortFields[1] = watchList;
>> 				sortFields[0] = score;				
>> 				
>> 			}
>> 			Sort sort = new Sort(sortFields);
>> 			
>> 			// Collect results
>> 			TopDocs topDocs = indexSearcher.search(query, categoryFilter,
>> Constants.MAX_HITS, sort);
>> 			ScoreDoc scoreDoc[] = topDocs.scoreDocs;
>> 			int numDocs = scoreDoc.length;
>> 			if (numDocs > 0) results = scoreDoc;	
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Memory-Leak--tp22663917p22663917.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22666060.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Memory Leak?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Are you not closing the IndexSearcher?

Mike

Chetan Shah wrote:

>
> I am initiating a simple search and after profiling my
> application using
> NetBeans. I see a constant heap consumption and eventually a server  
> (tomcat)
> crash due to "out of memory" error. The thread count also keeps on
> increasing and most of the threads in "wait" state.
>
> Please let me know what am I doing wrong here so that I can avoid  
> server
> crash. I am using Lucene 2.4.0.
>
>
> 			IndexSearcher indexSearcher =
> IndexSearcherFactory.getInstance().getIndexSearcher();										
> 			
> 			//Create the query and search
> 			QueryParser queryParser = new QueryParser("contents", new
> StandardAnalyzer());
> 			Query query = queryParser.parse(searchCriteria);
> 			
> 			
> 			TermsFilter categoryFilter = null;
> 			
> 			// Create the filter if it is needed.
> 			if (filter != null) {		
> 				Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
> 				categoryFilter = new TermsFilter();
> 				for (int i = 0; i < filter.length; i++) {				
> 					aTerm = aTerm.createTerm(filter[i]);
> 					categoryFilter.addTerm(aTerm);
> 				}
> 			}
> 			
> 			// Create sort criteria
> 			SortField [] sortFields = new SortField[2];
> 			SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM,
> SortField.STRING);
> 			SortField score = SortField.FIELD_SCORE;
> 			if (sortByWatchList) {
> 				sortFields[0] = watchList;
> 				sortFields[1] = score;
> 			} else {
> 				sortFields[1] = watchList;
> 				sortFields[0] = score;				
> 				
> 			}
> 			Sort sort = new Sort(sortFields);
> 			
> 			// Collect results
> 			TopDocs topDocs = indexSearcher.search(query, categoryFilter,
> Constants.MAX_HITS, sort);
> 			ScoreDoc scoreDoc[] = topDocs.scoreDocs;
> 			int numDocs = scoreDoc.length;
> 			if (numDocs > 0) results = scoreDoc;	
>
> -- 
> View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22663917.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org