Posted to java-user@lucene.apache.org by Jan Philipp Seng <jp...@gmx.de> on 2005/08/02 15:30:57 UTC

how to free memory after the index is built.

I am using the Lucene 1.4.3 API. After building an index over 150000
documents (~250 MB of data), Lucene does not free the memory used during
indexing. The searcher runs as a servlet under Tomcat. Every time the
index is rebuilt, the indexing process consumes free memory, so after ten
runs the memory is completely full. I tried calling the garbage collector
explicitly, but that does not help.
I build the index on disk and load it into a RAMDirectory afterwards.
No reference to my indexer remains after indexing, and I cannot find a
reason why the garbage collector does not free the memory.
Here are the important parts of my code, reduced to the central
functionality. Do you have any idea what kind of problem this could be?
An answer would help me a lot.


IndexTablesDaemon starts the indexing process:
IndexTablesDaemon.run():
-----------------------------
while (true) {
  IndexTables indexer = new IndexTables();
  indexer.indexTables();       // index if necessary, writing the new index to disk
  indexer = null;              // intended to free the memory; does not help
  FTS.initNewIndex();          // swap the old index for the new one
  Runtime.getRuntime().gc();   // explicit garbage collector request; does not help
  Thread.sleep(lWait);         // wait a fixed period of time
}

		
		
Building the index: querying a MySQL database, building a Lucene
Document from the MySQL data, and indexing it:
IndexTables.indexTables():
--------------------------
  IndexWriter writer = new IndexWriter(PATHNAME_INDEX_NEW,
    new TTAnalyser(), true);

  writer.mergeFactor = 250;
  writer.minMergeDocs = 250;

  Document doc = null;

  String sQuery = "SELECT columns FROM table";

  Connection conn = DriverManager.getConnection("jdbc:mysql: ...");
  Statement stmt = conn.createStatement();
  ResultSet rs = stmt.executeQuery(sQuery);

  while (rs.next()) {
    doc = getTTDocument();
    // fill doc with fields from the database query
    writer.addDocument(doc);
  }

  writer.optimize();
  writer.close();
  writer = null;              // intended to free memory; does not help

  Runtime.getRuntime().gc();  // explicit garbage collector request; does not help
}
	
	
		
Exchanging the old and the new index for queries to the Lucene index.
FTS.initNewIndex():
-------------------
  File oldIndex = new File(sPathIndex);
  File newIndex = new File(sPathIndexNew);

  // delete the old index from disk and
  // let the new index become the operating index
  oldIndex.delete();
  newIndex.renameTo(oldIndex);

  // load the new index from disk into the RAMDirectory
  SearchTables.reloadIndex();
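One detail worth noting in the snippet above: File.delete() only removes plain files and empty directories, so oldIndex.delete() will fail while index files are still inside the directory. A minimal recursive helper could look like this (a sketch; IndexDirUtil and deleteIndexDir are made-up names, not part of the posted code):

```java
import java.io.File;

public class IndexDirUtil {
    /**
     * Recursively delete a directory and everything in it.
     * Returns true only if every delete succeeded.
     */
    public static boolean deleteIndexDir(File dir) {
        File[] children = dir.listFiles();   // null if dir is a plain file
        if (children != null) {
            for (int i = 0; i < children.length; i++) {
                if (!deleteIndexDir(children[i])) {
                    return false;            // stop on the first failure
                }
            }
        }
        return dir.delete();                 // now empty, so delete() succeeds
    }
}
```

With such a helper, the swap would start with deleteIndexDir(oldIndex) instead of oldIndex.delete().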
				
			
Thanks for your help,

Jan Philipp Seng, Germany, Aachen

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: how to free memory after the index is built.

Posted by Jan Philipp Seng <jp...@gmx.de>.
Hi Otis,

thank you for your tips so far. I will use a profiler to find out
where the problem is. All your hints were helpful. I think the
problem is the connection to MySQL, not the indexer, so I will look
for a newsgroup on the connector mysql-connector-java-3.1.10-bin.jar.
Or do you have any experience with memory leaks in this connector,
and how to avoid them?

Bye bye,

Jan

> However, i see a few problems in your code.
> 1) you should take the JDBC code for getting the connection and
> creation of an SQL statement out of that method, so it is not called
> repeatedly - you can reuse the same connection!
> 
> 2) I don't see the code to close your statement, connection, and
> ResultSet.  Those typically go to a finally block.

I have a finalize method that releases the MySQL resources in the reverse
order of their creation, but I left it out to keep my example compact.

> 
> 3) mergeFactor looks suspiciously high.
I have measured the speed of indexing with different values for
mergeFactor.
  writer.mergeFactor = 250;
  writer.minMergeDocs = 250;
produces the highest speed for me. I have enough memory for doing this,
but I need to reuse that memory on the next index run ;-)

> 
> 4) You're opening IndexWriter, optimizing the index and closing it a lot. 
> Do you really need to optimize it that often?
I do this once a day. I don't want to leave the connection open for
about 23:45 hours (indexing takes only 15 minutes for 150000 docs).
Do you think it is a problem to load the driver over and over again?

conn = DriverManager.getConnection("jdbc:mysql://...");




Re: how to free memory after the index is built.

Posted by Chris Lu <ch...@gmail.com>.
If your SQL data volume is large, try switching to the latest JDBC
driver from MySQL.

mysql-connector-java-3.2.0-alpha-bin.jar

More details are in the lower part of this page:
http://wiki.dbsight.com/index.php?title=JDBC_Driver
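A related, documented pitfall: MySQL Connector/J reads an entire ResultSet into client memory by default, which matches the symptoms in this thread. The driver can be told to stream rows one at a time instead. This fragment assumes an already-open java.sql.Connection named conn; the behavior is Connector/J-specific, not general JDBC:

```java
// Connector/J only streams results with a forward-only, read-only
// statement and a fetch size of Integer.MIN_VALUE; otherwise the
// whole result set is buffered in memory.
Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                      ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
ResultSet rs = stmt.executeQuery("SELECT columns FROM table");
```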

-- 
Chris Lu
------------
Lucene Search on Any Database
http://www.dbsight.net



Re: how to free memory after the index is built.

Posted by Richard Krenek <ri...@gmail.com>.
Try doing that in reverse order:

rs.close();
rs = null;
stmnt.close();
stmnt = null;
conn.close();
conn = null;

I usually add one more step, just to be safe.

try {rs.close();} catch (Exception ignore) {}
rs = null;
try {stmnt.close();} catch (Exception ignore) {}
stmnt = null;
try {conn.close();} catch (Exception ignore) {}
conn = null;
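The repeated try/catch blocks above can be folded into one small helper. This is a sketch: closeQuietly is a made-up name, not a JDBC API, and it relies on AutoCloseable, which ResultSet, Statement, and Connection all implement in modern Java (7 and later):

```java
public class JdbcUtil {
    /** Close a resource, swallowing any exception; safe to call with null. */
    public static void closeQuietly(AutoCloseable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (Exception ignore) {
            // a failure while closing is deliberately ignored
        }
    }
}
```

The finally block then shrinks to three closeQuietly calls, in the reverse order of creation: rs first, then stmt, then conn.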

On 8/3/05, Jan Philipp Seng <jp...@gmx.de> wrote:
> Constructor: inits the members:
>   Connection conn
>   Statement stmt
>   ResultSet rs
> finally block:
>   conn.close();
>   conn = null;
>   stmt.close();
>   stmt = null;
>   rs.close();
>   rs = null;



Re: how to free memory after the index is built.

Posted by Jan Philipp Seng <jp...@gmx.de>.
> : 2) I don't see the code to close your statement, connection, and
> : ResultSet.  Those typically go to a finally block.
> 
> I'm 85% sure that's the memory leak right there... in the absence of a
> good memory profiler, have you tried commenting out all of the Lucene
> related code, to make sure that your basic DB Data retrieval code doesn't
> leak memory?
> 
> I'm guessing that without the Lucene code it won't run out of RAM as fast
> (because there won't be a RAMDirectory index taking up space), but you
> should still see your free memory steadily decrease.

Thank you Hoss,

I left some of my code out to keep it compact. There is a finally block
that does exactly this.
But you are completely right. I commented out all Lucene-related code, and
the free memory still decreases with every indexing run. I close the
connection, but that does not free the memory. Do you have another hint for
what I can do about the memory leak in the connector to the MySQL database?
I am using mysql-connector-java-3.1.10-bin.jar.

Constructor: inits the members:
  Connection conn
  Statement stmt
  ResultSet rs
finally block:
  conn.close();
  conn = null;
  stmt.close();
  stmt = null;
  rs.close();
  rs = null;

Bye for now.

Jan



Re: how to free memory after the index is built.

Posted by Chris Hostetter <ho...@fucit.org>.
: 2) I don't see the code to close your statement, connection, and
: ResultSet.  Those typically go to a finally block.

I'm 85% sure that's the memory leak right there... in the absence of a
good memory profiler, have you tried commenting out all of the Lucene-related
code, to make sure that your basic DB data retrieval code doesn't
leak memory?

I'm guessing that without the Lucene code it won't run out of RAM as fast
(because there won't be a RAMDirectory index taking up space), but you
should still see your free memory steadily decrease.



-Hoss




Re: how to free memory after the index is built.

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Jan,

I don't know where your memory goes - it could be any number of things.
For instance, somebody mentioned recently that some MySQL JDBC drivers
have known memory leaks. To figure out where the memory leak is, and
what's consuming your RAM, run your application under a profiler
(OptimizeIt, JProfiler...).

However, I see a few problems in your code.
1) You should take the JDBC code for getting the connection and
creating the SQL statement out of that method, so it is not called
repeatedly - you can reuse the same connection!

2) I don't see the code that closes your statement, connection, and
ResultSet. Those typically go in a finally block.

3) mergeFactor looks suspiciously high.

4) You're opening an IndexWriter, optimizing the index, and closing it
a lot. Do you really need to optimize it that often?
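Point 1) can be sketched as a structure where the connection-like resource is opened once and handed to every indexing run. All names here are hypothetical; Resource stands in for java.sql.Connection so the sketch runs without a database:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionReuse {
    // Counts how often a "connection" is opened, to make the point measurable.
    static final AtomicInteger opens = new AtomicInteger();

    static class Resource implements AutoCloseable {
        Resource() { opens.incrementAndGet(); }
        public void close() { }
    }

    // One indexing run: it uses the resource handed to it
    // and never opens its own.
    static void indexRun(Resource conn) {
        // ... query the database via conn and feed documents to Lucene ...
    }

    // Daemon loop: a single resource serves every run.
    public static int runDaemon(int runs) throws Exception {
        try (Resource conn = new Resource()) {
            for (int i = 0; i < runs; i++) {
                indexRun(conn);
            }
        }
        return opens.get();   // how many "connections" were actually opened
    }
}
```

Ten runs still open only one connection, where the posted loop would have opened ten.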

The gc() call you are making is just a suggestion for the JVM - "now
may be a good time to consider running GC".  The JVM may ignore this
suggestion.

Here is a handy method:

    private static long gc()
    {
        long freeMemBefore = Runtime.getRuntime().freeMemory();
        System.out.println("Free Memory Before: " + freeMemBefore);
        System.gc();
        try {
            Thread.sleep(1000);
            System.runFinalization();
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.gc();
        long freeMemAfter = Runtime.getRuntime().freeMemory();
        System.out.println("Total Memory      : " + Runtime.getRuntime().totalMemory());
        System.out.println("Max Memory        : " + Runtime.getRuntime().maxMemory());
        System.out.println("Free Memory After : " + freeMemAfter);
        return freeMemBefore - freeMemAfter;
    }


Otis

