Posted to dev@lucene.apache.org by "Mirza Hadzic (JIRA)" <ji...@apache.org> on 2007/12/17 22:41:43 UTC

[jira] Created: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Big IndexWriter memory leak: when Field.Index.TOKENIZED
-------------------------------------------------------

                 Key: LUCENE-1091
                 URL: https://issues.apache.org/jira/browse/LUCENE-1091
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.2
         Environment: Ubuntu Linux 7.10, 32-bit
Java 1.6.0 build 1.6.0_03-b05 (default in Ubuntu 7.10)
1GB RAM
            Reporter: Mirza Hadzic


This little program consumes an additional 2MB of virtual RAM for every 1000 documents indexed, but only when Field.Index.TOKENIZED is used:

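// Build a one-field document; the field is tokenized and indexed but not stored.
public Document getDoc() {
   Document document = new Document();
   document.add(new Field("foo", "foo bar", Field.Store.NO, Field.Index.TOKENIZED));
   return document;
}

// Index one million documents. Declared void because nothing is returned;
// the signature as first posted returned Document, which does not compile.
public void run() throws IOException {
   IndexWriter writer = new IndexWriter(new File(jIndexFileName), new StandardAnalyzer(), true);
   try {
      for (int i = 0; i < 1000000; i++) {
         writer.addDocument(getDoc());
      }
   } finally {
      writer.close();   // flush buffered documents and release the index lock
   }
}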


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

[jira] Updated: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Hadzic updated LUCENE-1091:
---------------------------------

    Attachment: lucene.txt


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552744 ] 

Grant Ingersoll commented on LUCENE-1091:
-----------------------------------------

What are your settings for heap size?  Are you actually getting an OutOfMemoryError?  Is it possible that garbage collection isn't running because it doesn't need to?  I would suggest hooking up a profiler.


[jira] Resolved: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen resolved LUCENE-1091.
---------------------------------

    Resolution: Invalid

Not a Lucene bug...


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552707 ] 

Doron Cohen commented on LUCENE-1091:
-------------------------------------

Hi Mirza, The log you attached indicates that Java's total memory consumption never reaches 5.6 MB. You can see this by the last line printed:
*   after adding 465000 docs, mem:: tot:5.5296 , free:2.33572 , used:3.19388

So it seems you are getting the information on how much memory is consumed from elsewhere?
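
For readers following along, a minimal sketch of how figures like the tot/free/used values in that log line can be read from inside the JVM. TestOOM.java itself is not reproduced in this thread, so the class name HeapStats, its method names, and the choice of 1024*1024 bytes per MB are assumptions made for illustration only. The one relationship taken from the log line is used = tot - free.

```java
// Sketch only: HeapStats is a hypothetical helper, not code from TestOOM.java.
final class HeapStats {
    // Assumption: MB means 1024 * 1024 bytes; the log in the thread does not say.
    private static final double MB = 1024.0 * 1024.0;

    /** Heap currently reserved by the JVM, in MB. */
    static double totalMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    /** Unused part of the reserved heap, in MB. */
    static double freeMb() {
        return Runtime.getRuntime().freeMemory() / MB;
    }

    /** Formats one progress line; used memory is total minus free. */
    static String format(int docs, double tot, double free) {
        return "after adding " + docs + " docs, mem:: tot:" + tot
                + " , free:" + free + " , used:" + (tot - free);
    }

    public static void main(String[] args) {
        System.out.println(format(0, totalMb(), freeMb()));
    }
}
```

These numbers describe the Java heap only. A process viewer reports the virtual size of the whole process, so the two can legitimately differ.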


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552704 ] 

Mirza Hadzic commented on LUCENE-1091:
--------------------------------------

I tested on Windows XP with JDK 6, and the problem is the same as on Linux. When the program starts, java.exe takes 30MB of virtual RAM; after 500000 iterations it uses 500MB of virtual RAM and some 450MB of physical RAM, so I killed it. The results are attached (lucene.txt). Maybe the problem is Java 6 related? Can someone check?


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552725 ] 

Mirza Hadzic commented on LUCENE-1091:
--------------------------------------

Yes, I am getting the info from the process viewer/task manager on both Linux and Windows. It is clearly memory used by the JVM that runs IndexWriter, so maybe Java 6 has a leak? I also found this, which may be a related problem; please take a look: http://www.mail-archive.com/java-user@lucene.apache.org/msg14459.html


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552772 ] 

Grant Ingersoll commented on LUCENE-1091:
-----------------------------------------

My understanding of virtual memory is that it is just the OS being smart about making sure any given process will have the memory it needs when it needs it, plus it will be able to swap it out when needed.  http://tweakhound.com/xp/virtualmemory.htm and http://forums.cnet.com/5208-6142_102-0.html?forumID=7&threadID=42641&messageID=499859 might be of some help.

I don't know if you are actually seeing any problem.  On OS X, I routinely see IntelliJ and other applications with a large gap between the physical memory and the virtual memory.  For instance, right now IntelliJ's physical memory is ~450MB while its virtual memory is ~1GB.  Even with that, IntelliJ itself is reporting that memory available for garbage collection (in the lower right corner) is around 200MB.  Try Googling for XP Process Manager virtual memory or something like that to read more on it.

Granted, I am no expert in how operating systems implement this, but it doesn't seem to me like there actually is a problem.  I would worry more about what the JDK is returning based on Doron's test, which seems to be reasonable.


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552786 ] 

Doron Cohen commented on LUCENE-1091:
-------------------------------------

I tried on XP with Java 1.6:

{noformat}
  > java -version
  java version "1.6.0_02"
  Java(TM) SE Runtime Environment (build 1.6.0_02-b06)
  Java HotSpot(TM) Client VM (build 1.6.0_02-b06, mixed mode)
{noformat}

!screenshot-1.jpg!

The above screenshot shows the state at the end of adding 9,000,000 docs. The memory usage reported by Java matches that of the task manager (for Java, look at total-memory), and is never greater than 36MB.

I wonder whether you can reproduce this behavior with another simple, synthetic, non-Lucene program which, say, creates some text in string buffers, opens random access files, writes to them, and occasionally closes the files and opens new ones?

Also (repeating Grant's question)  did you get an out-of-mem error?
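
A rough sketch of the kind of synthetic, non-Lucene program suggested above. No such program was posted in the thread, so everything here is hypothetical: the class name SyntheticFileChurn, the file naming, and the iteration counts. It builds a little text in a string buffer, writes it to a random access file, and rolls over to a new file every so often.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Sketch only: a hypothetical stand-in for the indexing loop, with no Lucene involved.
final class SyntheticFileChurn {

    /**
     * Writes `docs` small records into `dir`, closing the current file and
     * opening a new one every `docsPerFile` records. Returns the number of files created.
     */
    static int run(File dir, int docs, int docsPerFile) throws IOException {
        int files = 0;
        RandomAccessFile out = null;
        try {
            for (int i = 0; i < docs; i++) {
                if (i % docsPerFile == 0) {
                    if (out != null) {
                        out.close();              // occasionally close the file...
                    }
                    File f = new File(dir, "churn-" + files + ".bin");
                    f.deleteOnExit();
                    out = new RandomAccessFile(f, "rw");   // ...and open a new one
                    files++;
                }
                StringBuffer text = new StringBuffer();    // some text in a string buffer
                text.append("foo bar ").append(i);
                out.write(text.toString().getBytes(StandardCharsets.UTF_8));
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"), "churn-" + System.nanoTime());
        if (!dir.mkdirs()) {
            throw new IOException("cannot create " + dir);
        }
        dir.deleteOnExit();
        int files = run(dir, 100000, 1000);
        Runtime rt = Runtime.getRuntime();
        System.out.println(files + " files written, used heap bytes: "
                + (rt.totalMemory() - rt.freeMemory()));
    }
}
```

If the process's virtual size grows under this program the same way it does under the indexing loop, that would point away from Lucene and toward the JVM or the way it is launched.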



[jira] Issue Comment Edited: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552634 ] 

doronc edited comment on LUCENE-1091 at 12/17/07 10:09 PM:
----------------------------------------------------------------

I was not able to recreate this.

Can you run the attached TestOOM, and see how much memory is consumed and what used-memory stats get printed?


      was (Author: doronc):
    I was not able to recreate this.

Can you run the attached TestOOM (it expects a single indexDir argument on your system, and see how much memory is consumed and what used-memory stats gets printed?

  

[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552764 ] 

Mirza Hadzic commented on LUCENE-1091:
--------------------------------------

I am inclined to think this is a JVM 6 issue. When running TestOOM with the JVM parameters -Xms200m -Xmx200m, virtual memory usage easily grows past 500MB! Evidently the JVM has no idea how much memory it actually uses, and the log from TestOOM suggests this is the case. Running System.gc() after every 100 added documents changes nothing, just as I expected.

Attached is a screenshot of my task list on XP, sorted by VM usage, taken while running TestOOM. The second java.exe is NetBeans (CPU: 0%); ignore that. TestOOM is the java.exe using 48-50% CPU all the time (I have a dual-core machine, so it uses one full core, which is expected).
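
One way to see why -Xmx200m does not cap what the task manager shows is to print the JVM's own heap and non-heap figures side by side. The sketch below uses the standard java.lang.management API; the class name HeapVersusNonHeap and the output format are made up for illustration. Note the hedge: -Xmx bounds the Java heap only, and even the non-heap figure covers only pools the JVM tracks, not every native allocation the OS counts toward the process's virtual size.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Sketch only: HeapVersusNonHeap is a hypothetical helper name.
final class HeapVersusNonHeap {

    /** Renders one MemoryUsage snapshot; all values are in bytes, max is -1 if undefined. */
    static String describe(String label, MemoryUsage u) {
        return label + ": used=" + u.getUsed()
                + " committed=" + u.getCommitted()
                + " max=" + u.getMax();
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        // The heap is what -Xms/-Xmx control.
        System.out.println(describe("heap", mem.getHeapMemoryUsage()));
        // Non-heap pools tracked by the JVM; not limited by -Xmx.
        System.out.println(describe("non-heap", mem.getNonHeapMemoryUsage()));
    }
}
```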


[jira] Updated: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1091:
--------------------------------

    Attachment: screenshot-1.jpg


[jira] Resolved: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Hadzic resolved LUCENE-1091.
----------------------------------

    Resolution: Won't Fix

This is (probably) a bug in the JVM when running in debug mode; Lucene only exposes it.


[jira] Reopened: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen reopened LUCENE-1091:
---------------------------------


Reopening just to close with "Invalid". "Won't Fix" suggests a known issue that we are not going to fix; I think "Invalid" is more appropriate here.



[jira] Updated: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Hadzic updated LUCENE-1091:
---------------------------------

    Attachment: LuceneOOM.PNG


[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Mirza Hadzic (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552784 ] 

Mirza Hadzic commented on LUCENE-1091:
--------------------------------------

I found the reason: only when running in *debug* mode (NetBeans) does the JVM take a vast amount of virtual memory. In "normal mode" everything works as expected. It seems something is wrong with the JVM's debug mode, but that has nothing to do with Lucene. I will report it to NetBeans.
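
For anyone who hits the same symptom, a small sketch of how a program can check whether its own JVM was started with a debug agent before trusting memory readings. The thread does not say which options NetBeans passes, so matching on -agentlib:jdwp, -Xrunjdwp and -Xdebug is an assumption based on the usual JDWP launch flags, and the class name DebugModeCheck is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch only: DebugModeCheck is a hypothetical helper name.
final class DebugModeCheck {

    /** Returns true if any JVM argument looks like a JDWP debug-agent option. */
    static boolean hasDebugAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp")
                    || arg.startsWith("-Xrunjdwp")
                    || arg.equals("-Xdebug")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The arguments the running JVM was actually launched with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("debug agent present: " + hasDebugAgent(jvmArgs));
    }
}
```

Running the same test once from the IDE's debugger and once from a plain command line, and comparing the output of this check, makes it easy to confirm which launch mode a given measurement came from.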

> Big IndexWriter memory leak: when Field.Index.TOKENIZED
> -------------------------------------------------------
>
>                 Key: LUCENE-1091
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1091
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.2
>         Environment: Ubuntu Linux 7.10, 32-bit
> Java 1.6.0 buld 1.6.0_03-b05 (default in Ubuntu 7.10)
> 1GB RAM
>            Reporter: Mirza Hadzic
>         Attachments: lucene.txt, LuceneOOM.PNG, screenshot-1.jpg, TestOOM.java
>
>
> This little program eats incrementally 2MB of virtual RAM per each 1000 documents indexed, only when Field.Index.TOKENIZED used :
> public Document getDoc() {
>    Document document = new Document();
>    document.add(new Field("foo", "foo bar", Field.Store.NO, Field.Index.TOKENIZED));
>    return document;
> }
> public void run() throws IOException {
>    IndexWriter writer = new IndexWriter(new File(jIndexFileName), new StandardAnalyzer(), true);
>    for (int i = 0; i < 1000000; i++) {
>       writer.addDocument(getDoc());
>    }
> }

[jira] Updated: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1091:
--------------------------------

    Attachment: TestOOM.java

Attached TestOOM; it does not reproduce the problem on XP with JRE 1.5.

[jira] Commented: (LUCENE-1091) Big IndexWriter memory leak: when Field.Index.TOKENIZED

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552634 ] 

Doron Cohen commented on LUCENE-1091:
-------------------------------------

I was not able to recreate this.

Can you run the attached TestOOM on your system (it expects a single indexDir argument) and see how much memory is consumed and which used-memory stats get printed?
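
As an illustration of the kind of used-memory stats meant here, a minimal sketch follows. It is not the attached TestOOM.java: the class UsedMemoryProbe, its methods, and the reporting interval are hypothetical, and the indexing call is replaced by a placeholder so the sketch runs without Lucene.

```java
/** Hypothetical helper; not the TestOOM.java attached to this issue. */
final class UsedMemoryProbe {

    /** Heap bytes currently in use: heap reserved by the JVM minus its free part. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Formats a byte count as whole megabytes (1 MB = 1024 * 1024 bytes). */
    static String toMb(long bytes) {
        return (bytes / (1024L * 1024L)) + " MB";
    }

    public static void main(String[] args) {
        int totalDocs = 1000000;   // same document count as in the report
        int reportEvery = 100000;  // assumed reporting interval
        for (int i = 1; i <= totalDocs; i++) {
            // Placeholder for the real work, e.g. writer.addDocument(getDoc());
            String doc = "foo bar " + i;
            if (doc.isEmpty()) {
                throw new IllegalStateException("unexpected empty document");
            }
            if (i % reportEvery == 0) {
                System.gc(); // only a hint; the JVM may ignore it
                System.out.println(i + " docs, used heap: " + toMb(usedHeapBytes()));
            }
        }
    }
}
```

Note that this reports Java heap only. The report talks about virtual memory of the process, which has to be read with OS tools, so the two numbers can diverge.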

