Posted to solr-user@lucene.apache.org by onetwothree <jo...@telenet.be> on 2014/01/20 11:02:37 UTC

Memory Usage on Windows Os while indexing

Facts:


OS Windows server 2008

4 Cpu
8 GB Ram

Tomcat Service version 7.0 (64 bit)

Only running Solr
Optional JVM parameters set xmx = 3072, xms = 1024
Solr version 4.5.0.

One Core instance (both for querying and indexing)
*Schema config:*
minGramSize="2" maxGramSize="20"
most of the fields are stored = "true" (required)

*Solr config:*
ramBufferSizeMB: 100
maxIndexingThreads: 8
directoryFactory: MMapDirectory
autocommit: maxdocs 10000, maxtime 15000, opensearcher false
cache (defaults): 
filtercache initialsize:512 size: 512 autowarm: 0
queryresultcache initialsize:512 size: 512 autowarm: 0
documentcache initialsize:512 size: 512 autowarm: 0

Problem description:


We're using a .NET service (based on Solr.Net) for updating and inserting
documents on a single Solr core instance. The size of the documents sent to
Solr varies from 1 KB up to 8 MB. We send the documents in batches, using one
or multiple threads. The current size of the Solr index is about 15 GB.

The indexing service runs for around 4 to 5 hours per day to complete all
inserts and updates to Solr. While the indexing process is running, the
Tomcat process memory usage keeps growing to more than 7 GB of RAM (as shown
by the Process Explorer monitoring tool) and does not drop, even after 24
hours. After a restart of Tomcat, or a Reload Core in the Solr Admin, the
memory drops back to 1 to 2 GB of RAM. When monitoring the Tomcat process
with a tool like VisualVM, the memory usage of Tomcat seems OK: memory
consumption is within the range of the defined JVM startup parameters (see
image).

So it seems that filesystem buffers are consuming all the leftover memory and
don't release it, even after quite some time? Is there a way to handle this
behaviour so that not all memory is consumed? Are there other alternatives?
Best practices?

<http://lucene.472066.n3.nabble.com/file/n4112262/Capture.png> 

Thanks in advance




--
View this message in context: http://lucene.472066.n3.nabble.com/Memory-Usage-on-Windows-Os-while-indexing-tp4112262.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Memory Usage on Windows Os while indexing

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Mon, 2014-01-20 at 11:02 +0100, onetwothree wrote:
> Optional JVM parameters set xmx = 3072, xms = 1024
> directoryFactory: MMapDirectory

[...]

> So it seems that filesystem buffers are consuming all the leftover memory and
> don't release it, even after quite some time?

As long as the memory is indeed leftover, that is the optimal strategy.
Maybe Uwe's explanation of MMapDirectory will help:

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Regards,
Toke Eskildsen, State and University Library, Denmark



Re: Memory Usage on Windows Os while indexing

Posted by Jason Hellman <jh...@innoventsolutions.com>.
To a very large extent, the capability of a platform is measurable by the skill of the team administering it.

If core competencies lie in Windows OS then I would wager heavily the platform will outperform a similar Linux OS installation in the long haul.

All things being equal, it’s really hard to argue with Linux.  But nothing is ever equal.

On Jan 21, 2014, at 8:57 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 1/21/2014 2:17 AM, onetwothree wrote:
>> Does Solr on a Linux OS have better memory management than on a Windows OS,
>> or can you neglect this comparison?
> 
> As Toke said, this is indeed debatable.
> 
> I personally believe that Linux is better at almost everything, but if
> you're running a recent 64-bit Windows Server OS, you may not actually
> see a lot of difference.  Microsoft has VERY talented people working for
> them, and even though I won't use it for most server applications,
> Windows is a very capable platform.
> 
> If you ignore personal bias and proceed with the idea that Linux and
> Windows are approximately equal in terms of real-world performance, then
> one factor that might be critical is price.  Linux can be installed for
> zero cost, while a standalone bare metal Windows Server license is several
> hundred dollars, sometimes more.
> 
> Thanks,
> Shawn
> 


Re: Memory Usage on Windows Os while indexing

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/21/2014 2:17 AM, onetwothree wrote:
> Does Solr on a Linux OS have better memory management than on a Windows OS,
> or can you neglect this comparison?

As Toke said, this is indeed debatable.

I personally believe that Linux is better at almost everything, but if
you're running a recent 64-bit Windows Server OS, you may not actually
see a lot of difference.  Microsoft has VERY talented people working for
them, and even though I won't use it for most server applications,
Windows is a very capable platform.

If you ignore personal bias and proceed with the idea that Linux and
Windows are approximately equal in terms of real-world performance, then
one factor that might be critical is price.  Linux can be installed for
zero cost, while a standalone bare metal Windows Server license is several
hundred dollars, sometimes more.

Thanks,
Shawn


Re: Memory Usage on Windows Os while indexing

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Tue, 2014-01-21 at 10:17 +0100, onetwothree wrote:
> Does Solr on a Linux OS have better memory management than on a Windows OS,
> or can you neglect this comparison?

That is debatable, but in this context you can see them as fairly equal:
Out of the box, they will both use all free memory for caching. Since
MMapDirectory is tied to the OS caching, having a large non-heap memory
allocation should be seen as a good thing: It means that a fair amount
of the index is cached.

Do not worry about the memory being taken from the system: If you start
another memory-hungry process, the physical memory should be taken
from the buffers used by Solr's MMapDirectory.
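
As a rough illustration of why mapped index data shows up outside the Java
heap, here is a minimal Python sketch of memory-mapped file access. It is not
Lucene's MMapDirectory code; the temporary ".seg" file, its contents, and the
helper read_via_mmap are made up for the example. The point is only that the
file is mapped into the process address space and the OS pages it in (and
keeps it cached) on demand.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset` by mapping the file.

    The file is not read into an application buffer up front; the OS
    loads the touched pages into its own cache when they are accessed.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Hypothetical stand-in for an index segment file.
with tempfile.NamedTemporaryFile(suffix=".seg", delete=False) as tmp:
    tmp.write(b"0123456789" * 1000)
    segment_path = tmp.name

chunk = read_via_mmap(segment_path, 10, 10)
print(chunk)

# Clean up the temporary file once the mapping has been closed.
os.remove(segment_path)
```

Pages touched through the mapping are counted against the process by most
monitoring tools, even though the OS is free to reclaim them when another
process needs the memory.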


NB: You forgot to provide a link to your image on DropBox.

- Toke Eskildsen, State and University Library, Denmark



Re: Memory Usage on Windows Os while indexing

Posted by onetwothree <jo...@telenet.be>.
Does Solr on a Linux OS have better memory management than on a Windows OS,
or can you neglect this comparison?

Re: Memory Usage on Windows Os while indexing

Posted by onetwothree <jo...@telenet.be>.
Thanks for the reply, dropbox image added.

Re: Memory Usage on Windows Os while indexing

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/20/2014 3:02 AM, onetwothree wrote:
> OS Windows server 2008
>
> 4 Cpu
> 8 GB Ram

<snip>

> We're using a .NET service (based on Solr.Net) for updating and inserting
> documents on a single Solr core instance. The size of the documents sent to
> Solr varies from 1 KB up to 8 MB. We send the documents in batches, using one
> or multiple threads. The current size of the Solr index is about 15 GB.
>
> The indexing service runs for around 4 to 5 hours per day to complete all
> inserts and updates to Solr. While the indexing process is running, the
> Tomcat process memory usage keeps growing to more than 7 GB of RAM (as shown
> by the Process Explorer monitoring tool) and does not drop, even after 24
> hours. After a restart of Tomcat, or a Reload Core in the Solr Admin, the
> memory drops back to 1 to 2 GB of RAM. When monitoring the Tomcat process
> with a tool like VisualVM, the memory usage of Tomcat seems OK: memory
> consumption is within the range of the defined JVM startup parameters (see
> image).
>
> So it seems that filesystem buffers are consuming all the leftover memory and
> don't release it, even after quite some time? Is there a way to handle this
> behaviour so that not all memory is consumed? Are there other alternatives?
> Best practices?
>
> <http://lucene.472066.n3.nabble.com/file/n4112262/Capture.png>

That picture seems to be a very low-res copy of your screenshot.  I 
can't really make it out.  I can tell you that it's completely normal 
for the OS disk cache (the filesystem buffers you mention) to take up 
all leftover memory.  If an application requests some of that memory, 
the OS will instantly give it up.

First, I'm going to explain something about memory reporting and Solr 
that I've noticed, then I will give you some news you probably won't like.

The numbers reported by visualvm are a true picture of Java heap memory 
usage.  The actual memory usage for Solr will be just a little bit more 
than those numbers.  In the newest versions of Solr, there seems to be a 
side effect of the Java MMAP implementation that results in incorrect 
memory usage reporting at the operating system level.  Here's a "top" 
output on one of my Solr servers running CentOS, sorted by memory 
usage.  The process at the top of the list is Solr.

https://www.dropbox.com/s/y1nus7lpzlb1mp9/solr-memory-usage-2014-01-20%2010.28.28.png

Some quick numbers for you:  The machine has 64GB of RAM.  Solr shows a 
virtual memory size of 59.2GB.  My indexes take up 51293336 of disk 
space, and Solr has a 6GB heap, so 59.2GB is not out of line for the 
virtual memory size.

Now for where things get weird: There is 48GB of RAM taken up by the 
"cached" value, which is the OS disk cache.  The screenshot also shows 
that Solr is using 22GB of resident RAM.  If you add the 48GB in the OS 
disk cache and the 22GB of resident RAM for Solr, you get 70GB ... which 
is more memory than the machine even HAS, so we know something's off.  
The 'shared' memory for Solr is 15GB, which when you subtract it from 
the 22GB, gives you 7GB, which is much more realistic with a 6GB heap, 
and also makes it fit within the total system RAM.
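
The arithmetic above can be written out as a short Python sketch. All figures
are the ones quoted from the "top" screenshot, except for one assumption: the
index size of 51293336 is treated as kilobytes, which the message does not
state. The variable names are made up for the example.

```python
KB_PER_GB = 1024 * 1024

total_ram_gb = 64      # physical RAM in the machine
virtual_gb = 59.2      # virtual memory size reported for Solr
os_cached_gb = 48      # "cached" value, i.e. the OS disk cache
resident_gb = 22       # resident memory reported for Solr
shared_gb = 15         # shared memory reported for Solr
heap_gb = 6            # configured Java heap
index_size_kb = 51293336  # ASSUMPTION: unit is KB, not stated in the message

# Virtual size should be roughly the mapped index plus the heap.
index_gb = index_size_kb / KB_PER_GB
expected_virtual_gb = index_gb + heap_gb

# Adding cache and resident memory counts some pages twice,
# so the sum exceeds the physical RAM.
naive_total_gb = os_cached_gb + resident_gb

# Subtracting shared memory gives a figure close to the heap size.
private_gb = resident_gb - shared_gb

print(f"index + heap: {expected_virtual_gb:.1f} GB (virtual: {virtual_gb} GB)")
print(f"cached + resident: {naive_total_gb} GB (RAM: {total_ram_gb} GB)")
print(f"resident - shared: {private_gb} GB (heap: {heap_gb} GB)")
print(f"cached + (resident - shared): {os_cached_gb + private_gb} GB")
```

The last line is the sanity check: once the shared portion is taken out, the
disk cache plus Solr's own memory fits within the 64GB of physical RAM.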

The news that you probably won't like:

I'm assuming that the whole reason you looked into memory usage was 
because you're having performance problems.  With 8GB of RAM and 3GB 
given to Solr, you basically have a little bit less than 5GB of RAM for 
the OS disk cache.  With that much RAM, most people can effectively 
cache an index up to about 10GB before performance problems show up.  
Your index is 15GB.  You need more total system RAM.  If Solr isn't 
crashing, you can probably leave the heap at 3GB with no problem.
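
The same estimate as a small Python sketch, using only the numbers from this
thread (8GB of RAM, a 3GB heap, a 15GB index). The function names are made up
for the example, and the result is an upper bound, since the OS and Tomcat
need some memory beyond the heap.

```python
def disk_cache_budget_gb(total_ram_gb, heap_gb):
    """RAM left for the OS disk cache after setting aside the Java heap.

    Ignores memory used by the OS and other processes, so the real
    figure is somewhat lower.
    """
    return total_ram_gb - heap_gb


def cacheable_fraction(index_gb, cache_gb):
    """Fraction of the index that can sit in the disk cache at once."""
    return min(1.0, cache_gb / index_gb)


cache_gb = disk_cache_budget_gb(total_ram_gb=8, heap_gb=3)
fraction = cacheable_fraction(index_gb=15, cache_gb=cache_gb)

print(f"disk cache budget: at most {cache_gb} GB")
print(f"share of a 15 GB index that fits: {fraction:.0%}")
```

At best about a third of the index fits in the cache at any one time, which is
why more total system RAM is the suggested fix rather than a different heap
size.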

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn


Re: Memory Usage on Windows Os while indexing

Posted by Yago Riveiro <ya...@gmail.com>.
Another thing: Solr relies heavily on the OS cache to cache the index and gain performance. This can be another reason why the Solr process shows a high allocated memory value.


/yago
—
/Yago Riveiro

Re: Memory Usage on Windows Os while indexing

Posted by Yago Riveiro <ya...@gmail.com>.
The high memory consumption you see may be a consequence of the fact that some heap memory is only released after a full GC. With VisualVM you can force a full GC and see whether the memory is released.


/yago
—
/Yago Riveiro
